Cloud-native open storage format for quantitative biological dynamics data
Koji Kyoda1, Kenneth H.L. Ho2, and Shuichi Onami1
1RIKEN BDR, 2Francis Crick Institute
2022/04/12
Bioimaging data ecosystem
Swedlow and Onami
Global BioImaging EoE V. 2020
Sharing of bioimaging data
SSBD:database
SSBD:repository
Tohsato et al. (2016) Bioinformatics
Open and unified data format
Data providers
Tool developers
All tools can be used for data analysis.
All data can be used for tool evaluation.
An open unified format
BDML: Biological Dynamics Markup Language
Kyoda et al. (2015) Bioinformatics
<scaleUnit>
  <tScale>20</tScale>
  <tUnit>second</tUnit>
</scaleUnit>
<component>
  <componentID>100</componentID>
  <time>1</time>
  <measurement>
    <point><xyz><x>10.32</x><y>30.42</y><z>18.32</z></xyz></point>
  </measurement>
</component>
<component>
  <componentID>101</componentID>
  <time>2</time>
  <prevID>100</prevID>
  <measurement>
    <point><xyz><x>9.57</x><y>32.05</y><z>14.91</z></xyz></point>
  </measurement>
</component>
point
line
face
circle
sphere
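As a rough illustration of how the BDML fragment on this slide could be read programmatically, the following sketch parses it with Python's standard `xml.etree.ElementTree`. The `<fragment>` wrapper element, the function name `parse_components`, and the dictionary keys are hypothetical and are not part of BDML. The sketch also assumes that `tScale` is a multiplier that converts the `<time>` index into `tUnit`, and that `<prevID>` points to the same object at the previous time point.

```python
import xml.etree.ElementTree as ET

# Fragment from the slide, wrapped in a hypothetical <fragment> root so that it
# parses as one XML document (a real BDML file has its own root element).
FRAGMENT = """
<fragment>
  <scaleUnit>
    <tScale>20</tScale>
    <tUnit>second</tUnit>
  </scaleUnit>
  <component>
    <componentID>100</componentID>
    <time>1</time>
    <measurement>
      <point><xyz><x>10.32</x><y>30.42</y><z>18.32</z></xyz></point>
    </measurement>
  </component>
  <component>
    <componentID>101</componentID>
    <time>2</time>
    <prevID>100</prevID>
    <measurement>
      <point><xyz><x>9.57</x><y>32.05</y><z>14.91</z></xyz></point>
    </measurement>
  </component>
</fragment>
"""


def parse_components(xml_text):
    """Return {componentID: {"time", "unit", "xyz", "prev"}} for point entities."""
    root = ET.fromstring(xml_text)
    # Assumption: tScale converts the integer time index into tUnit.
    t_scale = float(root.findtext("scaleUnit/tScale"))
    t_unit = root.findtext("scaleUnit/tUnit")

    components = {}
    for comp in root.findall("component"):
        cid = int(comp.findtext("componentID"))
        xyz = comp.find("measurement/point/xyz")
        prev = comp.findtext("prevID")  # None when the element is absent
        components[cid] = {
            "time": float(comp.findtext("time")) * t_scale,
            "unit": t_unit,
            "xyz": tuple(float(xyz.findtext(axis)) for axis in "xyz"),
            "prev": int(prev) if prev is not None else None,
        }
    return components


components = parse_components(FRAGMENT)
for cid, info in components.items():
    print(cid, info)
```

Following `prev` from component 101 leads back to component 100, which is how a track can be reconstructed from this representation.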
BD5
Kyoda et al. (2020) PLoS One
OME-NGFF
(Moore et al., Nat. Methods, 2021)
Image set
metadata
array data
for multiscale
masks
(segmentation data)
Formats for bioimaging data
image data
quantitative data
OME-TIFF
ome-zarr
BDML
BD5
in the cloud
in a data center
?
Overview
ome-zarr
BDZ
S3
Phenotype analysis
New analytical methods
Synchronous visualization
bioimaging data
object storage
Spatial Omics data in OME-NGFF
https://forum.image.sc/t/ome-ngff-spatial-omics-hackathon/57337
from squidpy tutorial
AnnData
.csv
.tsv
.loom
…
.h5ad
.zarr
[https://anndata.readthedocs.io]
Proposal: How to store quantitative data
images
low-level
detection
representative
position
tracking info.
OME-NGFF (with Labels)
AnnData-style
t, z, y, x
ID,
feature (volume, etc.)
centroid
Labels
t, z, y, x
ID, radius,
feature
X
obs
matrix
matrix
features
Bao et al. (2006)
AnnData-style representation for dynamics data
X (position data)
| t   | z   | y   | x   |
| 1.0 | 2.4 | 2.1 | 3.2 |
| 2.0 | 3.4 | 2.5 | 3.3 |
| 3.0 | 3.2 | 2.6 | 3.1 |
obs (feature data)
| ID   | entity | signal |
| 1001 | point  | 4.5    |
| 1002 | point  | 4.5    |
| 1003 | point  | 4.6    |
obsp (tracking data; rows = from, columns = to)
| from \ to | 1001 | 1002 | 1003 |
| 1001      | 0    | 1    | 0    |
| 1002      | 0    | 0    | 1    |
| 1003      | 0    | 0    | 0    |
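To make the role of the three tables concrete, here is a minimal sketch that holds them as plain Python lists and walks the tracking matrix to recover a track. It deliberately avoids any third-party package; in practice an AnnData object would carry the same information in its `X`, `obs`, and `obsp` slots. The names `X_COLUMNS`, `tracking`, `successors`, and `follow_track` are hypothetical, and the sketch assumes that rows of the matrix are the "from" objects and columns are the "to" objects.

```python
# Position data: one row per object, columns t, z, y, x (values from the slide).
X_COLUMNS = ("t", "z", "y", "x")
X = [
    [1.0, 2.4, 2.1, 3.2],
    [2.0, 3.4, 2.5, 3.3],
    [3.0, 3.2, 2.6, 3.1],
]

# Feature data: one record per object, in the same row order as X.
obs = [
    {"ID": 1001, "entity": "point", "signal": 4.5},
    {"ID": 1002, "entity": "point", "signal": 4.5},
    {"ID": 1003, "entity": "point", "signal": 4.6},
]

# Tracking data as an n_obs x n_obs adjacency matrix.
# Assumption: tracking[i][j] == 1 means object i (from) is linked to object j (to).
tracking = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]


def successors(index):
    """Row indices of all objects that the object at `index` links to."""
    return [j for j, flag in enumerate(tracking[index]) if flag]


def follow_track(start_id):
    """Follow links from `start_id` and return [(ID, position dict), ...]."""
    ids = [row["ID"] for row in obs]
    i = ids.index(start_id)
    track = []
    while True:
        track.append((ids[i], dict(zip(X_COLUMNS, X[i]))))
        nxt = successors(i)
        if not nxt:
            break
        i = nxt[0]  # only the first link is followed; a division would give several
    return track


for obj_id, position in follow_track(1001):
    print(obj_id, position)
```

Keeping tracking as a pairwise matrix means divisions or merges need no extra columns: a dividing object simply has more than one non-zero entry in its row.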
Pixel-based ROI data
image
labels
OME-NGFF structure
image data
labels data
AnnData X
| t   | z   | y   | x   |
| 1.0 | 2.4 | 2.1 | 3.2 |
| 2.0 | 3.4 | 2.5 | 3.3 |
| 3.0 | 3.2 | 2.6 | 3.1 |
AnnData obs
| ID   | entity | signal |
| 1001 | point  | 4.5    |
| 1002 | point  | 4.5    |
| 1003 | point  | 4.6    |
.zattrs
Example of BDZ (line entity)
wt-N2-081015-01
|
|--- 0                  (image data)
|    |
|    |-- t
|    |-- c
|    |-- z
|    |-- y
|    |-- x
|
|--- labels             (pixel-based ROI data)
|    |
|    |-- 0
|        |
|        |-- t
|        |-- ...
|
|--- dyn                (dynamics data)
     |
     |-- X              (position data)
     |
     |-- obs            (feature data)
     |
     |-- obsp           (tracking data)
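The group layout in this example can be sketched on a local file system as follows. This is only an illustration under stated assumptions: it assumes the Zarr v2 storage layout, in which a group is a directory (or object-storage key prefix) holding a `.zgroup` JSON file, and it creates only the groups named on the slide (`labels` and `dyn` under the root). The function name `create_bdz_skeleton` is hypothetical. Arrays such as `0` and `X`, and any BDZ-specific metadata, are not specified on the slides and are intentionally left out.

```python
import json
import tempfile
from pathlib import Path

# Group paths relative to the store root, taken from the tree on the slide.
# "" is the root group itself.
GROUPS = ["", "labels", "dyn"]


def create_bdz_skeleton(root):
    """Create empty Zarr v2 groups under `root` and return their metadata keys."""
    for rel in GROUPS:
        group_dir = root / rel
        group_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: Zarr v2, where each group is marked by a .zgroup file.
        (group_dir / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    # The relative paths correspond to the keys an object store would hold.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(".zgroup"))


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "wt-N2-081015-01"
    keys = create_bdz_skeleton(store)
    print(keys)
```

Because every group and array is addressed by a key prefix, the same hierarchy can be placed on S3-style object storage without a separate index file.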
Example of BDZ (sphere entity)
0801505_L1
|
|--- dyn                (dynamics data)
     |
     |-- X              (position data)
     |
     |-- obs            (feature data)
     |
     |-- obsp           (tracking data)
Data visualization
Future plan
Summary
Acknowledgement