The EM image dataset

Jasper Phelps edited this page Jan 19, 2025 · 10 revisions

This phase of the project was led by Jasper Phelps and Minsu Kim.

Dataset generation

  • The sample (see Sample choice and tissue preparation) was cut into 7010 serial sections, each around 45nm in thickness, and collected onto a single 7500-slot reel of GridTape.
  • These tissue sections were imaged with a custom GridTape-compatible transmission electron microscope at a lateral resolution of about 4nm. The size of each voxel is therefore about 4nm x 4nm x 45nm. For more information, see Details on imaging.
  • The images were initially aligned into a 3D volume with a pipeline based on AlignTK, and the alignment was then refined with a deep learning approach developed by Zetta AI.
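
As a rough illustration of what the voxel size implies, the sketch below converts voxel indices to approximate physical coordinates and estimates the depth of the whole series. It assumes the nominal values quoted above (about 4nm x 4nm x 45nm) and an origin at voxel (0, 0, 0); the names `VOXEL_SIZE_NM` and `voxel_to_nm` are hypothetical and not part of any BANC tooling.

```python
# Nominal voxel dimensions quoted above, in nanometres (approximate values).
VOXEL_SIZE_NM = (4.0, 4.0, 45.0)  # (x, y, z)
NUM_SECTIONS = 7010


def voxel_to_nm(x, y, z):
    """Convert voxel indices to approximate physical coordinates in nm.

    Assumes the origin sits at voxel (0, 0, 0), which is an illustrative
    choice rather than a documented property of the dataset.
    """
    sx, sy, sz = VOXEL_SIZE_NM
    return (x * sx, y * sy, z * sz)


# Approximate extent of the whole series along the sectioning axis.
total_depth_um = NUM_SECTIONS * VOXEL_SIZE_NM[2] / 1000.0
print(f"Series depth: ~{total_depth_um:.0f} micrometres")
```

Because section thickness is only approximately 45nm, the resulting depth of roughly 315 micrometres should be read as an estimate.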

General geometry

  • The brain can be seen in z indices 0 through 6206.
  • The ventral nerve cord can be seen in z indices 1438 through 7009.
    • Peripheral nerves exiting the ventral nerve cord can be seen down to z index 640.
  • The neck connective can be seen in z indices ~2350 through ~3280.
  • z index 0 shows the most ventral part of the nervous system and z index 7009 shows the most dorsal part.
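
The z ranges above can be captured in a small lookup, sketched here as a minimal example. The dictionary and function names are hypothetical, the neck connective bounds are approximate in the source, and the peripheral nerves are left out because only their lower bound (z index 640) is stated.

```python
# Inclusive z-index ranges taken from the list above.
# The neck connective bounds are approximate ("~") in the source.
REGION_Z_RANGES = {
    "brain": (0, 6206),
    "ventral nerve cord": (1438, 7009),
    "neck connective": (2350, 3280),
}


def regions_at_z(z):
    """Return the names of the regions visible at a given z index."""
    return [name for name, (lo, hi) in REGION_Z_RANGES.items() if lo <= z <= hi]


print(regions_at_z(1000))  # ['brain']
print(regions_at_z(3000))  # ['brain', 'ventral nerve cord', 'neck connective']
print(regions_at_z(6500))  # ['ventral nerve cord']
```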

Dataset size

Raw count of images / terabytes

Each raw image from the microscope is 2048 x 2048 pixels at 16 bits (2 bytes) per pixel, which comes to about 8.4 megabytes per image.

  • 49.7 million images (416 terabytes) were captured using the electron microscope, an average of ~7000 images from each of the ~7000 sections.
    • This total excludes a few million additional images that were captured during imaging runs that were not included in the final dataset.
  • 28.8 million images (58% of the captured images) were used in the stitching process (the process of determining how the few thousand images for each section should be stitched/montaged).
    • The other 42% of captured images were excluded for being outside the convex hull of the tissue – see the red regions in the example image below.
  • 18.6 million images (37.5% of the captured images) were used for rendering the final dataset.
    • The other 20.5% of the captured images were used in stitching but excluded from rendering for not containing tissue – see blue regions in the example image below.
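
The figures in this list can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 terabyte = 10^12 bytes) and uses the rounded image counts quoted above, so its results can differ from the quoted totals in the last digit.

```python
# Size of one raw image: 2048 x 2048 pixels, 16 bits = 2 bytes per pixel.
BYTES_PER_IMAGE = 2048 * 2048 * 2

# Rounded image counts quoted in the list above.
captured = 49.7e6
stitched = 28.8e6
rendered = 18.6e6

# Assumption: decimal terabytes (1 TB = 1e12 bytes).
total_tb = captured * BYTES_PER_IMAGE / 1e12

stitched_pct = 100 * stitched / captured
rendered_pct = 100 * rendered / captured
stitched_only_pct = 100 * (stitched - rendered) / captured

print(f"Bytes per image: {BYTES_PER_IMAGE}")
print(f"Total captured: ~{total_tb:.0f} TB")
print(f"Used in stitching: {stitched_pct:.1f}%")
print(f"Used in rendering: {rendered_pct:.1f}%")
print(f"Stitched but not rendered: {stitched_only_pct:.1f}%")
```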

Size of final aligned dataset

  • TODO number of voxels in final aligned dataset
  • TODO number of terabytes in final (uncompressed or compressed?) dataset

Missing data

  • 10 of the total 7010 z indices (0.14%) have no data due to the section being lost: z indices 856, 885, 3755, 5746, 5772, 5778, 5793, 5801, 5822, 5869
    • One of these losses (z=3755) was due to the section being collected onto the wrong location on the GridTape (not over the slot) and so it couldn't be imaged with transmission EM.
    • The other 9 losses were due to the support film rupturing after section collection but before the section could be imaged.
  • An additional 26 z indices (0.37%) have partial data loss:
    • 7 z indices are missing all images for the brain: 914, 1462, 5841, 5849, 5888, 5896, 5916
    • 7 z indices are missing all images for the VNC: 874, 2784, 2822, 3064, 3102, 4566, 5840
    • 12 z indices are missing a fraction of brain and/or VNC images: 2828, 2860, 2912, 2986, 3054, 3080, 3586, 3605, 3833, 4648, 4768, 5935
  • The remaining 6974 z indices (99.5%) were collected and imaged successfully with no data loss.
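
When working with the volume programmatically, it can help to have these z indices in machine-readable form. The following is a minimal sketch using the lists above; the set names, status strings, and `data_status` function are hypothetical and not part of any BANC tooling.

```python
NUM_Z = 7010  # z indices 0 through 7009

# z indices copied from the lists above.
LOST = {856, 885, 3755, 5746, 5772, 5778, 5793, 5801, 5822, 5869}
MISSING_BRAIN = {914, 1462, 5841, 5849, 5888, 5896, 5916}
MISSING_VNC = {874, 2784, 2822, 3064, 3102, 4566, 5840}
PARTIAL = {2828, 2860, 2912, 2986, 3054, 3080,
           3586, 3605, 3833, 4648, 4768, 5935}


def data_status(z):
    """Classify a z index by the kind of data loss reported for it."""
    if not 0 <= z < NUM_Z:
        raise ValueError(f"z index {z} is outside 0..{NUM_Z - 1}")
    if z in LOST:
        return "lost"
    if z in MISSING_BRAIN:
        return "brain missing"
    if z in MISSING_VNC:
        return "VNC missing"
    if z in PARTIAL:
        return "partially missing"
    return "complete"


# Cross-check the count of fully imaged sections against the text.
complete = sum(1 for z in range(NUM_Z) if data_status(z) == "complete")
print(complete)  # 6974
```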

Alignment to microCT scan

Slicing tissue into thin sections and then computationally aligning the sections back together often produces some distortions, i.e. the shape of an aligned EM dataset might not exactly match the shape of the original intact tissue. To determine whether any significant distortions were introduced during the BANC's alignment, we compared the shape of the aligned EM dataset to the shape of the intact tissue as captured by an X-ray scan.

Here you can view the microCT scan's shape (yellow) affine-aligned to the EM dataset's shape (blue). The two shapes overlay closely, indicating that the aligned EM dataset reproduces the true shape of the tissue well.