API code examples

Usage of the SegmentationDataset property caches

SegmentationObjects retrieved through the factory method SegmentationDataset.get_segmentation_object do not load the object’s attribute dictionary by default, which reduces IO overhead. For example, the factory method is a convenient way to access the voxels or meshes of many objects, a task that does not require any properties from the attribute dictionary.

In [77]: from syconn.reps.super_segmentation import *                                                                 

In [78]: global_params.wd = "/wholebrain/scratch/pschuber/SyConn/example_cube/"                                      

In [79]: sd = SegmentationDataset('sv')                                                                               

# choose any supervoxel, here the first in the ids array
In [80]: sv = sd.get_segmentation_object(sd.ids[0])                                                                   

In [81]: sv.attr_dict                                                                                                 
Out[81]: {}

The properties stored in SegmentationObject.attr_dict are cached as numpy arrays and stored on disk in the SegmentationDataset folder during dataset_analysis. Many sparse look-ups of specific properties can be realized by loading the cache arrays via load_numpy_data and using SegmentationDataset.ids to retrieve the index of the object of interest in the property cache:

In [82]: prop_look_up = dict(size=sd.load_numpy_data('size'), rep_coord=sd.load_numpy_data('rep_coord'))

In [83]: prop_look_up['rep_coord'][np.where(sd.ids == sv.id)]                                                         
Out[83]: array([[1475, 1297,  882]], dtype=int32)

In [84]: prop_look_up['size'][np.where(sd.ids == sv.id)]                                                              
Out[84]: array([1750])
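
Each np.where(sd.ids == sv.id) call scans the whole ID array. When many IDs are queried at once, their positions can be resolved in a single vectorized step. The following is a minimal sketch on synthetic stand-in arrays: the values of ids, sizes and query_ids are made up, nothing here is part of the SyConn API, and it assumes that every queried ID is present in ids.

```python
import numpy as np

# Synthetic stand-ins for sd.ids and a cached property array (same ordering).
ids = np.array([42, 7, 19, 3, 88], dtype=np.uint64)
sizes = np.array([1750, 12, 530, 99, 4100])

# IDs whose property values are wanted (assumed to all exist in `ids`).
query_ids = np.array([19, 88, 7], dtype=np.uint64)

# Sort the IDs once, then locate every query with a binary search.
order = np.argsort(ids)
pos_in_sorted = np.searchsorted(ids[order], query_ids)
indices = order[pos_in_sorted]

# The indices address any property array that shares the ordering of `ids`.
query_sizes = sizes[indices]
print(query_sizes)
```

The sort is paid once, so this approach is mainly of interest when the number of queries is large; for a handful of IDs the np.where look-up shown above is simpler.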

Another solution with similar complexity but a more convenient interface is to pass the property keys when initializing the SegmentationDataset. get_segmentation_object will then use a lookup from object ID to the index in the respective property numpy array to populate the object’s attribute dictionary:

In [85]: sd = SegmentationDataset('sv', cache_properties=('rep_coord', 'size'))                                       

In [86]: sv = sd.get_segmentation_object(sd.ids[0])                                                                   

In [87]: sv.attr_dict                                                                                                 
Out[87]: {'rep_coord': array([1475, 1297,  882], dtype=int32), 'size': 1750}
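
The idea of a lookup from object ID to array index can be illustrated independently of SyConn. The sketch below uses synthetic arrays, and the names prop_cache, id_to_index and build_attr_dict are hypothetical; it is not the library's actual implementation, only one plausible way to populate an attribute dictionary from cached arrays.

```python
import numpy as np

# Synthetic stand-ins for sd.ids and two cached property arrays.
ids = np.array([42, 7, 19])
prop_cache = {
    "size": np.array([1750, 12, 530]),
    "rep_coord": np.array([[1475, 1297, 882], [10, 20, 30], [5, 6, 7]]),
}

# Built once: maps an object ID to its row in every property array.
id_to_index = {int(obj_id): ix for ix, obj_id in enumerate(ids)}


def build_attr_dict(obj_id):
    """Collect the cached properties of one object (hypothetical helper)."""
    ix = id_to_index[obj_id]
    return {key: values[ix] for key, values in prop_cache.items()}


attrs = build_attr_dict(42)
print(attrs["size"], attrs["rep_coord"])
```

Building the dictionary costs one pass over the IDs; every later look-up is then a constant-time hash access instead of a scan over the ID array.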

This mechanism allows quick access to specific attributes instead of loading the entire dictionary from file for every single object. Besides considerably reducing the number of file reads, this also avoids reading properties that are not of interest for the current process, of which there can be many:

In [88]: sv.load_attr_dict()                                                                                          

In [89]: sv.attr_dict                                                                                                 
Out[89]: 
{'mapping_mi_ids': [],
 'mapping_mi_ratios': [],
 'mapping_sj_ids': [],
 'mapping_sj_ratios': [],
 'mapping_vc_ids': [],
 'mapping_vc_ratios': [],
 'rep_coord': array([1475, 1297,  882], dtype=int32),
 'bounding_box': array([[1475, 1297,  881],
        [1492, 1351,  898]], dtype=int32),
 'size': 1750,
 'mesh_bb': [array([14700.   , 12940.411, 17632.592], dtype=float32),
  array([14875.668, 13468.265, 17934.43 ], dtype=float32)],
 'mesh_area': 0.194492}

In addition, the properties assigned to every SegmentationObject instance are available for further processing. For example, the cache-enabled SegmentationDataset can be passed to SuperSegmentationDataset via the sd_lookup kwarg, which is passed on to SuperSegmentationObjects instantiated through the factory method SuperSegmentationDataset.get_super_segmentation_object. Sub-cellular structures of neurons accessed through such an object, e.g. synapses via SuperSegmentationObject.syn_ssv or mitochondria via SuperSegmentationObject.mis, will then carry the properties given in cache_properties. The method SuperSegmentationObject.typedsyns2mesh needs access to the synaptic sign of all the cell’s synapses, which requires zero file reads when the cache is enabled (meshes still have to be loaded, of course):

In [50]: sd_syn_ssv = SegmentationDataset('syn_ssv', cache_properties=('syn_sign', ))

In [51]: ssd = SuperSegmentationDataset(sd_lookup=dict(syn_ssv=sd_syn_ssv))

# choose any cell reconstruction, here the first in the ssv_ids array
In [52]: ssv = ssd.get_super_segmentation_object(ssd.ssv_ids[0])

In [53]: ssv.typedsyns2mesh() 

In case all data points are of interest, the recommended way to use the cache is via numpy array indexing:

In [89]: sd = SegmentationDataset('sv')

# load the mesh bounding box
In [90]: mesh_bb = sd.load_numpy_data('mesh_bb')  # N, 2, 3

# calculate the diagonal
In [91]: mesh_bb = np.linalg.norm(mesh_bb[:, 1] - mesh_bb[:, 0], axis=1)                                              

# get supervoxels with a bounding box diagonal above 5um
In [92]: filtered_ids = sd.ids[mesh_bb > 5e3]                                                                         

# percentage of supervoxels above threshold
In [93]: len(filtered_ids) / len(sd.ids)                                                                              
Out[93]: 0.0632114971144053

# Get the coordinates of the supervoxels above and the size in voxels of those below
In [94]: mask = mesh_bb > 5e3                                                                                         

In [95]: sv_coords_of_interest = sd.rep_coords[mask]                                                                  

In [96]: sv_sizes_vx_filtered = sd.sizes[~mask] 
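
The same filtering steps can be traced on a small synthetic array with known results. The bounding boxes and IDs below are made up; the array layout (N, 2, 3) with a minimum and a maximum corner per object follows the comment in the session above.

```python
import numpy as np

# Three synthetic bounding boxes, shape (N, 2, 3): [min corner, max corner].
mesh_bb = np.array([
    [[0.0, 0.0, 0.0], [300.0, 400.0, 0.0]],          # diagonal 500
    [[100.0, 100.0, 100.0], [200.0, 200.0, 200.0]],  # diagonal ~173.2
    [[0.0, 0.0, 0.0], [6000.0, 8000.0, 0.0]],        # diagonal 10000
])
ids = np.array([11, 12, 13])

# Length of the vector from the min corner to the max corner of each box.
diagonal = np.linalg.norm(mesh_bb[:, 1] - mesh_bb[:, 0], axis=1)

# Boolean mask with one entry per object; reusable for every cached array.
mask = diagonal > 5e3
above = ids[mask]
below = ids[~mask]
print(above, below)
```

Because all cache arrays share the ordering of the ID array, a single mask selects matching rows from any of them.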

Myelin prediction

The entire myelin prediction for a single cell reconstruction, including smoothing, is implemented and can be invoked manually as follows:

import os

from syconn import global_params
from syconn.reps.super_segmentation import *
from syconn.reps.super_segmentation_helper import map_myelin2coords, majorityvote_skeleton_property

# init. example data set; expand '~' explicitly, Python does not do so itself
global_params.wd = os.path.expanduser('~/SyConn/example_cube1/')

# initialize example cell reconstruction
ssd = SuperSegmentationDataset()
ssv = list(ssd.ssvs)[0]
ssv.load_skeleton()

# get myelin predictions at the skeleton node coordinates
myelinated = map_myelin2coords(ssv.skeleton["nodes"])
ssv.skeleton["myelin"] = myelinated
# this will generate a smoothed version at ``ssv.skeleton["myelin_avg10000"]``
majorityvote_skeleton_property(ssv, "myelin")
# store results as a KNOSSOS readable k.zip file
dest_path = os.path.expanduser('~/{}_myelin.k.zip'.format(ssv.id))
ssv.save_skeleton_to_kzip(dest_path=dest_path,
                          additional_keys=['myelin', 'myelin_avg10000'])
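
To give an intuition for the smoothing step, the sketch below applies a majority vote to binary node labels on an unbranched chain of nodes. It is an assumption-laden illustration: the helper majority_vote_chain is hypothetical, the interpretation of the suffix in myelin_avg10000 as a path-distance window is a guess, and the exact behaviour of majorityvote_skeleton_property on branched skeletons is not described here.

```python
import numpy as np


def majority_vote_chain(coords, labels, max_dist):
    """Smooth binary node labels along an unbranched chain of nodes.

    Hypothetical helper: each node receives the majority label of all
    nodes whose path distance along the chain is at most `max_dist`.
    """
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Path position of every node: cumulative length of the preceding edges.
    edge_len = np.linalg.norm(np.diff(coords, axis=0), axis=1)
    path_pos = np.concatenate([[0.0], np.cumsum(edge_len)])
    smoothed = np.empty_like(labels)
    for ix, pos in enumerate(path_pos):
        neighbours = np.abs(path_pos - pos) <= max_dist
        # Ties (mean == 0.5) keep label 0 in this sketch.
        smoothed[ix] = int(labels[neighbours].mean() > 0.5)
    return smoothed


# Seven nodes spaced 5000 units apart along x; node 3 is an isolated outlier.
coords = [[i * 5000.0, 0.0, 0.0] for i in range(7)]
labels = [1, 1, 1, 0, 1, 1, 1]
smoothed = majority_vote_chain(coords, labels, max_dist=10000)
print(smoothed)
```

In this toy case the single unmyelinated node is outvoted by its neighbours, which is the kind of noise suppression a majority vote is meant to provide.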