syconn.extraction package

syconn.extraction.cs_extraction_steps module

syconn.extraction.cs_extraction_steps.extract_contact_sites(chunk_size=None, log=None, max_n_jobs=None, cube_of_interest_bb=None, n_folders_fs=1000, cube_shape=None, overwrite=False, transf_func_sj_seg=None)

Extracts contact sites and their overlap with sj objects and stores them in a SegmentationDataset of type cs and syn respectively. If synapse type is available, this information will be stored as the voxel-ratio per class in the attribute dictionary of the syn objects (keys: sym_prop, asym_prop). These properties will further be used by combine_and_split_syn() which aggregates per-SV synapse fragments (syn) to per-SSV synapses (syn_ssv).
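The voxel-ratio idea can be sketched as follows. The helper name type_ratios and the flat list of per-voxel labels are assumptions made for this illustration and are not part of the SyConn API; the label values follow scheme (i) of the examples in this entry.

```python
from collections import Counter

def type_ratios(voxel_labels, sym_label=1, asym_label=2):
    """Return (sym_prop, asym_prop) as fractions of all voxels of one object.

    Hypothetical helper: `voxel_labels` holds the synapse type label
    found at every voxel of a single syn object.
    """
    counts = Counter(voxel_labels)
    n_voxels = len(voxel_labels)
    if n_voxels == 0:
        return 0.0, 0.0
    return counts[sym_label] / n_voxels, counts[asym_label] / n_voxels

# 2 symmetric, 1 asymmetric and 1 background voxel
sym_prop, asym_prop = type_ratios([1, 1, 2, 0])
attr_dict = {"sym_prop": sym_prop, "asym_prop": asym_prop}
```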

Examples

The synapse type labels and KnossosDatasets are defined in the config.yml file and can be set initially by changing the following attributes depending on how the synapse type prediction is stored.

(i) The type prediction is stored as a segmentation in a single data set with three labels (0: background, 1: symmetric, 2: asymmetric):

    kd_asym_path = root_dir + 'kd_asym_sym/'
    kd_sym_path = root_dir + 'kd_asym_sym/'
    key_val_pairs_conf = [
        ('cell_objects', {'sym_label': 1, 'asym_label': 2})
    ]
    generate_default_conf(working_dir, kd_sym=kd_sym_path, kd_asym=kd_asym_path,
                          key_value_pairs=key_val_pairs_conf)

(ii) The type prediction is stored as a segmentation in two data sets, each with two labels (0: background, 1: symmetric and 0: background, 1: asymmetric):

    kd_asym_path = root_dir + 'kd_asym/'
    kd_sym_path = root_dir + 'kd_sym/'
    key_val_pairs_conf = [
        ('cell_objects', {'sym_label': 1, 'asym_label': 1})
    ]
    generate_default_conf(working_dir, kd_sym=kd_sym_path, kd_asym=kd_asym_path,
                          key_value_pairs=key_val_pairs_conf)

(iii) The type prediction is stored as a probability map in the raw channel (uint8, range: 0..255) in one data set per type:

    kd_asym_path = root_dir + 'kd_asym/'
    kd_sym_path = root_dir + 'kd_sym/'
    key_val_pairs_conf = [
        ('cell_objects', {'sym_label': None, 'asym_label': None})
    ]
    generate_default_conf(working_dir, kd_sym=kd_sym_path, kd_asym=kd_asym_path,
                          key_value_pairs=key_val_pairs_conf)

Notes

  • Deletes existing KnossosDataset and SegmentationDataset of type ‘syn’ and ‘cs’!

  • Replaced find_contact_sites, extract_agg_contact_sites, syn_gen_via_cset and extract_synapse_type.

Parameters
  • chunk_size (Optional[Tuple[int, int, int]]) – Sub-cube volume which is processed at a time.

  • log (Optional[Logger]) – Logger.

  • max_n_jobs (Optional[int]) – Maximum number of jobs, only used as a lower bound.

  • cube_of_interest_bb (Optional[ndarray]) – Sub-volume of the data set which is processed. Default: Entire data set.

  • n_folders_fs (int) – Number of folders used for organizing supervoxel data.

  • cube_shape (Optional[Tuple[int]]) – Cube shape used within ‘syn’ and ‘cs’ KnossosDataset.

  • overwrite (bool) – Overwrite existing cache.

  • transf_func_sj_seg (Optional[Callable]) – Method that converts the cell organelle segmentation into a binary mask of background vs. sj foreground.

syconn.extraction.cs_processing_steps module

syconn.extraction.cs_processing_steps.cc_large_voxel_lists(voxel_list, cs_gap_nm, max_concurrent_nodes=5000, verbose=False)

This function identifies connected components within a list of voxels. It uses a k-d tree data structure to efficiently query the nearest neighbors of each voxel. It then groups voxels into connected components based on their proximity to each other.

Parameters
  • voxel_list (list) – A list of voxel coordinates.

  • cs_gap_nm (float) – The maximum distance between two voxels to consider them as part of the same connected component. In nanometers.

  • max_concurrent_nodes (int, optional) – The maximum number of nodes to process concurrently. Defaults to 5000.

  • verbose (bool, optional) – If True, print debug information. Defaults to False.

Returns

A list of sets, where each set contains the indices of voxels that belong to the same connected component.

Return type

list
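The grouping described above can be illustrated with a minimal sketch. It swaps the k-d tree query for a brute-force pairwise search combined with a union-find structure, so it is only suitable for small inputs. The function name connected_components and the scale argument (nm per voxel) are assumptions of this example, not the library's signature.

```python
import math

def connected_components(voxels, max_dist_nm, scale=(1.0, 1.0, 1.0)):
    """Group voxel indices whose scaled distance is <= max_dist_nm.

    Brute-force O(n^2) neighbour search stands in for the k-d tree query.
    `scale` (nm per voxel) is an assumed extra argument for this sketch.
    """
    parent = list(range(len(voxels)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    coords_nm = [tuple(c * s for c, s in zip(v, scale)) for v in voxels]
    for i in range(len(coords_nm)):
        for j in range(i + 1, len(coords_nm)):
            if math.dist(coords_nm[i], coords_nm[j]) <= max_dist_nm:
                parent[find(i)] = find(j)  # union the two groups

    components = {}
    for i in range(len(voxels)):
        components.setdefault(find(i), set()).add(i)
    return list(components.values())

voxels = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0)]
ccs = connected_components(voxels, max_dist_nm=15.0, scale=(10.0, 10.0, 20.0))
```

With a 10 nm voxel size along x, neighbouring voxels are 10 nm apart and fall into the same component, while the two pairs are 90 nm apart and stay separate.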

syconn.extraction.cs_processing_steps.classify_synssv_objects(wd, obj_version=None, log=None, nb_cpus=None)

This function classifies SSV contact sites into synaptic or non-synaptic using an RFC model and stores the result in the attribute dict of the syn_ssv objects. For requirements see synssv_o_features. It takes the working directory, object version, logger and number of CPUs as input and returns nothing.

Parameters
  • wd (str) – Working directory.

  • obj_version (str) – Object version.

  • log (Logger) – Logger.

  • nb_cpus (int) – Number of CPUs.

Returns

None

syconn.extraction.cs_processing_steps.collect_properties_from_ssv_partners(wd, obj_version=None, ssd_version=None, debug=False)

Collects axoness, cell types and spiness from synaptic partners and stores them in syn_ssv objects. Also maps syn_type_sym_ratio to the synaptic sign (-1 for asym., 1 for sym. synapses).

The following keys will be available in the attr_dict of syn_ssv typed SegmentationObject:

  • ‘partner_axoness’: Cell compartment type (axon: 1, dendrite: 0, soma: 2, en-passant bouton: 3, terminal bouton: 4) of the partner neurons.

  • ‘partner_spiness’: Spine compartment predictions (0: dendritic shaft, 1: spine head, 2: spine neck, 3: other) of both neurons.

  • ‘partner_spineheadvol’: Spinehead volume in µm^3.

  • ‘partner_celltypes’: Cell type of both neurons.

  • ‘latent_morph’: Local morphology embeddings of the pre- and post- synaptic partners.

Parameters
  • wd (str) – The working directory.

  • obj_version (str, optional) – The version of the object. Defaults to None.

  • ssd_version (str, optional) – The version of the super segmentation dataset. Defaults to None.

  • debug (bool, optional) – If True, the function will run in debug mode. Defaults to False.
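The sign convention stated above (-1 for asymmetric, 1 for symmetric) could be applied as in the following sketch. The helper name syn_sign_from_ratio and the 0.5 cut-off are assumptions of this example; the source only states the resulting sign values.

```python
def syn_sign_from_ratio(sym_ratio, threshold=0.5):
    """Return 1 for symmetric and -1 for asymmetric synapses.

    The 0.5 cut-off is an assumption of this sketch; the source only
    states the sign convention.
    """
    return 1 if sym_ratio > threshold else -1

signs = [syn_sign_from_ratio(r) for r in (0.9, 0.1)]
```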

syconn.extraction.cs_processing_steps.combine_and_split_cs(wd, ssd_version=None, cs_version=None, nb_cpus=None, n_folders_fs=10000, log=None, overwrite=False)

This function creates ‘cs_ssv’ objects from ‘cs’ objects. It computes connected cs-objects on SSV level and re-calculates their attributes (mesh_area, size, etc.). This method performs connected component analysis on the mesh of all cell-cell contacts instead of their voxels.

Notes

  • ‘rep_coord’ property is calculated as the mesh vertex closest to the center of mass of all mesh vertices.
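A minimal sketch of the ‘rep_coord’ rule from the note above, using plain tuples as vertices. The function name rep_coord_from_vertices is hypothetical.

```python
import math

def rep_coord_from_vertices(vertices):
    """Return the vertex closest to the center of mass of all vertices."""
    n = len(vertices)
    center = tuple(sum(v[d] for v in vertices) / n for d in range(3))
    return min(vertices, key=lambda v: math.dist(v, center))

vertices = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (1.5, 0.5, 0.0), (10.0, 0.0, 0.0)]
rep_coord = rep_coord_from_vertices(vertices)
```

Picking an existing vertex rather than the center of mass itself guarantees that the representative coordinate lies on the object.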

Parameters
  • wd (str) – The working directory.

  • ssd_version (str, optional) – The version of the super segmentation dataset. Defaults to None.

  • cs_version (str, optional) – The version of the contact site (‘cs’) dataset. Defaults to None.

  • nb_cpus (int, optional) – The number of CPUs to use. Defaults to None.

  • n_folders_fs (int, optional) – The number of folders in the file system. Defaults to 10000.

  • log (Logger, optional) – The logger to use. Defaults to None.

  • overwrite (bool, optional) – Whether to overwrite existing files. Defaults to False.

syconn.extraction.cs_processing_steps.combine_and_split_syn(wd, cs_gap_nm=300, ssd_version=None, syn_version=None, nb_cpus=None, n_folders_fs=10000, log=None, overwrite=False)

Creates ‘syn_ssv’ objects from ‘syn’ objects. It computes connected syn-objects on SSV level and aggregates the respective ‘syn’ attributes [‘cs_id’, ‘asym_prop’, ‘sym_prop’].

All objects of the resulting ‘syn_ssv’ SegmentationDataset contain the following attributes: [‘syn_sign’, ‘syn_type_sym_ratio’, ‘asym_prop’, ‘sym_prop’, ‘cs_ids’, ‘neuron_partners’]

Notes

  • ‘rep_coord’ property is calculated as the voxel (part of the object) closest to the center of mass of all object voxels.

  • ‘cs_id’/‘cs_ids’ is the same as syn_id (‘syn’ are just a subset of ‘cs’, preserving the IDs).

Parameters
  • wd (str) – The working directory.

  • cs_gap_nm (int, optional) – The gap in nm. Defaults to 300.

  • ssd_version (str, optional) – The version of the super segmentation dataset. Defaults to None.

  • syn_version (str, optional) – The version of the synapse dataset. Defaults to None.

  • nb_cpus (int, optional) – The number of CPUs to use. Defaults to None.

  • n_folders_fs (int, optional) – The number of folders in the file system. Defaults to 10000.

  • log (Logger, optional) – The logger for logging the progress and debugging information. Defaults to None.

  • overwrite (bool, optional) – If True, overwrites existing files. Defaults to False.

syconn.extraction.cs_processing_steps.connected_cluster_kdtree(voxel_coords, dist_intra_object, dist_inter_object, scale)

This function identifies connected components within N objects in two stages. In the first stage, it adds edges between all object voxels that are at most 2 voxels apart. The edges are added to a global graph which is used to calculate connected components. In the second stage, two connected components are considered close if their representative coordinates (a random voxel of each component) are at most dist_inter_object apart. Close connected components are then connected if the minimum distance between any of their voxels is smaller than dist_intra_object.

Parameters
  • voxel_coords (List[np.ndarray]) – List of numpy arrays in voxel coordinates.

  • dist_intra_object (float) – Maximum distance between two voxels of different synapse fragments to consider them the same object. In nm.

  • dist_inter_object (float) – Maximum distance between two objects to check for close voxels between them. In nm.

  • scale (np.ndarray) – Voxel sizes in nm (XYZ).

Returns

Connected components across all N input objects with at most dist_intra_object distance.

Return type

List[set]
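The second stage described above can be sketched for a single pair of components. This is an illustration under assumptions: the names to_nm and should_merge are hypothetical, the first voxel of each component serves as representative instead of a random one (to keep the example deterministic), and the minimum distance is found by brute force rather than with a k-d tree.

```python
import math

def to_nm(voxel, scale):
    """Convert a voxel coordinate to nanometers."""
    return tuple(c * s for c, s in zip(voxel, scale))

def should_merge(comp_a, comp_b, dist_intra_object, dist_inter_object, scale):
    """Decide whether two components (lists of voxel coordinates) are joined."""
    a_nm = [to_nm(v, scale) for v in comp_a]
    b_nm = [to_nm(v, scale) for v in comp_b]
    # Cheap pre-filter: compare one representative voxel per component.
    if math.dist(a_nm[0], b_nm[0]) > dist_inter_object:
        return False
    # Exact check: minimum distance over all voxel pairs (brute force).
    min_dist = min(math.dist(p, q) for p in a_nm for q in b_nm)
    return min_dist < dist_intra_object

scale = (10.0, 10.0, 20.0)
comp_a = [(0, 0, 0), (5, 0, 0)]
comp_b = [(20, 0, 0), (8, 0, 0)]
comp_c = [(100, 0, 0)]
merge_ab = should_merge(comp_a, comp_b, 50.0, 500.0, scale)
merge_ac = should_merge(comp_a, comp_c, 50.0, 500.0, scale)
```

The pre-filter keeps the expensive all-pairs comparison limited to components that are plausibly adjacent.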

syconn.extraction.cs_processing_steps.create_syn_rfc(sd_syn_ssv, path2file, overwrite=False, rfc_path_out=None, max_dist_vx=20)

Trains a random forest classifier (RFC) to distinguish between synaptic and non-synaptic objects. Features are generated from the objects in sd_syn_ssv associated with the annotated coordinates stored in path2file. The trained classifier is written to global_params.config.mpath_syn_rfc.

Parameters
  • sd_syn_ssv (segmentation.SegmentationDataset) – SegmentationDataset object of type syn_ssv. Used to identify synaptic object candidates annotated in the kzip/xls file at path2file.

  • path2file (str) – Path to kzip file with synapse labels as node comments (“non-synaptic”, “synaptic”; labels used for classifier are 0 and 1 respectively).

  • overwrite (bool) – If True, existing files will be replaced. Defaults to False.

  • rfc_path_out (str) – Filename for dumped RFC. If None, the default path is used.

  • max_dist_vx (int) – Maximum voxel distance between sample and target. Defaults to 20.

Returns

The trained random forest classifier and the feature and label data.

Return type

Tuple[ensemble.RandomForestClassifier, np.ndarray, np.ndarray]
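How annotated nodes might be turned into training labels is sketched below. It is an illustration only: match_annotations, the tuple layout of the annotations and the candidate dictionary are assumptions, and the actual feature extraction and classifier training are left out. The comment-to-label mapping and the max_dist_vx default follow the parameter descriptions above.

```python
import math

LABELS = {"non-synaptic": 0, "synaptic": 1}

def match_annotations(annotations, candidates, max_dist_vx=20):
    """Pair each annotated node with the nearest candidate object.

    annotations: list of (coordinate, comment) tuples.
    candidates: dict mapping a hypothetical object ID to its coordinate.
    Returns (object_ids, labels) for nodes with a candidate in range.
    """
    object_ids, labels = [], []
    for coord, comment in annotations:
        if comment not in LABELS:
            continue  # ignore nodes without a usable label
        obj_id, obj_coord = min(candidates.items(),
                                key=lambda item: math.dist(item[1], coord))
        if math.dist(obj_coord, coord) <= max_dist_vx:
            object_ids.append(obj_id)
            labels.append(LABELS[comment])
    return object_ids, labels

annotations = [((10, 10, 10), "synaptic"), ((200, 50, 5), "non-synaptic"),
               ((900, 900, 900), "synaptic")]
candidates = {7: (12, 10, 11), 8: (205, 48, 5)}
object_ids, labels = match_annotations(annotations, candidates)
```

The third annotation has no candidate within 20 voxels and is therefore dropped from the training set.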

syconn.extraction.cs_processing_steps.export_matrix(obj_version=None, dest_folder=None, threshold_syn=0, export_kzip=False, log=None)

Exports the connectivity matrix as a .csv file and optionally as a .kzip file.

Parameters
  • obj_version (str, optional) – Version of the object. Defaults to None.

  • dest_folder (str, optional) – Destination folder for the exported file. Defaults to None.

  • threshold_syn (float, optional) – Threshold for filtering synapses. Defaults to 0.

  • export_kzip (bool, optional) – If True, exports the connectivity matrix as a .kzip file. Note that this can result in large memory consumption. Defaults to False.

  • log (Logger, optional) – Logger for logging the process. Defaults to None.
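A thresholded CSV export could look like the sketch below. The column names, the row layout and the choice to compare the synapse probability with >= are all assumptions of this example; the source only states that threshold_syn filters synapses.

```python
import csv
import io

def write_connectivity_csv(synapses, out_file, threshold_syn=0.0):
    """Write one row per synapse whose probability reaches threshold_syn.

    synapses: iterable of (pre_id, post_id, syn_prob, size) tuples; this
    row layout is made up for the example.
    Returns the number of data rows written.
    """
    writer = csv.writer(out_file)
    writer.writerow(["pre_id", "post_id", "syn_prob", "size"])
    n_rows = 0
    for row in synapses:
        if row[2] >= threshold_syn:
            writer.writerow(row)
            n_rows += 1
    return n_rows

buffer = io.StringIO()
n_written = write_connectivity_csv(
    [(1, 2, 0.9, 120), (1, 3, 0.2, 40), (4, 2, 0.6, 75)],
    buffer, threshold_syn=0.5)
```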

syconn.extraction.cs_processing_steps.filter_relevant_syn(sd_syn, ssd, log)

Filters the intra-ssv contact sites (inside of an ssv, not between ssvs) that do not need to be agglomerated. This function is also applicable to cs.

Parameters
  • sd_syn (segmentation.SegmentationDataset) – The segmentation dataset of synapses.

  • ssd (super_segmentation.SuperSegmentationDataset) – The super segmentation dataset.

  • log (Logger) – The logger for logging the progress and debugging information.

Returns

A dictionary where the keys are encoded SSV partner IDs and the values are lists of SV synapse object IDs. See sv_id_to_partner_ids_vec() for decoding into SSV IDs.

Return type

Dict[int, list]
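The shape of the returned dictionary can be illustrated with a small sketch. It assumes that an ID packs its two partner IDs into the upper and lower 32 bits of one integer; this encoding, as well as the names encode_pair and group_inter_ssv_syn and the sv_to_ssv lookup, are assumptions of the example and should be checked against sv_id_to_partner_ids_vec().

```python
from collections import defaultdict

def encode_pair(id_a, id_b):
    """Pack two 32-bit IDs into one integer (assumed encoding)."""
    low, high = sorted((id_a, id_b))
    return (high << 32) | low

def group_inter_ssv_syn(syn_ids, sv_to_ssv):
    """Group SV-level synapse IDs by their encoded SSV partner pair."""
    rel = defaultdict(list)
    for syn_id in syn_ids:
        sv_a, sv_b = syn_id >> 32, syn_id & 0xFFFFFFFF
        ssv_a, ssv_b = sv_to_ssv[sv_a], sv_to_ssv[sv_b]
        if ssv_a == ssv_b:
            continue  # contact inside one SSV, nothing to agglomerate
        rel[encode_pair(ssv_a, ssv_b)].append(syn_id)
    return dict(rel)

sv_to_ssv = {10: 1, 11: 1, 20: 2, 21: 2}
syn_ids = [encode_pair(10, 20), encode_pair(11, 21), encode_pair(10, 11)]
groups = group_inter_ssv_syn(syn_ids, sv_to_ssv)
```

Both inter-SSV fragments end up under the same key for the SSV pair (1, 2), while the contact between supervoxels 10 and 11 is discarded because both belong to SSV 1.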

syconn.extraction.cs_processing_steps.map_objects_from_synssv_partners(wd, obj_version=None, ssd_version=None, n_jobs=None, debug=False, log=None, max_rep_coord_dist_nm=None)

This function maps sub-cellular objects of the synaptic partners of ‘syn_ssv’ objects and stores them in their attribute dict. The following keys will be available in the attr_dict of syn_ssv-typed SegmentationObject:

  • ‘n_mi_objs_%d’:

  • ‘n_mi_vxs_%d’:

  • ‘min_dst_mi_nm_%d’:

  • ‘n_vc_objs_%d’:

  • ‘n_vc_vxs_%d’:

  • ‘min_dst_vc_nm_%d’:

Parameters
  • wd (str) – The working directory.

  • obj_version (str, optional) – The version of the ‘syn_ssv’ dataset. Defaults to None.

  • ssd_version (str, optional) – The version of the ‘ssv’ dataset. Defaults to None.

  • n_jobs (int, optional) – The number of jobs to run in parallel. Defaults to None.

  • debug (bool, optional) – If True, print debug information. Defaults to False.

  • log (Logger, optional) – The logger to use for logging debug information. Defaults to None.

  • max_rep_coord_dist_nm (float, optional) – The maximum distance between the representative coordinate of a synapse and a sub-cellular object to consider them as connected. In nanometers. Defaults to None.

syconn.extraction.cs_processing_steps.synssv_o_featurenames()

Returns a list of feature names used for synapse prediction.

Returns

A list of feature names.

Return type

list

syconn.extraction.cs_processing_steps.synssv_o_features(synssv_o)

Collects syn_ssv features for synapse prediction using an RFC.

Parameters
  • synssv_o (SegmentationObject) – The syn_ssv object whose features are collected.

Returns

A list of features for the given SegmentationObject.

Return type

list

syconn.extraction.cs_processing_steps.write_conn_gt_kzips(conn, n_objects, folder)

This function writes .k.zip summary files of the connectivity matrix. It takes a connectivity matrix, the number of objects and a folder as input and returns nothing.

Parameters
  • conn – Connectivity matrix.

  • n_objects – Number of objects.

  • folder – Folder to write .k.zip files.

syconn.extraction.object_extraction_steps module

syconn.extraction.object_extraction_steps.apply_merge_list(cset, chunk_list, filename, hdf5names, merge_list_dict, debug, suffix='', n_chunk_jobs=None, nb_cpus=1)

Applies merge list to all chunks. This function is used to apply the merge list to all chunks in the chunkdataset.

Parameters
  • cset (ChunkDataset) – Instance of the chunkdataset.

  • chunk_list (list) – List of chunks this function should work on. If None, all chunks are used.

  • filename (str) – Name of the prediction file in the chunkdataset.

  • hdf5names (list) – List of labels to be extracted and processed from the prediction file.

  • merge_list_dict (dict) – Mergedict for each hdf5name.

  • debug (bool) – If true, multiprocessed steps only operate on one core using ‘map’ for better error messages.

  • suffix (str) – Suffix for the intermediate results.

  • n_chunk_jobs (int) – Number of total jobs.

  • nb_cpus (int) – Number of cores used per worker.

syconn.extraction.object_extraction_steps.export_cset_to_kd_batchjob(target_kd_paths, cset, name, hdf5names, n_cores=1, offset=None, size=None, stride=(512, 512, 512), overwrite=False, as_raw=False, fast_downsampling=False, n_max_job=None, unified_labels=False, orig_dtype=<class 'numpy.uint8'>, log=None, compresslevel=None)

This function exports a chunk dataset to a Knossos dataset in a batch job. It is a batch job version of the ChunkDataset.export_cset_to_kd method.

Notes

  • KnossosDataset needs to be initialized beforehand (see initialize_without_conf()).

  • Only works if data mag = 1.

Parameters
  • target_kd_paths (dict) – The target Knossos datasets.

  • cset (ChunkDataset) – The source chunk dataset.

  • name (str) – The name of the chunk dataset.

  • hdf5names (list) – The names of the HDF5 files.

  • n_cores (int, optional) – The number of cores to use. Defaults to 1.

  • offset (tuple, optional) – The offset for the chunk dataset. Defaults to None.

  • size (tuple, optional) – The size of the chunk dataset. Defaults to None.

  • stride (tuple, optional) – The stride for the chunk dataset. Defaults to (4 * 128, 4 * 128, 4 * 128).

  • overwrite (bool, optional) – Whether to overwrite existing data. Defaults to False.

  • as_raw (bool, optional) – Whether to save the data as raw data. Defaults to False.

  • fast_downsampling (bool, optional) – Whether to use fast downsampling. Defaults to False.

  • n_max_job (int, optional) – The maximum number of jobs. Defaults to None.

  • unified_labels (bool, optional) – Whether to use unified labels. Defaults to False.

  • orig_dtype (np.dtype, optional) – The original data type. Defaults to np.uint8.

  • log (str, optional) – The log file. Defaults to None.

  • compresslevel (int, optional) – The compression level for segmentation data. Defaults to None.

Returns

None

syconn.extraction.object_extraction_steps.gauss_threshold_connected_components(*args, **kwargs)

This function is an alias for the object_segmentation function. It takes in any number of arguments and keyword arguments and passes them directly to the object_segmentation function.

syconn.extraction.object_extraction_steps.make_merge_list(hdf5names, stitch_list, max_labels)

Creates a merge list from a stitch list by mapping all connected ids to one id. This function is used to create a list of labels that need to be merged based on the stitch list.

Parameters
  • hdf5names (list) – List of labels to be extracted and processed from the prediction file.

  • stitch_list (dict) – Contains pairs of overlapping component ids for each hdf5name.

  • max_labels (dict) – Contains the number of different component ids for each hdf5name.

Returns

  • merge_dict (dict) – Merge list for each hdf5name.

  • merge_list_dict (dict) – Merge dict for each hdf5name.
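Mapping all connected ids to one id can be sketched for a single hdf5name as follows. The function name build_merge_list, the pair-list input and the choice of the smallest id as representative are assumptions of this illustration.

```python
def build_merge_list(stitch_pairs, max_label):
    """Return a list where entry i is the id that label i is merged into.

    Connected labels are all mapped to the smallest id of their group.
    """
    merge_list = list(range(max_label + 1))

    def find(i):
        while merge_list[i] != i:
            i = merge_list[i]
        return i

    for a, b in stitch_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # keep the smaller id as representative
            merge_list[max(root_a, root_b)] = min(root_a, root_b)
    # flatten so every label points directly at its representative
    return [find(i) for i in range(max_label + 1)]

merge_list = build_merge_list([(1, 4), (4, 5), (2, 3)], max_label=6)
```

Labels 1, 4 and 5 collapse to id 1, labels 2 and 3 to id 2, and labels without a stitch partner keep their own id. Applying the result to a chunk is then a simple lookup per voxel.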

syconn.extraction.object_extraction_steps.make_stitch_list(cset, filename, hdf5names, chunk_list, stitch_overlap, overlap, debug, suffix='', nb_cpus=None, overlap_thresh=0, n_chunk_jobs=None)

Creates a stitch list for the overlap region between chunks. This function is used to identify the overlapping regions between chunks and create a list of these regions for further processing.

Parameters
  • cset (ChunkDataset) – Instance of the chunkdataset.

  • filename (str) – Name of the prediction file in the chunkdataset.

  • hdf5names (list) – List of labels to be extracted and processed from the prediction file.

  • chunk_list (list) – List of chunks this function should work on. If None, all chunks are used.

  • stitch_overlap (np.array) – Defines the overlap with neighbouring chunks that is left for stitching.

  • overlap (np.array) – Defines the overlap with neighbouring chunks that is left for later processing steps.

  • debug (bool) – If true, multiprocessed steps only operate on one core using ‘map’ for better error messages.

  • suffix (str) – Suffix for the intermediate results.

  • nb_cpus (int) – Number of cores used per worker.

  • overlap_thresh (float) – Overlap fraction of object in different chunks to be considered stitched.

  • n_chunk_jobs (int) – Number of total jobs.

Returns

stitch_list, a dictionary of overlapping component ids.

Return type

dict
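Finding overlapping component ids in the shared region of two neighbouring chunks could be sketched like this. The function name stitch_pairs, the 2D list input and the definition of the overlap fraction (shared voxels relative to the smaller object within the overlap) are assumptions of this example.

```python
from collections import Counter

def stitch_pairs(face_a, face_b, overlap_thresh=0.0):
    """Find label pairs that coincide in the overlap of two chunks.

    face_a, face_b: equally shaped 2D lists of labels (0 = background)
    covering the same voxels, one from each neighbouring chunk.
    A pair is kept if the shared voxels exceed overlap_thresh times the
    size of the smaller of the two objects within the overlap.
    """
    flat_a = [v for row in face_a for v in row]
    flat_b = [v for row in face_b for v in row]
    size_a, size_b = Counter(flat_a), Counter(flat_b)
    shared = Counter((a, b) for a, b in zip(flat_a, flat_b) if a and b)
    return sorted(pair for pair, n in shared.items()
                  if n / min(size_a[pair[0]], size_b[pair[1]]) > overlap_thresh)

face_a = [[1, 1, 1, 1],
          [2, 2, 0, 0]]
face_b = [[7, 0, 0, 0],
          [7, 7, 7, 7]]
all_pairs = stitch_pairs(face_a, face_b)
strict_pairs = stitch_pairs(face_a, face_b, overlap_thresh=0.5)
```

With the default threshold every touching pair is reported; raising it to 0.5 drops the pair (1, 7), which shares only one of four voxels.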

syconn.extraction.object_extraction_steps.make_unique_labels(cset, filename, hdf5names, chunk_list, max_nb_dict, chunk_translator, debug, suffix='', n_chunk_jobs=None, nb_cpus=1)

This function makes labels unique across chunks.

Parameters
  • cset (ChunkDataset instance) – Instance of the ChunkDataset.

  • filename (str) – Filename of the prediction in the ChunkDataset.

  • hdf5names (list) – List of names/ labels to be extracted and processed from the prediction file.

  • chunk_list (list) – Selective list of chunks this function should work on. If None, all chunks are used.

  • max_nb_dict (dict) – Maps each chunk id to the integer that needs to be added to all its entries.

  • chunk_translator (dict) – Remapping from chunk ids to position in chunk_list.

  • debug (bool) – If true, multiprocessed steps only operate on one core using ‘map’ which allows for better error messages.

  • suffix (str) – Suffix for the intermediate results.

  • n_chunk_jobs (int) – Number of total jobs.

  • nb_cpus (int) – Number of cores used per worker.
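One way to obtain per-chunk offsets of the kind described for max_nb_dict is a running sum of the label counts of all preceding chunks. The sketch below uses hypothetical helpers (label_offsets, make_unique) and flat label lists; how SyConn derives the offsets is not stated in this entry.

```python
from itertools import accumulate

def label_offsets(max_labels_per_chunk):
    """Offset for chunk i = total number of labels in chunks 0..i-1."""
    return [0] + list(accumulate(max_labels_per_chunk))[:-1]

def make_unique(chunk_labels, offset):
    """Shift all foreground labels of one chunk; background 0 stays 0."""
    return [label + offset if label else 0 for label in chunk_labels]

chunks = [[0, 1, 2], [1, 1, 0], [3, 0, 1]]
offsets = label_offsets([max(c) for c in chunks])
unique = [make_unique(c, o) for c, o in zip(chunks, offsets)]
```

After the shift, label 1 of the second chunk becomes 3 and no foreground id occurs in more than one chunk.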

syconn.extraction.object_extraction_steps.object_segmentation(cset, filename, hdf5names, overlap='auto', sigmas=None, thresholds=None, chunk_list=None, debug=False, swapdata=False, prob_kd_path_dict=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, fast_load=False, suffix='', nb_cpus=None, transform_func=None, transform_func_kwargs=None, transf_func_kd_overlay=None, load_from_kd_overlaycubes=False, n_chunk_jobs=None)

Extracts connected components from probability maps using a default procedure of Gaussian filtering, thresholding, and connected components analysis. If a transform_func is provided, it is applied by each worker on the chunk’s probability map to generate the segmentation instead.

In case of vesicle clouds, the membrane segmentation is used to cut connected vesicle clouds across cells apart (only if membrane segmentation is provided).

Parameters
  • cset (ChunkDataset) – Instance of the chunkdataset.

  • filename (str) – Filename of the prediction in the ChunkDataset.

  • hdf5names (list) – List of names/labels to be extracted and processed from the prediction file.

  • overlap (str or np.array) – Defines the overlap with neighbouring chunks left for later processing steps.

  • sigmas (list) – Defines the sigmas of the gaussian filters applied to the probability maps.

  • thresholds (list) – Threshold for cutting the probability map.

  • chunk_list (list) – Selective list of chunks for this function to work on.

  • debug (bool) – If true, multiprocessed steps only operate on one core using ‘map’.

  • swapdata (bool) – If true, an x-z swap is applied to the data prior to processing.

  • prob_kd_path_dict (dict) – Dictionary containing probability knossosdataset paths.

  • membrane_filename (str) – Filename of the prediction in the chunkdataset for membrane segmentation.

  • membrane_kd_path (str) – Path to the knossosdataset containing a membrane segmentation.

  • hdf5_name_membrane (str) – Key to access the data in the saved chunk when using the membrane_filename.

  • fast_load (bool) – If true, the data of chunk is loaded without checking for enough offset.

  • suffix (str) – Suffix for the intermediate results.

  • nb_cpus (int) – Number of CPUs to use.

  • transform_func (callable) – Segmentation method which is applied.

  • transform_func_kwargs (dict) – Key word arguments for transform_func.

  • transf_func_kd_overlay (callable) – Method applied to cube data if load_from_kd_overlaycubes is True.

  • load_from_kd_overlaycubes (bool) – Load prob/seg data from overlaycubes instead of raw cubes.

  • n_chunk_jobs (int) – Number of total jobs.

Returns

  • results_as_list (list) – List containing information about the number of connected components in each chunk.

  • overlap (np.array) – Overlap array.

  • stitch_overlap (np.array) – Stitch overlap array.
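The thresholding and connected components steps of the default procedure can be sketched on a tiny 2D map. This is a simplified illustration: Gaussian filtering is left out, the data is 2D instead of 3D, 4-connectivity is assumed, and the function name threshold_and_label is hypothetical.

```python
from collections import deque

def threshold_and_label(prob_map, threshold):
    """Label 4-connected foreground regions of a 2D probability map.

    prob_map: 2D list with values in 0..255. Returns (labels, n_components).
    """
    n_rows, n_cols = len(prob_map), len(prob_map[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    n_components = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if prob_map[r][c] <= threshold or labels[r][c]:
                continue  # background or already labeled
            n_components += 1
            labels[r][c] = n_components
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and prob_map[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = n_components
                        queue.append((ny, nx))
    return labels, n_components

prob_map = [[200, 210, 0, 0],
            [0, 190, 0, 180],
            [0, 0, 0, 220]]
labels, n_components = threshold_and_label(prob_map, threshold=127)
```

Each chunk processed this way yields local labels starting at 1, which is why later steps have to make labels unique and stitch objects across chunk borders.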

syconn.extraction.object_extraction_wrapper module

syconn.extraction.object_extraction_wrapper.calculate_chunk_numbers_for_box(cset, offset, size)

This function calculates the chunk ids that are (partly) contained in the defined volume. It takes in a ChunkDataset, an offset of the volume to the origin, and the size of the volume. It returns a list of chunk ids and a dictionary with reverse mapping.

Parameters
  • cset (ChunkDataset) – The ChunkDataset to calculate chunk ids for.

  • offset (np.array) – The offset of the volume to the origin.

  • size (np.array) – The size of the volume.

Returns

  • chunk_list (list) – The list of chunk ids.

  • dictionary (dict) – A dictionary with reverse mapping.
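The box-to-chunk lookup can be sketched with integer division. The sketch assumes that chunks tile the volume from the origin and are numbered in x-fastest order; the function name chunk_ids_for_box, its arguments and this numbering are assumptions and may differ from the ChunkDataset implementation.

```python
from itertools import product

def chunk_ids_for_box(chunk_size, grid_shape, offset, size):
    """Return (chunk_list, chunk_translator) for an axis-aligned box.

    Assumes chunks tile the volume from the origin and are numbered in
    x-fastest order; the real ChunkDataset numbering may differ.
    """
    ranges = []
    for dim in range(3):
        first = max(offset[dim] // chunk_size[dim], 0)
        last = min((offset[dim] + size[dim] - 1) // chunk_size[dim],
                   grid_shape[dim] - 1)
        ranges.append(range(first, last + 1))
    chunk_list = [x + grid_shape[0] * (y + grid_shape[1] * z)
                  for z, y, x in product(ranges[2], ranges[1], ranges[0])]
    # reverse mapping: chunk id -> position in chunk_list
    chunk_translator = {cid: pos for pos, cid in enumerate(chunk_list)}
    return chunk_list, chunk_translator

chunk_list, chunk_translator = chunk_ids_for_box(
    chunk_size=(128, 128, 128), grid_shape=(4, 4, 2),
    offset=(100, 0, 0), size=(100, 128, 130))
```

The box spans two chunks along x and two along z, so four chunk ids are returned even though it only partly covers each of them.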

syconn.extraction.object_extraction_wrapper.from_probabilities_to_kd(target_kd_paths, cset, filename, hdf5names, prob_kd_path_dict=None, load_from_kd_overlaycubes=False, transf_func_kd_overlay=None, log=None, overlap='auto', sigmas=None, thresholds=None, debug=False, swapdata=False, offset=None, size=None, suffix='', transform_func=None, func_kwargs=None, n_cores=None, overlap_thresh=0, stitch_overlap=None, membrane_filename=None, membrane_kd_path=None, hdf5_name_membrane=None, n_chunk_jobs=None)

Converts classified or predicted data into a ChunkDataset or KnossosDataset(s). The ChunkDataset is used to store intermediate extraction results such as per-cube segmentation, stitched results, and globally unique segmentation. The function requires pre-initialized KnossosDatasets given by target_kd_paths.

Parameters
  • target_kd_paths (Optional[Dict[str, str]]) – Paths to pre-initialized output KnossosDatasets.

  • cset (chunky.ChunkDataset) – ChunkDataset used for object extraction; may contain source data.

  • filename (str) – Base name used to store the extracted data in cset.

  • hdf5names (List[str]) – Keys used to store intermediate extraction results.

  • prob_kd_path_dict (Optional[Dict[str, str]]) – Paths to source KnossosDatasets.

  • load_from_kd_overlaycubes (bool) – If True, load prob/seg data from overlaycubes instead of raw cubes.

  • transf_func_kd_overlay (Optional[Dict[str, Callable]]) – Method applied to cube data if load_from_kd_overlaycubes is True.

  • log (Optional[Logger]) – Logger for logging events.

  • overlap (str) – Defines overlap with neighbouring chunks left for later processing steps.

  • sigmas (Optional[list]) – Defines sigmas of Gaussian filters applied to probability maps.

  • thresholds (Optional[list]) – Threshold for cutting probability map.

  • debug (bool) – If True, multiprocessing steps only operate on one core using ‘map’.

  • swapdata (bool) – If True, an x-z swap is applied to data prior to processing.

  • offset (Optional[np.ndarray]) – Offset of processed volume.

  • size (Optional[np.ndarray]) – Size of processed volume of dataset starting at offset.

  • suffix (str) – Suffix used for intermediate processing steps.

  • transform_func (Optional[Callable]) – Segmentation method applied.

  • func_kwargs (Optional[dict]) – Keyword arguments for transform_func.

  • n_cores (Optional[int]) – Number of cores used for each job.

  • overlap_thresh (Optional[int]) – Overlap fraction of object in different chunks to be considered stitched.

  • stitch_overlap (Optional[int]) – Volume evaluated during stitching procedure.

  • membrane_filename (str) – Filename of prediction in chunkdataset for accessing membrane segmentation.

  • membrane_kd_path (str) – Path to knossosdataset containing a membrane segmentation.

  • hdf5_name_membrane (str) – Key to access data in saved chunk when membrane_filename is set.

  • n_chunk_jobs (int) – Number of jobs.

Returns

None

syconn.extraction.object_extraction_wrapper.generate_subcell_kd_from_proba(subcell_names, chunk_size=None, transf_func_kd_overlay=None, load_cellorganelles_from_kd_overlaycubes=False, cube_of_interest_bb=None, cube_shape=None, log=None, overwrite=False, **kwargs)

This function generates a connected components segmentation for the given sub-cellular structures (e.g. [‘mi’, ‘sj’, ‘vc’]) as KnossosDatasets. The source data must be KnossosDatasets whose paths are defined in global_params.config['paths'] (e.g. key kd_mi_path for mitochondria). For each co in subcell_names, the resulting KD is stored at "{}/knossosdatasets/{}_seg/".format(global_params.config.working_dir, co). See from_probabilities_to_kd() for details of the conversion process from the initial probability map to the SV segmentation. The default procedure is thresholding followed by connected components; thresholds are set via the config.yml file, see syconn.global_params.config['cell_objects']["probathresholds"] of an initialized DynConfig object.
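The output locations follow directly from the format string quoted above. In the sketch below the helper name subcell_seg_paths and the working directory /tmp/example_wd are placeholders chosen for illustration.

```python
def subcell_seg_paths(working_dir, subcell_names):
    """Map each sub-cellular structure to its output KnossosDataset path."""
    return {co: "{}/knossosdatasets/{}_seg/".format(working_dir, co)
            for co in subcell_names}

# "/tmp/example_wd" is a placeholder working directory
paths = subcell_seg_paths("/tmp/example_wd", ["mi", "sj", "vc"])
```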

Parameters
  • subcell_names (List[str]) – List of subcellular structures to generate segmentation for.

  • chunk_size (Optional[Union[list, tuple]]) – Size of the chunks to be processed.

  • transf_func_kd_overlay (Optional[Dict[str, Callable]]) – Transformation function for overlay.

  • load_cellorganelles_from_kd_overlaycubes (bool) – Flag to load cell organelles from overlay cubes.

  • cube_of_interest_bb (Optional[Tuple[np.ndarray]]) – Bounding box of the cube of interest.

  • cube_shape (Optional[Tuple[int]]) – Shape of the cube.

  • log (Logger) – Logger for logging the process.

  • overwrite (bool) – Flag to overwrite existing data.

  • **kwargs – Additional keyword arguments.

Returns

None