iss_preprocess.pipeline package¶
Submodules¶
iss_preprocess.pipeline.ara_registration module¶
- iss_preprocess.pipeline.ara_registration.check_reg(data_path, save_folder, rois=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
- iss_preprocess.pipeline.ara_registration.crop_overview_registration(data_path, rois=None, overview_prefix='DAPI_1_1')¶
Crop the registered overview to the same size as the reference
- Parameters:
data_path (str) – Relative path to data
rois (list, optional) – List of rois to crop. Defaults to None.
overview_prefix (str, optional) – Prefix of the overview image. Defaults to “DAPI_1_1”.
- Returns:
List of cropped images
- Return type:
imgs (list)
- iss_preprocess.pipeline.ara_registration.find_roi_position_on_cryostat(data_path)¶
Find the A/P position of each ROI relative to the first collected slice
The section order is guessed from the sign of section_thickness_um: positive for antero-posterior slicing (starting from the olfactory bulb), negative for the opposite direction.
- Parameters:
data_path (str) – Relative path to the data
- Returns:
roi_slice_pos_um (dict): For each ROI, the slice depth in um relative to the first collected slice
min_step (float): Minimum thickness between two slices
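The sign convention described above can be illustrated with a minimal sketch. The function name `slice_depths_um`, the `roi_slice_number` mapping and the assumption that slice numbers follow collection order are hypothetical and are not taken from iss_preprocess.

```python
def slice_depths_um(roi_slice_number, section_thickness_um):
    """Depth of each ROI relative to the first collected slice.

    roi_slice_number: dict mapping ROI -> slice number (hypothetical input
        format, assumed to follow collection order).
    section_thickness_um: signed thickness; positive for antero-posterior
        slicing, negative for the opposite direction.
    """
    first = min(roi_slice_number.values())
    depths = {
        roi: (number - first) * section_thickness_um
        for roi, number in roi_slice_number.items()
    }
    # Smallest spacing between two distinct slice depths
    unique = sorted(set(depths.values()))
    steps = [b - a for a, b in zip(unique, unique[1:])]
    # With a single slice there is no spacing; fall back to the thickness
    min_step = min(steps) if steps else abs(section_thickness_um)
    return depths, min_step
```

With a negative thickness the depths become negative, which is one possible way to encode the reversed slicing direction.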
- iss_preprocess.pipeline.ara_registration.load_coordinate_image(data_path, roi, full_scale=False, registered=True, return_fname=False)¶
Load the 3 channel image of ARA coordinates for roi
The reference atlas is first registered to a downsampled version of the overview, which is then registered to the normal acquisition. The coordinates of the overview can be loaded with registered=False.
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the ROI
full_scale (bool, optional) – If True, returns the full scale image, otherwise the downsampled version used for registration. Defaults to False.
registered (bool, optional) – If True, load the registered coordinates, otherwise the coordinates of the overview, before shifting/cropping. Defaults to True.
return_fname (bool, optional) – If True, return the filename of the image. Defaults to False.
- Returns:
3 channel image of ARA coordinates
- Return type:
coords (np.ndarray)
- iss_preprocess.pipeline.ara_registration.load_registration_reference(data_path, roi)¶
Load the registration reference image of one ROI
This is the downsampled version of the overview image used for registration.
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the ROI
- Returns:
Registration reference image
- Return type:
ref (np.ndarray)
- iss_preprocess.pipeline.ara_registration.load_registration_reference_metadata(data_path, roi)¶
Load metadata file associated with registration reference of one ROI
This is the “registration_reference_r{roi}_sl{slice_number}.yml” file that contains shape and downsampling info.
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the roi
- Returns:
Content of the metadata yml file
- Return type:
metadata (dict)
- iss_preprocess.pipeline.ara_registration.make_area_image(data_path, roi, atlas_size=10, full_scale=False, reload=True, registered=True)¶
Generate an image with area ID in each pixel
- Parameters:
data_path (str) – Relative path to data
roi (int) – Roi number to generate
atlas_size (int, optional) – Pixel size of the atlas used to find area id. Defaults to 10.
full_scale (bool, optional) – If True, returns the full scale image, otherwise the downsampled version used for registration. Defaults to False.
reload (bool, optional) – If True, reload the area image, otherwise recompute it. Valid only if full_scale is False. Defaults to True.
registered (bool, optional) – If True, load the registered coordinates, otherwise the coordinates of the overview, before shifting/cropping. Defaults to True.
- Returns:
Image with area id of each pixel
- Return type:
area_id (np.array)
- iss_preprocess.pipeline.ara_registration.overview_single_roi(data_path, roi, slice_id, prefix, chan2use=(0, 1, 2, 3), sigma_blur=10, agg_func=<function nanmean>, ref_prefix='genes_round', subresolutions=5, max_pixel_size=2, non_similar_overview=False)¶
Stitch and save a single ROI overview for use in atlas registration
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the ROI
slice_id (int) – Slice number to stitch
prefix (str) – Prefix of the acquisition to plot.
chan2use (tuple, optional) – Channels to use for stitching. Defaults to (0, 1, 2, 3).
sigma_blur (int, optional) – Sigma for gaussian blur. Defaults to 10.
agg_func (function, optional) – Aggregation function to apply across channels. Defaults to np.nanmean. Unused if non_similar_overview is True.
ref_prefix (str, optional) – Prefix of the reference image. Defaults to “genes_round”.
subresolutions (int, optional) – Number of subresolutions to save. Defaults to 5.
max_pixel_size (int, optional) – Maximum pixel size for the pyramid. Defaults to 2.
non_similar_overview (bool, optional) – If True, stitch the overview tiles with the stitch_tiles function rather than stitch_registered, which requires tile by tile registration to the reference. Defaults to False.
- iss_preprocess.pipeline.ara_registration.register_overview_to_reference(data_path, roi, channel, overview_prefix='DAPI_1_1', *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Register the overview to the reference image
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the ROI
channel (int) – Channel to use for registration.
downsample (int, optional) – Downsample factor. Defaults to 3.
overview_prefix (str, optional) – Prefix of the overview image. Defaults to “DAPI_1_1”.
- Returns:
shift (np.ndarray): Shift in x and y
final_shape (tuple): Final shape of the stitched images
stitched_fixed (np.ndarray): Stitched reference image
stitched_moving (np.ndarray): Stitched overview image
- iss_preprocess.pipeline.ara_registration.spots_ara_infos(data_path, spots, roi, atlas_size=10, acronyms=True, inplace=True, full_scale_coordinates=False, reload=True, verbose=True)¶
Add ARA coordinates and area ID to spots dataframe
- Parameters:
data_path (str) – Relative path to data
spots (pd.DataFrame) – Spots dataframe
atlas_size (int, optional) – Atlas size (10, 25 or 50) used to find area borders. Defaults to 10.
acronyms (bool, optional) – Add an acronym column with area name. Defaults to True.
inplace (bool, optional) – Add the columns to spots in place or return a copy. Defaults to True.
full_scale_coordinates (bool, optional) – If True, use the full scale image to find coordinates, otherwise the downsampled version used for registration. Defaults to False.
reload (bool, optional) – If True, reload the area image, otherwise recompute it. Valid only if full_scale_coordinates is False. Defaults to True.
verbose (bool, optional) – Print progress. Defaults to True.
- Returns:
Reference to or copy of the spots dataframe with four more columns: ara_x, ara_y, ara_z, and area_id
- Return type:
spots (pd.DataFrame)
iss_preprocess.pipeline.hybridisation module¶
- iss_preprocess.pipeline.hybridisation.estimate_channel_correction_hybridisation(data_path, prefix=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Compute grayscale value distribution and normalisation factors for all hybridisation rounds.
Each of the correction_tiles listed in ops is filtered before being used to compute the distribution of pixel values. Normalisation factors that equalise these distributions across channels and rounds are defined as the ops[“correction_quantile”] quantile of the distribution.
- Parameters:
data_path (str or Path) – Relative path to the data folder
prefix (list, optional) – List of prefixes of hybridisation rounds to process. If None, all hybridisation rounds are processed. Defaults to None.
- Returns:
pixel_dist (np.array): A 65536 x Nch x Nrounds distribution of grayscale values for filtered stacks
norm_factors (np.array): A Nch x Nrounds array of normalisation factors
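As a hedged illustration of how a normalisation factor can be read off a grayscale histogram, the sketch below takes the quantile of a distribution stored as counts per gray value. The function names and the nested-list layout (`pixel_dist[ch][rnd]` holding one histogram) are assumptions made for readability; the function documented above returns a 65536 x Nch x Nrounds array and may compute the quantile differently.

```python
def quantile_from_histogram(counts, quantile):
    """Smallest gray value whose cumulative fraction reaches `quantile`."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    target = quantile * total
    cumulative = 0
    for gray_value, count in enumerate(counts):
        cumulative += count
        if cumulative >= target:
            return gray_value
    return len(counts) - 1


def normalisation_factors(pixel_dist, quantile):
    """One factor per channel and round.

    pixel_dist[ch][rnd] is a histogram (list of counts per gray value);
    this layout is hypothetical.
    """
    return [
        [quantile_from_histogram(hist, quantile) for hist in per_round]
        for per_round in pixel_dist
    ]
```

Dividing each channel and round by its factor would then bring the chosen quantile of every distribution to the same value.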
- iss_preprocess.pipeline.hybridisation.extract_hyb_spots_all(data_path)¶
Start sbatch jobs to detect hybridisation spots for each hybridisation round and ROI.
- Parameters:
data_path (str) – Relative path to data.
- iss_preprocess.pipeline.hybridisation.extract_hyb_spots_roi(data_path, prefix, roi)¶
Detect hybridisation spots for a given hybridisation round and ROI.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Prefix of the hybridisation round, e.g. “hybridisation_1_1”.
roi (int) – ID of the ROI to process, as specified in MicroManager (i.e. 1-based)
- iss_preprocess.pipeline.hybridisation.extract_hyb_spots_tile(data_path, tile_coors, prefix, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Detect hybridisation spots for a given tile.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple) – Coordinates of tile to load: ROI, Xpos, Ypos.
prefix (str) – Prefix of the hybridisation round, e.g. “hybridisation_1_1”.
- iss_preprocess.pipeline.hybridisation.hyb_spot_cluster_means(data_path, prefix)¶
Estimate bleedthrough matrices for hybridisation spots. Spot colors for each dye are initialized based on the metadata in the hybridisation probe list.
Uses tiles specified in ops[“barcode_ref_tiles”].
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Prefix of hybridisation round, e.g. “hybridisation_1_1”.
- Returns:
numpy.ndarray: Nprobes x Nch bleedthrough matrix.
pandas.DataFrame: DataFrame of all detected spots across all tiles.
list: List of gene names based on probe metadata.
- iss_preprocess.pipeline.hybridisation.load_and_register_hyb_tile(data_path, tile_coors=(1, 0, 0), prefix='hybridisation_1_1', suffix='max', filter_r=(2, 4), correct_illumination=False, correct_channels=False, corrected_shifts='best')¶
Load hybridisation tile and align channels. Optionally, filter, correct illumination and channel brightness.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple, optional) – Coordinates of tile to load: ROI, Xpos, Ypos. Defaults to (1, 0, 0).
prefix (str, optional) – Prefix of the hybridisation round. Defaults to “hybridisation_1_1”.
suffix (str, optional) – Filename suffix corresponding to the z-projection to use. Defaults to “max”.
filter_r (tuple, optional) – Inner and outer radius for the Hanning filter. If False, stack is not filtered. Defaults to (2, 4).
correct_illumination (bool, optional) – Whether to correct vignetting. Defaults to False.
correct_channels (bool, optional) – Whether to normalize channel brightness. Defaults to False.
corrected_shifts (str, optional) – Which shift to use. One of reference, single_tile, ransac, or best. Defaults to ‘best’.
- Returns:
numpy.ndarray: X x Y x Nch image stack.
numpy.ndarray: X x Y boolean mask, identifying bad pixels that were not imaged for all channels (due to registration offsets) and should be discarded during analysis.
- iss_preprocess.pipeline.hybridisation.setup_hyb_spot_calling(data_path, prefix=None, vis=True, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Prepare and save bleedthrough matrices for hybridisation rounds.
- Parameters:
data_path (str) – Relative path to data
prefix (list, optional) – List of prefixes of hybridisation rounds to process. If None, all hybridisation rounds are processed. Defaults to None.
vis (bool, optional) – Whether to generate diagnostic plots. Defaults to True.
iss_preprocess.pipeline.pipeline module¶
- iss_preprocess.pipeline.pipeline.call_spots(data_path, genes=True, barcodes=True, hybridisation=True, force_redo=False, setup_only=False, use_slurm=True)¶
Master method to run spot calling.
Must be run after iss project-and-average and iss register.
- Parameters:
data_path (str) – Relative path to the data folder
genes (bool, optional) – Run genes spot calling. Defaults to True.
barcodes (bool, optional) – Run barcode calling. Defaults to True.
hybridisation (bool, optional) – Run hybridisation spot calling. Defaults to True
force_redo (bool, optional) – Redo all processing steps? Defaults to False.
setup_only (bool, optional) – Only set up the spot calling, do not run it. Defaults to False.
use_slurm (bool, optional) – Whether to use SLURM to run the jobs. Defaults to True.
- iss_preprocess.pipeline.pipeline.correct_shifts(data_path, prefix, use_slurm=True, job_dependency=None)¶
Correct X-Y shifts using robust regression across tiles.
- iss_preprocess.pipeline.pipeline.create_all_single_averages(data_path, n_batch, todo=('genes_rounds', 'barcode_rounds', 'fluorescence', 'hybridisation'), to_average=None, dependency=None, use_slurm=True, force_redo=False)¶
Average all tiffs in each folder and then all folders by acquisition type
- Parameters:
data_path (str) – Path to data, relative to project.
n_batch (int) – Number of batches to average before taking their median. If None, will do as many batches as images.
todo (tuple) – Types of acquisition to process. Defaults to (“genes_rounds”, “barcode_rounds”, “fluorescence”, “hybridisation”). Ignored if to_average is not None.
to_average (list, optional) – List of folders to average. If None, will average all folders listed in metadata. Defaults to None.
dependency (list, optional) – List of job IDs to wait for before starting the current job. Defaults to None.
use_slurm (bool, optional) – Submit jobs to slurm. Defaults to True.
force_redo (bool, optional) – Redo if the average already exists. Defaults to False.
- iss_preprocess.pipeline.pipeline.create_grand_averages(data_path, prefix_todo=('genes_round', 'barcode_round', ''), suffix_todo=('max', 'median'), n_batch=None, dependency=None, use_slurm=True, force_redo=False)¶
Average single acquisition averages into grand average
- Parameters:
data_path (str) – Path to the folder, relative to projects folder
suffix (str) – Projection suffix to filter tifs. Defaults to None.
prefix_todo (tuple, optional) – List of str, names of the tifs to average. An empty string will average all tifs. Defaults to (“genes_round”, “barcode_round”, “”).
suffix_todo (list, optional) – List of str, suffixes to filter tifs. Defaults to (‘max’, ‘median’).
n_batch (int, optional) – Number of batches to average before taking their median. If None, will do as many batches as images. Defaults to None.
dependency (list, optional) – List of job IDs to wait for before starting the current job. Defaults to None.
use_slurm (bool, optional) – Submit jobs to slurm. Defaults to True.
force_redo (bool, optional) – Redo if the average already exists. Defaults to False.
- iss_preprocess.pipeline.pipeline.create_single_average(data_path, subfolder, subtract_black, n_batch, prefix_filter=None, suffix=None, target_fname=None, combine_tilestats=False, exclude_tiffs=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Create normalised average of all tifs in a single folder.
If prefix_filter is not None, the output will be “{prefix_filter}_average.tif”, otherwise it will be “{folder_path.name}_average.tif”
- Other arguments are read from ops:
average_clip_value: Value to clip images before averaging.
normalise: Normalise output maximum to one.
- Parameters:
data_path (str) – Path to the acquisition folder, relative to projects folder
subfolder (str) – subfolder in folder_path containing the tifs to average.
subtract_black (bool) – Subtract black level (read from ops)
n_batch (int) – Number of batches to average before taking their median. If None, will do as many batches as images.
prefix_filter (str, optional) – Prefix name to filter tifs. Only files starting with this prefix will be averaged. Defaults to None.
suffix (str, optional) – suffix to filter tifs. Defaults to None
target_fname (str, optional) – Target file name to save the average. Defaults to None
combine_tilestats (bool, optional) – Compute new tilestats distribution of averaged images if True, combine pre-existing tilestats into one otherwise. Defaults to False
exclude_tiffs (list, optional) – List of str filter to exclude tiffs from average
- Returns:
np.array: Average image
np.array: Distribution of pixel values
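The n_batch behaviour described above (average within batches, then take the median across batches) can be sketched for a single pixel as follows. The function name and the round-robin split into batches are assumptions; the actual implementation works on whole images and may group them differently.

```python
from statistics import mean, median


def batched_median_of_means(values, n_batch=None):
    """Average `values` in n_batch groups, then take the median of the means.

    values: intensities of one pixel across all images (illustrative input).
    With n_batch=None every value is its own batch, so the result is the
    plain median.
    """
    if not values:
        raise ValueError("no values to average")
    if n_batch is None or n_batch > len(values):
        n_batch = len(values)
    # Deal values into batches round-robin (hypothetical split)
    batches = [values[i::n_batch] for i in range(n_batch)]
    return median(mean(batch) for batch in batches)
```

Few large batches behave like a mean and are sensitive to outlier images, while many small batches behave like a median.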
- iss_preprocess.pipeline.pipeline.overview_for_ara_registration(data_path, prefix, rois_to_do=None, sigma_blur=10, ref_prefix='genes_round', non_similar_overview=False)¶
Generate a stitched overview for registering to the ARA
ABBA requires pyramidal OME-TIFF with resolution information. We will generate such stitched files and save them with a log yaml file indicating info about downsampling
- Parameters:
data_path (str) – Relative path to the data folder
prefix (str) – Acquisition to use for the overview e.g. genes_round_1_1
rois_to_do (list, optional) – ROIs to process. If None (default), process all ROIs
sigma_blur (float, optional) – sigma of the gaussian filter, in downsampled pixel size. Defaults to 10
ref_prefix (str, optional) – Prefix of the reference coordinates. Defaults to genes_round
- iss_preprocess.pipeline.pipeline.project_and_average(data_path, force_redo=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Project and average all available data then create plots.
Creates a list of expected acquisition folders from metadata. Checks for the existence of expected folders in the raw data and determines the completion status of each acquisition type. Runs projection on unprojected data and reprojects failed tiles. Creates averages of projections and then plots overview images.
- Parameters:
data_path (str) – Relative path to data.
force_redo (bool, optional) – Redo all processing steps? Defaults to False.
- Returns:
A list of job IDs for the slurm jobs created.
- Return type:
po_job_ids (list)
- iss_preprocess.pipeline.pipeline.register_acquisition(data_path, prefix, force_redo=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Register an acquisition across all rounds and channels
- Parameters:
data_path (str) – Path to the data folder
prefix (str) – Prefix of the acquisition to register
force_redo (bool, optional) – Redo if files exist. Defaults to False.
- iss_preprocess.pipeline.pipeline.register_reference_tile(data_path, prefix='genes_round', diag=False, use_slurm=True, force_redo=False)¶
Register the reference tile across channels and rounds
This function estimates the shifts and rotations between rounds and channels using the reference tile and generates diagnostic plots if requested.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – Directory prefix to use, e.g. ‘genes_round’. Defaults to ‘genes_round’.
diag (bool, optional) – Save diagnostic plots. Defaults to False.
use_slurm (bool, optional) – Submit job to slurm. Defaults to True.
force_redo (bool, optional) – Redo if files exist. Defaults to False.
- iss_preprocess.pipeline.pipeline.segment_and_stitch_mcherry_cells(data_path, prefix, use_slurm=True, slurm_folder=None, job_dependency=None)¶
Master function for mCherry cell segmentation and stitching
Will call in turn the following functions:
- segment_mcherry_cells
- filter_mcherry_cells if ops[‘filter_mask’] is True
- register_within to find overlapping region (with reload=True)
- remove_duplicate
- stitch_mcherry_cells
- Parameters:
data_path (str) – Relative path to the data folder
prefix (str) – Prefix of the mCherry acquisition
use_slurm (bool, optional) – Whether to use SLURM to run the jobs. Defaults to True.
slurm_folder (str, optional) – Folder to save SLURM logs. Defaults to None.
job_dependency (list, optional) – List of job IDs to wait for before starting the current job. Defaults to None.
- iss_preprocess.pipeline.pipeline.setup_channel_correction(data_path, prefix_to_do=None, force_redo=False, use_slurm=True)¶
Setup channel correction for barcode, genes and hybridisation rounds
- Parameters:
data_path (str) – Relative path to the data folder
prefix_to_do (list, optional) – Prefixes to process. Defaults to None.
force_redo (bool, optional) – Redo all processing steps? Defaults to False.
use_slurm (bool, optional) – Whether to use SLURM to run the jobs. Defaults to True.
- Returns:
List of job IDs for the slurm jobs created
- Return type:
list
iss_preprocess.pipeline.project module¶
- iss_preprocess.pipeline.project.check_projection(data_path, prefix, suffixes=('max', 'median'), *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Check if all tiles have been projected successfully.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Acquisition prefix, e.g. “genes_round_1_1”.
suffixes (tuple, optional) – Projection suffixes to check for. Defaults to (“max”, “median”).
- iss_preprocess.pipeline.project.check_roi_dims(data_path, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Check if all ROI dimensions are the same across rounds.
- Parameters:
data_path (str) – Relative path to data.
- Raises:
ValueError – If ROI dimensions are not the same across rounds.
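A consistency check of this kind could look like the sketch below. The function name and the `dims_by_round` layout (acquisition prefix mapped to a list of (roi, n_x_tiles, n_y_tiles) tuples) are hypothetical; how iss_preprocess loads and compares ROI dimensions is not shown in this documentation.

```python
def check_dims_consistent(dims_by_round):
    """Raise ValueError if ROI dimensions differ between acquisitions.

    dims_by_round: dict mapping acquisition prefix -> list of
        (roi, n_x_tiles, n_y_tiles) tuples (hypothetical layout).
    """
    if not dims_by_round:
        return
    # Use the first acquisition as the reference for all others
    reference_prefix, reference = next(iter(dims_by_round.items()))
    for prefix, dims in dims_by_round.items():
        if sorted(dims) != sorted(reference):
            raise ValueError(
                f"ROI dimensions differ between {reference_prefix} and {prefix}"
            )
```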
- iss_preprocess.pipeline.project.project_round(data_path, prefix, overwrite=False)¶
Start SLURM jobs to z-project all tiles from a single imaging round. Also, copy one of the MicroManager metadata files from raw to processed directory.
- Parameters:
data_path (str) – Relative path to dataset.
prefix (str) – Full folder name prefix, including round number.
overwrite (bool, optional) – Whether to re-project if files already exist. Defaults to False.
- iss_preprocess.pipeline.project.project_tile(fname, ops, overwrite=False, sth=13, target_name=None, verbose=True)¶
Calculates projections for a single tile.
- Parameters:
fname (str) – path to tile without ‘.ome.tif’ extension.
ops (dict) – dictionary of values from the ops file.
overwrite (bool, optional) – whether to repeat if already completed. Defaults to False.
sth (int, optional) – size of the structuring element for the fstack projection. Used only if make_fstack is True. Defaults to 13.
target_name (str, optional) – name of the target file. If None, it will be the same as the input file. Defaults to None.
verbose (bool, optional) – print progress. Defaults to True.
- iss_preprocess.pipeline.project.project_tile_by_coors(tile_coors, data_path, prefix, overwrite=False)¶
Project a single tile by its coordinates.
- Parameters:
tile_coors (tuple) – (roi, x, y) coordinates of the tile.
data_path (str) – Relative path to data.
prefix (str) – Acquisition prefix, e.g. “genes_round_1_1”.
overwrite (bool, optional) – Whether to re-project if files already exist. Defaults to False.
- iss_preprocess.pipeline.project.project_tile_row(data_path, prefix, tile_roi, tile_row, max_col, overwrite=False)¶
Calculate max intensity and extended DOF projections for a row of tiles in an ROI
- Parameters:
data_path (str) – relative path to dataset
prefix (str) – directory / file name prefix, e.g. ‘genes_round’
tile_roi (int) – index of the ROI
tile_row (int) – index of the row to process
max_col (int) – Maximum column index. Columns 0 to max_col will be projected.
overwrite (bool, optional) – whether to redo projection if files already exist. Defaults to False.
- iss_preprocess.pipeline.project.reproject_failed(data_path, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Re-project tiles that failed to project previously.
- Parameters:
data_path (str) – Relative path to data.
iss_preprocess.pipeline.register module¶
- iss_preprocess.pipeline.register.correct_hyb_shifts(data_path, prefix=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Use robust regression across tiles to correct shifts and angles for hybridisation rounds. Either processes a specific hybridisation round or all rounds.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – Directory prefix to use, e.g. “hybridisation_1_1”. If None, processes all hybridisation acquisitions. Defaults to None.
- iss_preprocess.pipeline.register.correct_shifts_roi(data_path, roi_dims, prefix='genes_round', max_shift=500, min_tiles=0)¶
Use robust regression to correct shifts across tiles for a single ROI.
RANSAC regression is applied to shifts within and across channels using tile X and Y position as predictors. This will load the single_tile shifts and create the corrected shifts.
- Parameters:
data_path (str) – Relative path to data.
roi_dims (tuple) – Dimensions of the ROI to be processed, in (ROI_ID, Xtiles, Ytiles) format.
prefix (str, optional) – Directory prefix to use. Defaults to “genes_round”.
max_shift (int, optional) – Maximum shift to include tiles in RANSAC regression. Tiles with larger absolute shifts will not be included in the fit but will still have their corrected shifts estimated. Defaults to 500.
min_tiles (int, optional) – Minimum number of tiles to use for RANSAC regression, otherwise the median is used. Defaults to 0.
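The idea of regressing shifts on tile position with RANSAC can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: it handles one shift component at a time, keeps the best minimal-sample plane instead of refitting on inliers, and the names `fit_plane`, `ransac_correct_shifts`, `tol` and `n_iter` are hypothetical.

```python
import random
from statistics import median


def fit_plane(points):
    """Exact fit of shift = a*x + b*y + c through three (x, y, shift) points."""
    (x1, y1, s1), (x2, y2, s2), (x3, y3, s3) = points
    det = (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)
    if det == 0:
        return None  # collinear tiles cannot define a plane
    a = ((s1 - s3) * (y2 - y3) - (s2 - s3) * (y1 - y3)) / det
    b = ((x1 - x3) * (s2 - s3) - (x2 - x3) * (s1 - s3)) / det
    c = s3 - a * x3 - b * y3
    return a, b, c


def ransac_correct_shifts(tiles, max_shift=500, min_tiles=0, tol=2.0,
                          n_iter=200, seed=0):
    """tiles: list of (tile_x, tile_y, shift). Returns one shift per tile."""
    if not tiles:
        return []
    # Tiles with very large shifts are excluded from the fit only
    usable = [t for t in tiles if abs(t[2]) <= max_shift]
    fallback = median(t[2] for t in (usable or tiles))
    if len(usable) < max(min_tiles, 3):
        return [fallback for _ in tiles]
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iter):
        model = fit_plane(rng.sample(usable, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = sum(abs(a * x + b * y + c - s) <= tol for x, y, s in usable)
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    if best_model is None:
        return [fallback for _ in tiles]
    a, b, c = best_model
    # Every tile, including those excluded from the fit, gets a prediction
    return [a * x + b * y + c for x, y, _ in tiles]
```

In this sketch a tile whose measured shift exceeds max_shift still receives a corrected value predicted from its X and Y position, matching the behaviour described for max_shift above.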
- iss_preprocess.pipeline.register.correct_shifts_single_round_roi(data_path, roi_dims, prefix='hybridisation_1_1', max_shift=500, fit_angle=True, align_method=None, n_chans=None)¶
Use robust regression across tiles to correct shifts and angles for a single hybridisation round and ROI.
- Parameters:
data_path (str) – Relative path to data.
roi_dims (tuple) – Dimensions of the ROI to be processed, in (ROI_ID, Xtiles, Ytiles) format.
prefix (str, optional) – Prefix of the round to be processed. Defaults to “hybridisation_1_1”.
max_shift (int, optional) – Maximum shift to include tiles in RANSAC regression. Tiles with larger absolute shifts will not be included in the fit but will still have their corrected shifts estimated. Defaults to 500.
fit_angle (bool, optional) – Fit the angle with robust regression if True, otherwise takes the median. Defaults to True
align_method (str, optional) – Method to use for alignment. If None, will be read from ops. Defaults to None.
- Returns:
None
- iss_preprocess.pipeline.register.correct_shifts_to_ref(data_path, prefix, max_shift=None, fit_angle=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Use robust regression across tiles to correct shifts to reference acquisition
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Directory prefix to use, e.g. “genes_round”.
fit_angle (bool, optional) – Fit the angle with robust regression if True, otherwise takes the median. Defaults to False
- iss_preprocess.pipeline.register.estimate_shifts_by_coors(data_path, tile_coors=(0, 0, 0), prefix='genes_round', suffix='max')¶
Estimate shifts across channels and sequencing rounds using provided reference rotation angles and scale factors.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple, optional) – Coordinates of tile to register, in (ROI, X, Y) format. Defaults to (0, 0, 0).
prefix (str, optional) – Directory prefix to register. Defaults to “genes_round”.
suffix (str, optional) – Filename suffix specifying which z-projection to use. Defaults to “max”.
- iss_preprocess.pipeline.register.filter_ransac_shifts(data_path, prefix, roi_dims, max_residuals=10)¶
Filter shifts to use RANSAC shifts only if the initial shifts are off
This will load the single_tile and corrected shifts and create the best shifts
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Directory prefix to use, e.g. “genes_round”.
roi_dims (tuple) – Dimensions of the ROI to be processed, in (ROI_ID, Xtiles, Ytiles) format.
max_residuals (int, optional) – Threshold on residuals above which the RANSAC shifts are used. Defaults to 10.
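The selection rule can be sketched for one tile as below. The function name is hypothetical, and measuring the residual as the Euclidean distance between the two shift estimates is an assumption; the library may compute residuals per axis or per channel.

```python
def pick_best_shift(single_tile_shift, ransac_shift, max_residuals=10):
    """Keep the single-tile shift unless it strays too far from the RANSAC fit.

    Both shifts are (x, y) tuples. The residual is taken here as the
    Euclidean distance between the two estimates (an assumption).
    """
    dx = single_tile_shift[0] - ransac_shift[0]
    dy = single_tile_shift[1] - ransac_shift[1]
    residual = (dx * dx + dy * dy) ** 0.5
    return ransac_shift if residual > max_residuals else single_tile_shift
```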
- iss_preprocess.pipeline.register.load_and_register_raw_stack(data_path, prefix, tile_coors, corrected_shifts=None)¶
Load a raw stack and apply channel registration.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Acquisition to load.
tile_coors (tuple) – (Roi, tileX, tileY) tuple
corrected_shifts (str, optional) – Shift correction method. Defaults to None.
- Returns:
A (X x Y x Nchannels) registered stack
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.register.load_and_register_sequencing_tile(data_path, tile_coors=(1, 0, 0), prefix='genes_round', suffix='max', filter_r=(2, 4), correct_channels=False, corrected_shifts='best', correct_illumination=False, nrounds=7, specific_rounds=None)¶
Load sequencing tile and align channels. Optionally, filter, correct illumination and channel brightness.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple, optional) – Coordinates of tile to load: ROI, Xpos, Ypos. Defaults to (1, 0, 0).
prefix (str, optional) – Prefix of the sequencing round. Defaults to “genes_round”.
suffix (str, optional) – Filename suffix corresponding to the z-projection to use. Defaults to “max”.
filter_r (tuple, optional) – Inner and outer radius for the Hanning filter. If False, stack is not filtered. Defaults to (2, 4).
correct_channels (bool or str, optional) – Whether to normalize channel brightness. If ‘round1_only’, normalise by round 1 correction factor, otherwise, if True use all norm_factors. Defaults to False.
corrected_shifts (str, optional) – Which shift to use. One of reference, single_tile, ransac, or best. Defaults to ‘best’.
correct_illumination (bool, optional) – Whether to correct vignetting. Defaults to False.
nrounds (int, optional) – Number of sequencing rounds to load. Used only if specific_rounds is None. Defaults to 7.
specific_rounds (list, optional) – if not None, specifies which rounds must be loaded and ignores nrounds. Defaults to None
- Returns:
numpy.ndarray: X x Y x Nch x len(specific_rounds) or Nrounds image stack.
numpy.ndarray: X x Y boolean mask, identifying bad pixels that were not imaged for all channels and rounds (due to registration offsets) and should be discarded during analysis.
- iss_preprocess.pipeline.register.load_and_register_tile(data_path, tile_coors, prefix, filter_r=True, projection=None, zero_bad_pixels=False, correct_illumination=True)¶
Load one single tile
Load a tile of prefix with channels/rounds registered, apply illumination correction and filtering.
- Parameters:
data_path (str) – Relative path to data
tile_coors (tuple) – (Roi, tileX, tileY) tuple
prefix (str) – Acquisition to load. If genes_round or barcode_round will load all the rounds.
filter_r (bool, optional) – Apply filter on rounds data? Parameters will be read from ops. Defaults to True
projection (str, optional) – Projection to use. If None, will read from ops. Defaults to None
zero_bad_pixels (bool, optional) – Set bad pixels to zero. Defaults to False
correct_illumination (bool, optional) – Apply illumination correction. Defaults to True
- Returns:
numpy.ndarray: A (X x Y x Nchannels x Nrounds) registered stack
numpy.ndarray: X x Y boolean mask of bad pixels where data is missing after registration
- iss_preprocess.pipeline.register.merge_shifts(data_path, prefix, n_chans=4)¶
Merge shifts for all ROIs/tiles into a single median shift
Useful if some of the registrations failed and we want to use the same shift for all tiles
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Directory prefix to use, e.g. “hybridisation_1_1”.
n_chans (int, optional) – Number of channels to merge. Defaults to 4.
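As a rough illustration of the merging idea described for merge_shifts, the sketch below collapses per-tile shift estimates into one median shift per channel. The array layout (tiles x channels x 2) and the helper name median_shift are assumptions made for this example; they are not the package's storage format or API.

```python
import numpy as np


def median_shift(tile_shifts):
    """Collapse per-tile shifts into one median shift per channel.

    tile_shifts: array of shape (n_tiles, n_chans, 2), a hypothetical layout.
    """
    tile_shifts = np.asarray(tile_shifts, dtype=float)
    # nanmedian ignores tiles where registration produced no estimate (NaN)
    return np.nanmedian(tile_shifts, axis=0)


# Three tiles, two channels; the last tile has an outlier shift in channel 0
shifts = np.array(
    [
        [[1.0, 2.0], [0.0, 0.5]],
        [[1.2, 2.1], [0.1, 0.5]],
        [[40.0, -30.0], [0.0, 0.4]],
    ]
)
merged = median_shift(shifts)
print(merged)
```

The median is used here because a single failed tile (the outlier above) does not pull the merged value away from the consensus.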
- iss_preprocess.pipeline.register.register_channels_by_pairs(channel_grouping, ops, ops_prefix, stack, reference_prefix, binarise_quantile, reference_tforms, debug=False)¶
Register channels for a single tile iteratively, by groups of channels
channel_grouping must be a nested list (a list of lists, to any depth). The innermost groups are registered first, each using the first channel of its list as reference. The groups are then registered to each other at the next level up. For instance, if channel_grouping = [[0, 1], [2, 3]], channels 0 and 1 will be registered together (ref=0), then channels 2 and 3 will be registered together (ref=2), and finally the two groups will be registered together (ref=0).
- Parameters:
channel_grouping (list) – List of list of channels to register together.
ops (dict) – Experiment metadata.
ops_prefix (str) – Prefix to use for ops, e.g. “genes”.
stack (np.array) – Image stack to register.
reference_prefix (str) – Prefix to load scale or initial matrix from.
binarise_quantile (float) – Quantile to binarise images before registration.
reference_tforms (dict) – Reference transformation parameters.
debug (bool) – Return debug information.
- Returns:
- dict: Transformation parameters.
- dict: Debug information, only if debug is True.
- Return type:
dict
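The order in which a nested channel_grouping is processed by register_channels_by_pairs can be unrolled with a small recursive helper. This is only a sketch of the ordering rule stated in the description; registration_order is a hypothetical name and no image registration is performed.

```python
def registration_order(grouping):
    """Return (reference, steps) for a nested channel grouping.

    steps is a list of (reference_channel, moving_channel) pairs in the
    order they would be registered: innermost groups first.
    """
    if isinstance(grouping, int):
        # A bare channel is its own reference and needs no registration
        return grouping, []
    steps = []
    refs = []
    for sub in grouping:
        ref, sub_steps = registration_order(sub)
        steps.extend(sub_steps)
        refs.append(ref)
    # Register the references of the sub-groups to the first one
    steps.extend((refs[0], r) for r in refs[1:])
    return refs[0], steps


ref, steps = registration_order([[0, 1], [2, 3]])
print(ref, steps)  # 0 [(0, 1), (2, 3), (0, 2)]
```

For [[0, 1], [2, 3]] this reproduces the sequence given in the text: 1 onto 0, 3 onto 2, then the second group (reference 2) onto the first (reference 0).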
- iss_preprocess.pipeline.register.register_fluorescent_tile(data_path, tile_coors, prefix, reference_prefix=None, debug=False, save_output=True)¶
Estimate channel registration parameters for a single round acquisition
The stack will be binarised if ops[f”{prefix_start}_binarise_quantile”] is not None. The scale and initial parameters will be loaded from the reference prefix and optimised using either a similarity transform or an affine transform, depending on ops[“align_method”].
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple) – Coordinates of tile to register, in (ROI, X, Y) format.
prefix (str) – Directory prefix to register.
reference_prefix (str, optional) – Prefix to load scale or initial matrix from. Defaults to None.
debug (bool, optional) – Return debug information. Defaults to False.
save_output (bool, optional) – Save output to disk. Defaults to True.
- Returns:
Debug information if debug is True, None otherwise.
- Return type:
dict
- iss_preprocess.pipeline.register.run_correct_shifts(data_path, prefix, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Use robust regression to correct shifts across tiles within an ROI for all ROIs.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Directory prefix to use, e.g. “genes_round”.
- iss_preprocess.pipeline.register.run_register_reference_tile(data_path, prefix='genes_round', diag=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Subfunction to run the registration of the reference tile
This function actually performs the computation. It registers the reference tile specified in the ops. This includes shifts and rotations between rounds, and shifts, rotations, and scaling between channels.
Shifts are estimated using phase correlation. Rotation and scaling are estimated using iterative grid search.
Results are saved in a npz file in the processed directory in: data_path / ‘reg’ / prefix / ‘ref_tile_tforms_`prefix`_round.npz’
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – Directory prefix to register. Defaults to “genes_round”.
diag (bool, optional) – Whether to save diagnostic plots.
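Shift estimation by phase correlation, as mentioned for run_register_reference_tile, can be sketched with NumPy FFTs. This is a minimal, integer-precision version written for illustration; the function name phase_correlation_shift is hypothetical, and the rotation/scaling grid search is not shown.

```python
import numpy as np


def phase_correlation_shift(reference, moving):
    """Estimate the integer (row, col) shift that maps reference onto moving."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    cross_power = f_mov * np.conj(f_ref)
    # Keep the phase only; the epsilon avoids division by zero
    cross_power /= np.abs(cross_power) + 1e-12
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around)
    shift = [p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape)]
    return tuple(int(v) for v in shift)


rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))
estimated = phase_correlation_shift(image, shifted)
print(estimated)  # (5, -3)
```

Because the test image is shifted circularly with np.roll, the correlation peak is exact; real tiles would typically need windowing or masking, which this sketch leaves out.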
iss_preprocess.pipeline.segment module¶
- iss_preprocess.pipeline.segment.add_mask_id(data_path, roi, masks, barcode_df=None, barcode_dot_threshold=0.15, spot_score_threshold=0.1, hyb_score_threshold=0.8, load_genes=True, load_hyb=True, load_barcodes=True)¶
Load gene, barcode, and hybridisation spots and add a mask_id column to each spots dataframe
- Parameters:
data_path (str) – Relative path to data
roi (int) – ID of the ROI to load
masks (np.array) – Array of labels.
barcode_df (pd.DataFrame, optional) – Rabies barcode dataframe, if None, will load “barcode_df_roi{roi}.pkl”. Defaults to None.
barcode_dot_threshold (float, optional) – Threshold for the barcode dot product. Only spots above the threshold will be counted. Defaults to 0.15.
spot_score_threshold (float, optional) – Threshold for the OMP score. Only spots above the threshold will be counted. Defaults to 0.1.
hyb_score_threshold (float, optional) – Threshold for hybridisation spots. Only spots above the threshold will be counted. Defaults to 0.8.
load_genes (bool, optional) – Whether to load gene spots. Defaults to True.
load_hyb (bool, optional) – Whether to load hybridisation spots. Defaults to True
load_barcodes (bool, optional) – Whether to load barcode spots. Defaults to True.
- Returns:
Dictionary of spots dataframes
- Return type:
dict
- iss_preprocess.pipeline.segment.filter_mcherry_cells(data_path, prefix, tile_list=None, use_rois=None, use_slurm=True, slurm_folder=None, job_dependency=None)¶
Use GMM to cluster cells and remove non-cell masks.
This function will:
- Use all saved dataframes to fit a GMM model, calling _gmm_cluster_mcherry_cells
- Apply this model to all masks, calling _remove_non_cell_masks
- Parameters:
data_path (str) – Relative path to the data.
prefix (str) – Prefix of the image stack.
tile_list (list, optional) – List of tiles to process. If None, will process all tiles. Defaults to None.
use_rois (list, optional) – List of ROIs to process. If None, will process all ROIs. Used only if tile_list is None. Defaults to None.
use_slurm (bool, optional) – Whether to use slurm to parallelize the process. Defaults to True.
slurm_folder (str, optional) – Folder to save slurm logs. Defaults to None.
job_dependency (list, optional) – List of job ids to wait for before starting the job. Defaults to None.
- iss_preprocess.pipeline.segment.find_edge_touching_masks(masks, border_width=4)¶
Finds masks that touch the edge of the image.
- Parameters:
masks (np.ndarray) – The binary or labeled mask array where each cell is represented by a unique integer, and background is 0.
border_width (int) – The width of the border to consider when checking for edge touching. Defaults to 4.
- Returns:
A list of unique labels that touch the edge of the image.
- Return type:
edge_touching_labels (list)
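One plausible way to find labels that touch the image edge is to build a boolean border of the requested width and collect the labels under it. The sketch below assumes that "touching" means having at least one pixel within border_width pixels of the edge; the function name is hypothetical and this is not necessarily how find_edge_touching_masks is implemented.

```python
import numpy as np


def edge_touching_labels(masks, border_width=4):
    """Return sorted labels with at least one pixel in the image border."""
    if border_width < 1:
        raise ValueError("border_width must be a positive integer")
    masks = np.asarray(masks)
    border = np.zeros(masks.shape, dtype=bool)
    border[:border_width, :] = True
    border[-border_width:, :] = True
    border[:, :border_width] = True
    border[:, -border_width:] = True
    labels = np.unique(masks[border])
    # Label 0 is background and is never reported
    return [int(label) for label in labels if label != 0]


masks = np.zeros((20, 20), dtype=int)
masks[0:3, 5:8] = 1      # touches the top edge
masks[8:12, 8:12] = 2    # fully inside the image
masks[15:18, 14:17] = 3  # within 4 px of the bottom edge
print(edge_touching_labels(masks))  # [1, 3]
```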
- iss_preprocess.pipeline.segment.get_big_masks(data_path, masks, mask_expansion)¶
Small internal function to avoid code duplication
Reload and expand masks if needed
- Parameters:
data_path (str) – Relative path to data
masks (np.array) – Array of labels.
mask_expansion (float, optional) – Distance in um to expand masks before counting rolonies per cells. None for no expansion. Defaults to 5.
- Returns:
masks expanded
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.segment.get_cell_masks(data_path, roi, projection='corrected', mask_expansion=None, reload=True, prefix=None, curated=False)¶
Small wrapper to get cell masks from a given data path.
Wrap to ensure we use the same projection for all calls
- Parameters:
data_path (str) – Path to acquisition data (chamber folder)
roi (int) – Region of interest
projection (str, optional) – Projection to use. Defaults to “corrected”.
mask_expansion (int, optional) – Expansion of the mask. If None, reads from ops. Defaults to None.
reload (bool, optional) – If True, reload the saved masks, otherwise regenerate from individual tiles. Defaults to True.
prefix (str, optional) – Prefix to use for the masks. If None, reads from ops. Defaults to None.
curated (bool, optional) – Whether to use curated masks. These are manually curated and have the same filename with “_curated.tif”. Defaults to False.
- Returns:
Cell masks
- Return type:
np.ndarray
- iss_preprocess.pipeline.segment.get_overlap_regions(data_path, prefix, ref_coors)¶
Determine the coordinates of the overlap region between two adjacent tiles using explicit tile direction.
- Parameters:
shifts (dict) – The dictionary containing the shift values for the down and right tiles.
tile_ref (np.ndarray) – The reference tile.
tile_right (np.ndarray) – The right tile.
tile_down (np.ndarray) – The down tile.
tile_down_right (np.ndarray) – The down right tile.
- Returns:
- overlap_ref_vert (np.ndarray): The overlap region between the reference tile and the down tile.
- overlap_down (np.ndarray): The overlap region between the down tile and the reference tile.
- overlap_ref_side (np.ndarray): The overlap region between the reference tile and the right tile.
- overlap_right (np.ndarray): The overlap region between the right tile and the reference tile.
- overlap_ref_with_down_right (np.ndarray): The overlap region between the reference tile and the down right tile.
- overlap_down_right_with_ref (np.ndarray): The overlap region between the down right tile and the reference tile.
- Return type:
overlap_ref_vert (np.ndarray)
- iss_preprocess.pipeline.segment.get_stack_for_cellpose(data_path, prefix, tile_coors, use_raw_stack=True)¶
Load the stack to segment with cellpose.
This will load a stack with 2 channels from the raw data or the registered stack.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Acquisition prefix to use for segmentation.
tile_coors (tuple) – Coordinates of the tile to segment.
use_raw_stack (bool, optional) – Whether to use the raw stack or the projected stack. Defaults to True.
- Returns:
X x Y x channels (x Z) stack.
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.segment.make_cell_dataframe(data_path, roi, masks=None, mask_expansion=None, atlas_size=10)¶
Make cell dataframe
The index will be the mask ID. The dataframe will include, for each cell, their centroid, bounding box, and area. If atlas_size is not None, it will also include the ID and acronym of the atlas area where their centroid is located.
- Parameters:
data_path (str) – Relative path to data
roi (int) – Number of the ROI to process
masks (np.array, optional) – Array of labels, if None will load masks_{roi}.npy from the reg folder. Defaults to None.
mask_expansion (float, optional) – Distance in um to expand masks before counting rolonies per cells. None for no expansion. Defaults to None.
atlas_size (int, optional) – Size of the atlas to use to load ARA information. If None, will not get area information. Defaults to 10.
- Returns:
Dataframe with the cell information
- Return type:
cell_df (pd.DataFrame)
- iss_preprocess.pipeline.segment.remove_all_duplicate_masks(data_path, prefix, upper_overlap_thresh=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Remove masks that overlap in adjacent tiles.
The within_acquisition registration must be run for prefix beforehand.
- Parameters:
data_path (str) – Relative path to the data.
prefix (str) – Prefix of the image stack.
upper_overlap_thresh (float, optional) – The upper threshold percentage for considering mask overlap significant. If None, will use ops if defined, 0.3 otherwise. Defaults to None.
- Returns:
- A list of tuples containing the labels that
overlapped and their respective percentages.
- Return type:
all_overlapping_pairs (list)
- iss_preprocess.pipeline.segment.run_cellpose_segmentation(data_path, prefix, roi=None, tx=None, ty=None, use_raw_stack=True, use_gpu=True, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
- iss_preprocess.pipeline.segment.run_mask_projection(data_path, prefix, roi=None, tx=None, ty=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Project masks to a single plane.
Wrapper around iss_preprocess.segment.cell.project_mask to run on slurm.
- Parameters:
data_path (str) – Relative path to data.
prefix (str) – Acquisition prefix to use for segmentation.
roi (int) – ROI ID to segment as specified in MicroManager (i.e. 1-based).
tx (int) – X coordinate of the tile.
ty (int) – Y coordinate of the tile.
- Returns:
X x Y x channels (x Z) stack.
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.segment.save_curated_dataframes(data_path, prefix, intensity_channels=None, rois=None, mask_expansion=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Save the curated dataframes to the cells folder.
- Parameters:
data_path (str) – Relative path to the data.
prefix (str) – Prefix of the image stack.
rois (list, optional) – List of ROIs to process. If None, will process all ROIs. Defaults to None.
mask_expansion (int, optional) – Mask expansion to use. If None, will use the value from the ops. Defaults to None.
- Returns:
Dataframe with the cell information.
- Return type:
pd.DataFrame
- iss_preprocess.pipeline.segment.save_mcherry_mask_df(data_path, prefix)¶
Collate individual tile dataframes and remove overlapping masks.
This does not “stitch” the mask and keeps only the within tile x/y coordinates, but it does precompute the stitched label.
- Parameters:
data_path (str) – Relative path to the data.
prefix (str) – Prefix of the image stack.
- Returns:
Dataframe with the cell information.
- Return type:
pd.DataFrame
- iss_preprocess.pipeline.segment.save_unmixing_coefficients(data_path, prefix, tile_coors=None, background_channel=None, signal_channel=None, projection=None, seed=None, n_random=None, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Find the unmixing coefficients.
- Parameters:
data_path (str) – Path to the data directory.
prefix (str) – Prefix of the image stack.
tile_coors (list, optional) – List of tile coordinates. If None, will use random tiles. Defaults to None.
background_channel (int, optional) – Channel index of the background image. If None, will use the value from the ops. Defaults to None.
signal_channel (int, optional) – Channel index of the signal image. If None, will use the value from the ops. Defaults to None.
projection (str, optional) – Projection method. If None, will use the value from the ops. Defaults to None.
seed (int, optional) – Random seed for the random tiles. If None, will use the value from the ops. Defaults to None.
n_random (int, optional) – Number of random tiles to use. If None, will use the value from the ops. Defaults to None.
- Returns:
- pure_signal (np.ndarray): Pure signal image.
- coef (float): Unmixing coefficient.
- intercept (float): Unmixing intercept.
- Return type:
pure_signal (np.ndarray)
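The coefficient and intercept returned by save_unmixing_coefficients suggest a linear model of bleedthrough from the background channel into the signal channel. The sketch below assumes exactly that, using an ordinary least-squares fit on synthetic data; the fitting method, the clipping step, and the helper names are assumptions for illustration and may differ from the package.

```python
import numpy as np


def fit_unmixing(background, signal):
    """Fit signal ~ coef * background + intercept by least squares."""
    coef, intercept = np.polyfit(background.ravel(), signal.ravel(), deg=1)
    return float(coef), float(intercept)


def unmix(background, signal, coef, intercept):
    """Subtract the predicted bleedthrough and clip negative values to zero."""
    pure = signal - (coef * background + intercept)
    return np.clip(pure, 0, None)


rng = np.random.default_rng(1)
background = rng.random((32, 32)) * 100
# Synthetic signal channel: 30 % bleedthrough plus a constant offset of 5
signal = 0.3 * background + 5.0
signal[10:12, 10:12] += 50.0  # a small patch of genuine signal

coef, intercept = fit_unmixing(background, signal)
pure_signal = unmix(background, signal, coef, intercept)
print(round(coef, 2), round(intercept, 2))
```

After unmixing, only the small patch of genuine signal should stand out; pixels that contain bleedthrough alone end up close to zero.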
- iss_preprocess.pipeline.segment.segment_all_rois(data_path, prefix='DAPI_1', use_gpu=False)¶
Start batch jobs for segmentation for each ROI.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – acquisition prefix to use for segmentation. Defaults to “DAPI_1”.
use_gpu (bool, optional) – Whether to use GPU. Defaults to False.
- iss_preprocess.pipeline.segment.segment_all_tiles(data_path, prefix='DAPI_1', use_raw_stack=True, use_gpu=True, use_rois=None, tile_list=None, rerun_cellpose=False, use_slurm=True)¶
Start batch jobs for segmentation for each tile.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – acquisition prefix to use for segmentation. Defaults to “DAPI_1”.
use_raw_stack (bool, optional) – Whether to use the raw stack and do 3d segmentation. Defaults to True.
use_gpu (bool, optional) – Whether to use GPU. Defaults to True.
use_rois (list, optional) – List of ROIs to process. If None, will use all ROIs. Defaults to None.
tile_list (list, optional) – List of tiles to process. If provided will ignore use_rois. If None, will use all tiles.
rerun_cellpose (bool, optional) – Whether to rerun cellpose even if the raw masks already exist (used only if use_raw_stack is True). Defaults to False.
use_slurm (bool, optional) – Whether to use slurm. Defaults to True.
- Returns:
List of job IDs for the slurm jobs.
- Return type:
list
- iss_preprocess.pipeline.segment.segment_mcherry_tile(data_path, prefix, roi, tilex, tiley)¶
Segment the mCherry channel of an image stack.
- Parameters:
data_path (str) – Path to the data directory.
prefix (str) – Prefix of the image stack.
roi (int) – Region of interest.
tilex (int) – X coordinate of the tile.
tiley (int) – Y coordinate of the tile.
- Returns:
- filtered_masks (np.ndarray): Binary image of the filtered masks.
- filtered_df (pd.DataFrame): DataFrame of the filtered masks.
- rejected_masks (np.ndarray): Binary image of the rejected masks.
- Return type:
filtered_masks (np.ndarray)
- iss_preprocess.pipeline.segment.segment_roi(data_path, iroi, prefix='DAPI_1', use_gpu=False)¶
Detect cells in a single ROI using Cellpose.
Much faster with GPU but requires a very large amount of VRAM for large ROIs.
- Parameters:
data_path (str) – Relative path to data.
iroi (int) – ROI ID to segment as specified in MicroManager (i.e. 1-based).
prefix (str, optional) – Acquisition prefix to use for segmentation. Defaults to “DAPI_1”.
use_gpu (bool, optional) – Whether to use GPU. Defaults to False.
- iss_preprocess.pipeline.segment.segment_spots(data_path, roi, masks=None, barcode_df=None, barcode_dot_threshold=None, spot_score_threshold=0.1, hyb_score_threshold=0.8, load_genes=True, load_hyb=True, load_barcodes=True)¶
Count number of rolonies per cell for barcodes and genes.
Only rolonies above the relevant threshold will be counted. (Note that gene rolonies are already thresholded once after OMP.)
Hybridisation and sequencing datasets will be fused.
Outputs are saved in the cells folder as f”genes_df_roi{roi}.pkl” and f”barcode_df_roi{roi}.pkl”
- Parameters:
data_path (str) – Relative path to data
roi (int) – ID of the ROI to load
masks (np.array, optional) – Array of labels. If None will load using “get_cell_masks”. Defaults to None.
barcode_df (pd.DataFrame, optional) – Rabies barcode dataframe, if None, will load “barcode_df_roi{roi}.pkl”. Defaults to None.
barcode_dot_threshold (float, optional) – Threshold for the barcode dot product. Only spots above the threshold will be counted. Defaults to 0.15.
spot_score_threshold (float, optional) – Threshold for the OMP score. Only spots above the threshold will be counted. Defaults to 0.1.
hyb_score_threshold (float, optional) – Threshold for hybridisation spots. Only spots above the threshold will be counted. Defaults to 0.8.
load_genes (bool, optional) – Whether to load gene spots. Defaults to True.
load_hyb (bool, optional) – Whether to load hybridisation spots. Defaults to True
load_barcodes (bool, optional) – Whether to load barcode spots. Defaults to True.
- Returns:
- barcode_df (pd.DataFrame): Count of rolonies per barcode sequence per cell. Index is the mask ID of the cell.
- fused_df (pd.DataFrame): Count of rolonies per gene or hybridisation probe per cell. Index is the mask ID of the cell.
- Return type:
barcode_df (pd.DataFrame)
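Counting rolonies per cell comes down to looking up the mask label under each spot and tallying spots whose score passes the threshold. The sketch below shows that idea with plain tuples and a Counter; the spot layout (x, y, gene, score), the gene names, and the function name are hypothetical, and the real function works on dataframes.

```python
from collections import Counter

import numpy as np


def count_spots_per_mask(masks, spots, score_threshold):
    """Count spots per (mask_id, gene), ignoring background and low scores.

    spots: iterable of (x, y, gene, score) tuples, a hypothetical layout.
    """
    counts = Counter()
    for x, y, gene, score in spots:
        if score <= score_threshold:
            continue  # only spots above the threshold are counted
        mask_id = int(masks[int(round(y)), int(round(x))])
        if mask_id != 0:  # 0 is background
            counts[(mask_id, gene)] += 1
    return counts


masks = np.zeros((10, 10), dtype=int)
masks[1:4, 1:4] = 1
masks[6:9, 6:9] = 2
spots = [
    (2, 2, "Gad1", 0.9),
    (3, 1, "Gad1", 0.5),
    (7, 7, "Slc17a7", 0.8),
    (7, 8, "Slc17a7", 0.05),  # below threshold, dropped
    (5, 5, "Gad1", 0.9),      # on background, dropped
]
counts = count_spots_per_mask(masks, spots, score_threshold=0.1)
print(dict(counts))  # {(1, 'Gad1'): 2, (2, 'Slc17a7'): 1}
```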
- iss_preprocess.pipeline.segment.unmix_tile(data_path, prefix, tile_coors, background_channel=None, signal_channel=None, projection=None)¶
Unmix one tile using the previously found coefficients.
- Parameters:
data_path (str) – Path to the data directory.
prefix (str) – Prefix of the image stack.
tile_coors (tuple) – Coordinates of the tile.
background_channel (int, optional) – Channel index of the background image.
signal_channel (int, optional) – Channel index of the signal image.
projection (str, optional) – Projection method.
- Returns:
Unmixed image.
- Return type:
unmixed (np.ndarray)
iss_preprocess.pipeline.sequencing module¶
- iss_preprocess.pipeline.sequencing.basecall_tile(data_path, tile_coors, save_spots=True)¶
Detect and basecall barcodes for a given tile.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple, optional) – Coordinates of tile to load: ROI, Xpos, Ypos.
save_spots (bool, optional) – Whether to save the detected spots. Used to run without erasing during diagnostics. Defaults to True.
- iss_preprocess.pipeline.sequencing.compute_spot_sign_image(data_path, prefix='genes_round')¶
Compute the reference spot sign image to use in spot calling. Save it to the processed data folder.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – Prefix of the sequencing read to use. Defaults to “genes_round”.
- iss_preprocess.pipeline.sequencing.detect_genes_on_tile(data_path, tile_coors, save_stack=False, prefix='genes_round')¶
Apply the OMP algorithm to unmix spots in a given tile using the saved gene dictionary and settings saved in ops.yml. Then detect gene spots in the resulting gene maps.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple) – Coordinates of tile to load: ROI, Xpos, Ypos.
save_stack (bool, optional) – Whether to save registered and preprocessed images. Defaults to False.
prefix (str, optional) – Prefix of the sequencing read to analyse. Defaults to “genes_round”.
- iss_preprocess.pipeline.sequencing.estimate_channel_correction(data_path, prefix='genes_round', nrounds=7, fit_norm_factors=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Compute grayscale value distribution and normalisation factors
Each of the correction_tiles in ops is filtered before being used to compute the distribution of pixel values. Normalisation factors that equalise these distributions across channels and rounds are defined as the ops[“correction_quantile”] quantile of each distribution.
- Parameters:
data_path (str or Path) – Relative path to the data folder
prefix (str, optional) – Folder name prefix, before round number. Defaults to “genes_round”.
nrounds (int, optional) – Number of rounds. Defaults to 7.
- Returns:
- pixel_dist (np.array): A 65536 x Nch x Nrounds distribution of grayscale values for filtered stacks.
- norm_factors (np.array): A Nch x Nrounds array of normalisation factors.
- Return type:
pixel_dist (np.array)
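Reading a quantile off a grayscale histogram, which is how the normalisation factors are described above, can be sketched as follows. The toy example uses 256 grey levels rather than 65536, and the helper names are hypothetical; how the package handles ties or interpolation is not specified in the text.

```python
import numpy as np


def quantile_from_histogram(hist, quantile):
    """Return the smallest grey value whose cumulative frequency reaches quantile."""
    cdf = np.cumsum(hist) / np.sum(hist)
    return int(np.searchsorted(cdf, quantile))


def norm_factors_from_dist(pixel_dist, quantile):
    """pixel_dist: (n_values, n_ch, n_rounds) histogram -> (n_ch, n_rounds) factors."""
    _, n_ch, n_rounds = pixel_dist.shape
    factors = np.zeros((n_ch, n_rounds))
    for ch in range(n_ch):
        for rnd in range(n_rounds):
            factors[ch, rnd] = quantile_from_histogram(pixel_dist[:, ch, rnd], quantile)
    return factors


# Toy example with 256 grey levels, 2 channels, 1 round
rng = np.random.default_rng(2)
dim = rng.integers(0, 100, size=10_000)
bright = rng.integers(0, 200, size=10_000)
pixel_dist = np.zeros((256, 2, 1))
pixel_dist[:, 0, 0] = np.bincount(dim, minlength=256)
pixel_dist[:, 1, 0] = np.bincount(bright, minlength=256)
norm_factors = norm_factors_from_dist(pixel_dist, quantile=0.99)
print(norm_factors)
```

Dividing each channel and round by its factor would then bring the bright ends of the distributions to a comparable scale.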
- iss_preprocess.pipeline.sequencing.get_reference_spots(data_path, prefix='genes')¶
Load the reference spots for the given dataset.
Internal function for setup_omp and setup_barcode_calling.
- Parameters:
data_path (str) – Relative path to data.
prefix (str, optional) – Short prefix, either ‘genes’ or ‘barcode’. Defaults to ‘genes’.
- Returns:
- pandas.DataFrame: Detected spots.
- list: Normalisation shifts.
- Return type:
pandas.DataFrame
- iss_preprocess.pipeline.sequencing.load_spot_sign_image(data_path, threshold, return_raw_image=False)¶
Load the reference spot sign image to use in spot calling. First, check if the spot sign image has been computed for the current dataset and use it if available. Otherwise, use the spot sign image saved in the repo.
- Parameters:
data_path (str) – Relative path to data.
threshold (float) – Absolute value threshold used to binarize the spot sign image.
return_raw_image (bool, optional) – Whether to return the raw spot sign image. Defaults to False.
- Returns:
Spot sign image after thresholding, containing -1, 0, or 1s.
- Return type:
numpy.ndarray
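The thresholding step described for load_spot_sign_image, which yields an image containing only -1, 0, or 1, can be written in a few lines. This sketch assumes a symmetric absolute-value threshold as stated in the parameter description; threshold_spot_sign is a hypothetical name.

```python
import numpy as np


def threshold_spot_sign(raw_image, threshold):
    """Map values above +threshold to 1, below -threshold to -1, and the rest to 0."""
    raw_image = np.asarray(raw_image, dtype=float)
    sign_image = np.zeros(raw_image.shape, dtype=int)
    sign_image[raw_image > threshold] = 1
    sign_image[raw_image < -threshold] = -1
    return sign_image


raw = np.array([[0.8, 0.1, -0.05], [-0.6, 0.3, 0.0]])
sign_image = threshold_spot_sign(raw, threshold=0.15)
print(sign_image)
```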
- iss_preprocess.pipeline.sequencing.run_omp_on_tile(data_path, tile_coors, ops, save_stack=False, prefix='genes_round')¶
Run OMP on a tile and return the results.
- Parameters:
data_path (str) – Relative path to data.
tile_coors (tuple) – Coordinates of the tile to process.
ops (dict) – Dictionary of parameters.
save_stack (bool, optional) – Whether to save the registered stack. Defaults to False.
prefix (str, optional) – Prefix of the sequencing read to use. Defaults to “genes_round”.
- Returns:
- numpy.ndarray: OMP results.
- dict: Dictionary of OMP parameters.
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.sequencing.setup_barcode_calling(data_path, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Detect spots and compute cluster means
- Parameters:
data_path (str) – Relative path to data
- Returns:
- cluster_means (list): A list with Nrounds elements, each a Nch x Ncl array of cluster means (square because the number of channels is equal to the number of clusters), normalised by round 0 intensity.
- all_spots (pandas.DataFrame): All detected spots.
- Return type:
cluster_means (list)
- iss_preprocess.pipeline.sequencing.setup_omp(data_path, force_redo=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Prepare variables required to run the OMP algorithm. Finds isolated spots using STD across rounds and channels. Detected spots are then used to determine the bleedthrough matrix using scaled k-means.
- Parameters:
data_path (str) – Relative path to data.
force_redo (bool, optional) – Whether to redo the setup. Defaults to False.
- Returns:
- numpy.ndarray: N x M dictionary, where N = R * C and M is the number of genes.
- list: Gene names.
- float: Norm shift for the OMP algorithm, estimated as the median norm of all pixels.
- Return type:
numpy.ndarray
iss_preprocess.pipeline.stitch module¶
- iss_preprocess.pipeline.stitch.calculate_tile_positions(shift_right, shift_down, tile_shape, ntiles, x_direction, y_direction)¶
Calculate position of each tile based on the provided shifts.
- Parameters:
shift_right (numpy.array) – X and Y shifts between different columns. Either a 2-element array or a ntiles[0] x ntiles[1] x 2 matrix of shifts
shift_down (numpy.array) – X and Y shifts between different rows. Either a 2-element array or a ntiles[0] x ntiles[1] x 2 matrix of shifts
tile_shape (numpy.array) – shape of each tile
ntiles (numpy.array) – number of tile rows and columns
- Returns:
- numpy.ndarray: tile_origins, ntiles[0] x ntiles[1] x 2 matrix of tile origin coordinates
- numpy.ndarray: tile_centers, ntiles[0] x ntiles[1] x 2 matrix of tile center coordinates
- Return type:
numpy.ndarray
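For the simple case where shift_right and shift_down are each a single 2-element shift, tile origins can be laid out on a regular grid. The sketch below makes several assumptions that the text does not confirm: steps along the first axis add shift_right, steps along the second add shift_down, the grid is offset so the smallest coordinate is zero, and x_direction/y_direction are ignored. The function name tile_positions is hypothetical.

```python
import numpy as np


def tile_positions(shift_right, shift_down, tile_shape, ntiles):
    """Return (origins, centers), each of shape ntiles[0] x ntiles[1] x 2."""
    shift_right = np.asarray(shift_right, dtype=float)
    shift_down = np.asarray(shift_down, dtype=float)
    ix, iy = np.meshgrid(np.arange(ntiles[0]), np.arange(ntiles[1]), indexing="ij")
    # Each step along the first axis adds shift_right, along the second shift_down
    origins = ix[..., None] * shift_right + iy[..., None] * shift_down
    # Move the grid so that the smallest coordinate on each axis is zero
    origins -= origins.reshape(-1, 2).min(axis=0)
    centers = origins + np.asarray(tile_shape, dtype=float) / 2
    return origins, centers


origins, centers = tile_positions(
    shift_right=[900, 5], shift_down=[-4, 910], tile_shape=[1000, 1000], ntiles=[3, 2]
)
print(origins[2, 1], centers[0, 0])
```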
- iss_preprocess.pipeline.stitch.find_tile_order(data_path, prefix=None, xy_stage_name='XYStage', z_stage_name='ZDrive', verbose=True)¶
Find the order of tiles in a multi-tile acquisition
- Parameters:
data_path (str) – Relative path to data
prefix (str, optional) – Acquisition prefix. If None, will use the one in ops. Defaults to None.
xy_stage_name (str, optional) – Name of the XY stage. Defaults to “XYStage”.
z_stage_name (str, optional) – Name of the Z stage. If None, will not load Z positions. Defaults to “ZDrive”.
verbose (bool, optional) – Print information about the number of tiles found. Defaults to True.
- Returns:
- dict: Dictionary of tile order with tuple (roi, col, row) as key and acquisition order (across all ROIs) as value.
- pandas.DataFrame: DataFrame containing tile position information.
- Return type:
dict
- iss_preprocess.pipeline.stitch.find_tile_overlap(data_path, ref_prefix, tile_coor1, tile_coor2)¶
Find the overlap between two tiles
If tile1 is the stack, the overlap can be accessed by: tile1[overlap_tile_1[1]:overlap_tile_1[3], overlap_tile_1[0]:overlap_tile_1[2]]
- Parameters:
rect1 (tuple) – Rectangle coordinates (x0, y0, x1, y1)
rect2 (tuple) – Rectangle coordinates (x0, y0, x1, y1)
- Returns:
- tuple: Overlap in global coordinates (x0, y0, x1, y1)
- tuple: Overlap in tile 1 (x0, y0, x1, y1)
- tuple: Overlap in tile 2 (x0, y0, x1, y1)
- Return type:
tuple
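The indexing convention quoted for find_tile_overlap, with rectangles given as (x0, y0, x1, y1) and rows indexed by y, can be checked on two synthetic rectangles. The helper rectangle_overlap below is a hypothetical stand-in that works on plain rectangles; it does not load tiles or shifts from data_path.

```python
import numpy as np


def rectangle_overlap(rect1, rect2):
    """Return (global, in_rect1, in_rect2) overlaps as (x0, y0, x1, y1), or None."""
    x0, y0 = max(rect1[0], rect2[0]), max(rect1[1], rect2[1])
    x1, y1 = min(rect1[2], rect2[2]), min(rect1[3], rect2[3])
    if x0 >= x1 or y0 >= y1:
        return None  # the rectangles do not overlap

    def to_local(rect):
        # Express the overlap relative to the rectangle's own origin
        return (x0 - rect[0], y0 - rect[1], x1 - rect[0], y1 - rect[1])

    return (x0, y0, x1, y1), to_local(rect1), to_local(rect2)


# Two 100 x 100 tiles; tile 2 sits 90 px to the right and 2 px down
rect1 = (0, 0, 100, 100)
rect2 = (90, 2, 190, 102)
overlap, overlap_tile_1, overlap_tile_2 = rectangle_overlap(rect1, rect2)

tile1 = np.arange(100 * 100).reshape(100, 100)
# Rows are indexed by y and columns by x, as in the text
patch = tile1[overlap_tile_1[1]:overlap_tile_1[3], overlap_tile_1[0]:overlap_tile_1[2]]
print(overlap, overlap_tile_1, overlap_tile_2, patch.shape)
```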
- iss_preprocess.pipeline.stitch.get_tform_to_ref(data_path, prefix, tile_coors, corrected_shifts=None)¶
Load the transformation to reference for a tile
- Parameters:
data_path (str) – Relative path to data
prefix (str) – Acquisition prefix
tile_coors (tuple) – (roi, tileX, tileY) tuple
corrected_shifts (str, optional) – Method used to correct shifts to reference. If None, will use the one in ops. Defaults to None.
- Returns:
A dictionary with the transformation parameters
- Return type:
np.array
- iss_preprocess.pipeline.stitch.get_tile_corners(data_path, prefix, roi)¶
Find the corners of all tiles for a roi
- Parameters:
data_path (str) – Relative path to data
prefix (str) – Acquisition prefix. For round-based acquisition, round 1 will be used
roi (int) – Roi ID
- Returns:
- tile_corners, ntiles[0] x ntiles[1] x 2 x 4 matrix of tile
corners coordinates. Corners are in this order: [(origin), (0, 1), (1, 1), (1, 0)]
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.stitch.load_tile_ref_coors(data_path, tile_coors, prefix, filter_r=True, projection=None, correct_illumination=True)¶
Load one single tile in the reference coordinates
This loads a tile of prefix with channels/rounds registered
- Parameters:
data_path (str) – Relative path to data
tile_coors (tuple) – (Roi, tileX, tileY) tuple
prefix (str) – Acquisition to load. If genes_round or barcode_round will load all the rounds.
filter_r (bool, optional) – Apply filter on rounds data? Parameters will be read from ops. Defaults to True
projection (str, optional) – Projection to load. If None, will use the one in ops. Defaults to None
correct_illumination (bool, optional) – Apply illumination correction. Defaults to True
- Returns:
- np.array: A (X x Y x Nchannels x Nrounds) registered stack
- np.array: A (X x Y) boolean array of bad pixels that fall outside the image after registration
- Return type:
np.array
- iss_preprocess.pipeline.stitch.register_adjacent_tiles(data_path, ref_coors=None, ref_ch=0, suffix='max', prefix='genes_round_1_1', correct_illumination=False, overlap_ratio=0.01, verbose=True, debug=False)¶
Estimate shift between adjacent imaging tiles using phase correlation.
Shifts are typically very similar between different tiles, so using shifts estimated from a single reference tile for the whole acquisition works well.
- Parameters:
data_path (str) – path to image stacks.
ref_coors (tuple, optional) – coordinates of the reference tile to use for registration. Must not be along the bottom or right edge of image. If None use ops[‘ref_tile’]. Defaults to None.
ref_ch (int, optional) – reference channel used for registration. Defaults to 0.
suffix (str, optional) – File name suffix. Defaults to ‘max’.
prefix (str, optional) – Full name of the acquisition folder
correct_illumination (bool, optional) – Remove black levels and correct illumination before registration if True, return raw data otherwise. Defaults to False
overlap_ratio (float, optional) – Minimum overlap between masks to consider the correlation results. Defaults to 0.01.
verbose (bool, optional) – If True, print warnings when shifts are large. Defaults to True.
debug (bool, optional) – Return additional information for debugging. Defaults to False.
- Returns:
- numpy.array: shift_right, X and Y shifts between different columns
- numpy.array: shift_down, X and Y shifts between different rows
- numpy.array: shape of the tile
- Return type:
numpy.array
- iss_preprocess.pipeline.stitch.register_all_rois_within(data_path, prefix=None, ref_ch=None, suffix='max-median', correct_illumination=True, roi2use=None, reload=False, save_plot=True, dimension_prefix=None, verbose=1, use_slurm=True, job_dependency=None, scripts_name=None, slurm_folder=None)¶
Register all tiles within each ROI
- Parameters:
data_path (str) – Relative path to data
prefix (str, optional) – Prefix of acquisition to register. If None, will use the one in ops. Defaults to None.
ref_ch (int, optional) – Reference channel to use for registration. If None, will use the one in ops. Defaults to None.
suffix (str, optional) – Suffix to use to load the images. Defaults to ‘max-median’.
correct_illumination (bool, optional) – Correct illumination before registration. Defaults to True.
roi2use (list, optional) – List of ROIs to use. If None or empty, will process all ROIs. Defaults to None.
reload (bool, optional) – Reload saved shifts if True. Defaults to False.
save_plot (bool, optional) – Save diagnostic plot. Defaults to True.
dimension_prefix (str, optional) – Prefix to use to find ROI dimension. Used only if the acquisition is an overview. Defaults to None in the signature; the docstring lists ‘reference_prefix’.
verbose (int, optional) – Verbosity level. Defaults to 1.
use_slurm (bool, optional) – Use SLURM to parallelize the registration. Defaults to True.
job_dependency (list, optional) – List of job dependencies. Defaults to None.
scripts_name (str, optional) – Script name for SLURM jobs. Defaults to None.
slurm_folder (str, optional) – Folder to save SLURM logs. Defaults to None.
- Returns:
List of outputs from register_within_acquisition
- Return type:
list
- iss_preprocess.pipeline.stitch.register_within_acquisition(data_path, roi, prefix=None, ref_ch=None, suffix='max', correct_illumination=False, reload=True, save_plot=False, dimension_prefix='genes_round_1_1', min_corrcoef=0.6, max_delta_shift=20, verbose=2, raise_on_empty_line=False, *, use_slurm=False, dependency_type=None, job_dependency=None, slurm_folder=None, scripts_name=None, slurm_options=None, batch_param_names=None, batch_param_list=None)¶
Estimate shifts between all adjacent tiles of an roi
Saves shifts as reg/f”{prefix}_within”/f”{prefix}_{roi}_shifts.npz”
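The save location can be assembled with `pathlib`. This is a sketch under assumptions: `processed_root` is a placeholder for wherever the package stores processed data, which is not specified here, and `shifts_path` is a hypothetical helper rather than part of the package.

```python
from pathlib import Path


def shifts_path(processed_root, prefix, roi):
    """Build the path of the shifts file for one ROI of one acquisition."""
    return (
        Path(processed_root)
        / "reg"
        / f"{prefix}_within"
        / f"{prefix}_{roi}_shifts.npz"
    )


path = shifts_path("processed", "genes_round_1_1", 2)
print(path.as_posix())
# processed/reg/genes_round_1_1_within/genes_round_1_1_2_shifts.npz
```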
- Parameters:
data_path (str) – path to image stacks.
roi (int) – id of ROI to load.
prefix (str, optional) – Full name of the acquisition folder.
ref_ch (int, optional) – reference channel used for registration. Defaults to None.
suffix (str, optional) – File name suffix. Defaults to ‘max’.
correct_illumination (bool, optional) – If True, remove black levels and correct illumination before registration; otherwise use raw data. Defaults to False.
reload (bool, optional) – If target file already exists, reload instead of recomputing. Defaults to True
save_plot (bool, optional) – If True save diagnostic plot. Defaults to False
dimension_prefix (str, optional) – Prefix to use to find ROI dimension. Used only if the acquisition is an overview. Defaults to ‘genes_round_1_1’
min_corrcoef (float, optional) – Minimum correlation coefficient to consider a shift as valid. Defaults to 0.6.
max_delta_shift (int, optional) – Maximum shift, relative to median of the row or column, to consider a shift as valid. Defaults to 20.
verbose (int, optional) – Verbosity level. Defaults to 2.
raise_on_empty_line (bool, optional) – Raise an error if a row or a column has no valid shifts. If False, replace by the global median. Defaults to False.
- Returns:
dictionary containing the shifts, tile shape and number of tiles
- Return type:
dict
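The roles of min_corrcoef, max_delta_shift and raise_on_empty_line can be illustrated with a small filter over the shifts of one row (or column) of tiles. This is a sketch, not the package's code: `clean_line_shifts` is a hypothetical helper, and the order of the checks and the choice to fill rejected entries with the median of the remaining ones are assumptions.

```python
from statistics import median


def _median_xy(points):
    """Component-wise median of a list of (x, y) shifts."""
    return (median(p[0] for p in points), median(p[1] for p in points))


def clean_line_shifts(shifts, corrcoefs, global_median, min_corrcoef=0.6,
                      max_delta_shift=20, raise_on_empty_line=False):
    """Replace unreliable (x, y) shifts of one row/column of tiles."""
    # Step 1: reject shifts whose correlation coefficient is too low.
    ok = [c >= min_corrcoef for c in corrcoefs]
    kept = [s for s, good in zip(shifts, ok) if good]
    if kept:
        # Step 2: reject shifts too far from the median of the line.
        centre = _median_xy(kept)
        ok = [
            good and max(abs(s[0] - centre[0]), abs(s[1] - centre[1]))
            <= max_delta_shift
            for s, good in zip(shifts, ok)
        ]
        kept = [s for s, good in zip(shifts, ok) if good]
    if not kept:
        # No valid shift left: fail loudly or fall back to the global median.
        if raise_on_empty_line:
            raise ValueError("No valid shift in this row or column")
        return [tuple(global_median)] * len(shifts)
    fill = _median_xy(kept)
    return [s if good else fill for s, good in zip(shifts, ok)]


# Third shift is an outlier, fourth has a low correlation coefficient.
cleaned = clean_line_shifts(
    shifts=[(900, -4), (902, -3), (650, 40), (899, -5)],
    corrcoefs=[0.9, 0.8, 0.95, 0.3],
    global_median=(900, -4),
)
empty_line = clean_line_shifts([(0, 0)], [0.1], global_median=(5, 5))
```

Using the median rather than the mean keeps a single badly registered tile pair from dragging the fill value away from the consensus.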
- iss_preprocess.pipeline.stitch.stitch_and_register(data_path, target_prefix, reference_prefix=None, roi=1, downsample=3, ref_ch=0, target_ch=0, estimate_scale=False, estimate_rotation=True, target_projection=None, use_masked_correlation=False, debug=False)¶
Stitch target and reference stacks and align target to reference
To speed up registration, images are downsampled before estimating registration parameters. These parameters are then applied to the full scale image.
The reference stack always uses the “projection” from ops as suffix. The target uses the same by default, but this can be overridden with target_projection.
This does not use ops[‘max_shift_rounds’].
- Parameters:
data_path (str) – Relative path to data.
reference_prefix (str, optional) – Acquisition prefix to register the stitched image to. Typically, “genes_round_1_1”. Defaults to None.
target_prefix (str) – Acquisition prefix to register.
roi (int, optional) – ROI ID to register (as specified in MicroManager). Defaults to 1.
downsample (int, optional) – Downsample factor for estimating registration parameters. Defaults to 3.
ref_ch (int, optional) – Channel of the reference image used for registration. Defaults to 0.
target_ch (int, optional) – Channel of the target image used for registration. Defaults to 0.
estimate_scale (bool, optional) – Whether to estimate scaling between target and reference images. Defaults to False.
estimate_rotation (bool, optional) – Whether to estimate rotation between target and reference images. Defaults to True.
target_projection (str, optional) – Projection suffix to use for target stack. If None, will use the value from ops. Defaults to None.
use_masked_correlation (bool, optional) – Use masked correlation for registration. Defaults to False.
debug (bool, optional) – If True, return full xcorr. Defaults to False.
- Returns:
Stitched target image after registration.
numpy.ndarray: Stitched reference image.
float: Estimated rotation angle.
tuple: Estimated X and Y shifts.
float: Estimated scaling factor.
dict: Debug information if debug is True.
- Return type:
numpy.ndarray
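The idea of estimating parameters on downsampled images and applying them at full scale can be sketched for the translation component alone. Everything here is illustrative: `phase_correlation_shift` is a hypothetical helper based on plain phase correlation (not necessarily the method used by the package), downsampling is done by simple striding, and rotation and scale are ignored.

```python
import numpy as np


def phase_correlation_shift(ref, target):
    """Integer (row, col) shift to apply to `target` with np.roll to match `ref`."""
    cross = np.fft.fft2(ref) * np.conj(np.fft.fft2(target))
    cross /= np.abs(cross) + 1e-12  # keep only the phase
    xcorr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= n // 2 else int(p - n)
                 for p, n in zip(peak, xcorr.shape))


rng = np.random.default_rng(0)
ref = rng.random((120, 120))
target = np.roll(ref, (12, -18), axis=(0, 1))  # known misalignment

downsample = 3
small_shift = phase_correlation_shift(
    ref[::downsample, ::downsample], target[::downsample, ::downsample]
)
# A shift measured on the downsampled grid is `downsample` times larger
# at full resolution.
full_shift = tuple(downsample * s for s in small_shift)
registered = np.roll(target, full_shift, axis=(0, 1))
```

The FFTs run on images with `downsample ** 2` times fewer pixels, at the cost of a shift estimate that is only accurate to within `downsample` full-resolution pixels.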
- iss_preprocess.pipeline.stitch.stitch_registered(data_path, prefix, roi, channels=0, ref_prefix=None, filter_r=False, projection=None, correct_illumination=True)¶
Load registered stacks and stitch them
The output is in reference coordinates.
- Parameters:
data_path (str) – Relative path to data
prefix (str) – Prefix of acquisition to stitch
roi (int) – Roi ID
channels (list or int, optional) – Channel id(s). Defaults to 0.
ref_prefix (str, optional) – Prefix of reference acquisition to load shifts. If None, load from ops. Defaults to None.
filter_r (bool, optional) – Filter image before stitching? Defaults to False.
projection (str, optional) – Projection to load. If None, will use the one in ops. Defaults to None.
correct_illumination (bool, optional) – Correct illumination before stitching. Defaults to True.
- Returns:
stitched stack
- Return type:
np.array
- iss_preprocess.pipeline.stitch.stitch_tiles(data_path, prefix, roi=1, suffix='max', ich=0, correct_illumination=False, shifts_prefix=None, register_channels=True, allow_quick_estimate=False, filter_r=False)¶
Load and stitch tile images using saved tile shifts.
This will load the tile shifts saved by register_within_acquisition
- Parameters:
data_path (str) – path to image stacks.
prefix (str) – prefix specifying which images to load, e.g. ‘round_01_1’
roi (int, optional) – id of ROI to load. Defaults to 1.
suffix (str, optional) – filename suffix. Defaults to ‘max’.
ich (int, optional) – index of the channel to stitch. Defaults to 0.
correct_illumination (bool, optional) – If True, remove black levels and correct illumination; otherwise use raw data. Defaults to False.
shifts_prefix (str, optional) – prefix to use to load tile shifts. If not provided, use prefix. Defaults to None.
register_channels (bool, optional) – If True, register channels before stitching. Defaults to True.
allow_quick_estimate (bool, optional) – If True, will estimate shifts from a single tile if shifts.npz is not found. Defaults to False.
- Returns:
stitched image.
- Return type:
numpy.ndarray
- iss_preprocess.pipeline.stitch.warp_stack_to_ref(stack, data_path, prefix, tile_coors, interpolation=1, bad_pixels=None)¶
Warp a stack to the reference coordinates
- Parameters:
stack (np.array) – A (X x Y x Nchannels x Nrounds) stack
data_path (str) – Relative path to data
prefix (str) – Acquisition to use to find registration parameters
tile_coors (tuple) – (Roi, tileX, tileY) tuple
interpolation (int, optional) – Interpolation order. Defaults to 1.
bad_pixels (np.array, optional) – A (X x Y) boolean array of bad pixels that fall outside image after registration. If None, will not apply any mask. Defaults to None.
- Returns:
A (X x Y x Nchannels x Nrounds) registered stack
np.array: A (X x Y) boolean array of bad pixels that fall outside image after registration
- Return type:
np.array
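The relationship between the warped stack and the bad-pixel mask can be illustrated with an integer translation. This is a simplified sketch: `shift_with_mask` is a hypothetical helper, and the real function applies registration parameters looked up from prefix and tile_coors, which may involve more than a translation.

```python
import numpy as np


def _overlap(n, d):
    """Destination and source slices along an axis of length n for shift d."""
    lo, hi = max(d, 0), min(n + d, n)
    if hi <= lo:  # shifted completely out of the frame
        return slice(0, 0), slice(0, 0)
    return slice(lo, hi), slice(lo - d, hi - d)


def shift_with_mask(stack, shift):
    """Translate an (X, Y, ...) stack by integer (dx, dy).

    Returns the shifted stack and an (X, Y) boolean mask that is True where
    no source pixel landed, i.e. where the output holds no valid data.
    """
    nx, ny = stack.shape[:2]
    dst_x, src_x = _overlap(nx, shift[0])
    dst_y, src_y = _overlap(ny, shift[1])
    warped = np.zeros_like(stack)
    bad = np.ones((nx, ny), dtype=bool)
    warped[dst_x, dst_y] = stack[src_x, src_y]
    bad[dst_x, dst_y] = False
    return warped, bad


# (X x Y x Nchannels x Nrounds) stack with distinct values everywhere.
stack = np.arange(4 * 5 * 2 * 3, dtype=float).reshape(4, 5, 2, 3)
warped, bad = shift_with_mask(stack, shift=(1, -2))
```

Returning the mask alongside the data lets downstream steps ignore the zero-filled border instead of treating it as signal.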