Distance-based colocalization#

A common bio-image analysis task is counting cells and other objects depending on their existence in multiple channels and distances between these objects. When analyzing objects according to distances to object in other channels, one can also count different objects which express signal in different channels. If objects are in different places close-by, they are technically not colocalized, but the analysis shown here also work for these use-cases.

import pyclesperanto_prototype as cle
import numpy as np
from skimage.io import imread, imshow
import matplotlib.pyplot as plt

<Intel(R) Iris(R) Xe Graphics on Platform: Intel(R) OpenCL HD Graphics (1 refs)>

We’re using a dataset published by Heriche et al. licensed CC BY 4.0 available in the Image Data Resource.

# load file
raw_image = imread('../../data/plate1_1_013 [Well 5, Field 1 (Spot 5)].png')

# pixel size (from meta data)
pixel_size_xy_in_micron = 0.32

# visualize
<matplotlib.image.AxesImage at 0x1f65e1a1850>

First, we need to split channels (read more). After that, we can actually see that not all cells marked with Cy3 (channel 0) are also marked with eGFP (channel 1):

# extract channels
channel_0 = raw_image[...,0]
channel_1 = raw_image[...,1]

# visualize
fig, axs = plt.subplots(1, 2, figsize=(15,15))
axs[0].imshow(channel_0, cmap='gray')
axs[1].imshow(channel_1, cmap='gray')
<matplotlib.image.AxesImage at 0x1f65e2009a0>

Segmenting nuclei#

As the staining marks the whole nucleus in both cases, it is reasonable to segmentn nuclei in both images and then process the segmented images further. We use Voronoi-Otsu-Labeling for the segmentation because it is a quick and straightforward approach.

# segmentation
nuclei_cy3 = cle.voronoi_otsu_labeling(channel_0, spot_sigma=20)
nuclei_egfp = cle.voronoi_otsu_labeling(channel_1, spot_sigma=20)

# visualize
fig, axs = plt.subplots(1, 2, figsize=(15,15))
cle.imshow(nuclei_cy3, plot=axs[0], labels=True)
cle.imshow(nuclei_egfp, plot=axs[1], labels=True)

The above shown label images have inside nuclei pixel intensity values that correspond to the number of the nucleus. In nucleus 1, all pixels have intensity 1. In nucleus 2, all pixels have intensity 2 and so on. We can then reduce these images to their centroids to later determine distances between objects in multiple channels.

centroids_cy3 = cle.centroids_of_labels(nuclei_cy3)
centroids_egfp = cle.centroids_of_labels(nuclei_egfp)

These centroids_... variables are coordinate lists. Hence, if we print out their shape, we can already note that there are more cy3 positive nuclei than egfp postives:

print(centroids_cy3.shape, centroids_egfp.shape)
(2, 31) (2, 23)

Distance matrix#

We compute now the distance of all centroids_cy3 versus all centroids_egfp. Note the shape of the distance matrix tells us that all cy3 positives lie on the x-axis and egfp-positive centroids on the y-axis:

distance_matrix = cle.generate_distance_matrix(centroids_cy3, centroids_egfp)
cle._ image
shape(24, 32)
size3.0 kB

This matrix can be thresholded to count the number of positive objects which are in a given radius of objects in the other channel. Therefore, we fix a threshold taking the pixel size into account.

maximum_distance_in_micron = 5
maximum_distnace_in_pixels = maximum_distance_in_micron / pixel_size_xy_in_micron
binary_neighbor_count_matrix = distance_matrix < maximum_distnace_in_pixels

# we do not want to count background as a close-by object
cle.set_column(binary_neighbor_count_matrix, 0, 0)
cle.set_row(binary_neighbor_count_matrix, 0, 0)

cle._ image
shape(24, 32)
size768.0 B

Counting neighbors#

We can now count for each cy3-positive nucleus the number of objects that are also egfp-positive by projecting this matrix in y:

positive_cy3_double_positive = np.asarray(cle.maximum_y_projection(binary_neighbor_count_matrix))[0, 1:]
# [0, 1:] is necessary to get rid of the first column which corresponds to background
array([1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 0., 1., 0., 1., 1., 0., 0., 0., 0., 0., 1.],

There are that many cy3 positives that are also egfp positive:


There are that many cy3 positives that are egfp negative

(positive_cy3_double_positive == 0).sum()

Analogously, we can count the egfp positives that are also cy3 positive using a maximum-projection in x:

positive_egfp_double_positive = np.asarray(cle.maximum_x_projection(binary_neighbor_count_matrix))[1:,0]
# [1:, 0] is necessary to get rid of the first column which corresponds to background
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1.], dtype=float32)

There are that many egfp positives that are also cy3 positive:


And there are that many egfp positives that are cy3 negative:

(positive_egfp_double_positive == 0).sum()