Counting cell neighbors in tissues#

When measuring properties of cells in the context of their neighbors, it makes a lot of sense to simulate tissues. For the 2D case, thise can be done using clesperanto’s build-in function artificial_tissue_2d()

import pyclesperanto_prototype as cle
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import math
cle.artificial_tissue_2d()
cle._ image
shape(256, 256)
dtypeuint32
size256.0 kB
min1.0
max176.0

As you can see, the simulated 2D tissue shows the common honeycomb-pattern of cells in tissues with some random variation. The amount of randomness can be controlled by changing the sigma value of the Gaussian distribution of the local shift of cell centers in pixel unit.

cle.artificial_tissue_2d(random_sigma_x=0, random_sigma_y=0)
cle._ image
shape(256, 256)
dtypeuint32
size256.0 kB
min1.0
max176.0

Furthermore, the elongation of the cells can be modified by changing the step size of cell center positions in pixel unit.

cle.artificial_tissue_2d(delta_y=40)
cle._ image
shape(256, 256)
dtypeuint32
size256.0 kB
min1.0
max77.0

Using simulations of tissues for validating neighbor count measurements#

A common task in quantitative tissue imaging analysis is counting the number of neighbors. There are multiple algorithms available for that which may, or may not, be invariant against the random variation shown above. Here we use two:

  • We will count the number of cells that are directly touching in at least one pixel.

  • We will count the number of cells with centroid positions in a given radius.

We can use both methods from clesperanto to visualize the neighbor-count in color:

labels = cle.artificial_tissue_2d()
cle.touching_neighbor_count_map(labels)
cle._ image
shape(256, 256)
dtypefloat32
size256.0 kB
min1.0
max8.0
radius = 25

cle.proximal_neighbor_count_map(labels, max_distance=radius)
cle._ image
shape(256, 256)
dtypefloat32
size256.0 kB
min1.0
max7.0

From these two views, we can see that the algorithms differ quite a bit in counting the number of neighbors locally.

Varying parameters#

In order to investigate how much the algorithms suffer from randomly varying position of the cells, we can apply the two methods in a for-loop varying the sigma parameters introduced above.

Before doing that we define a helper function for visualizing the results.

def measure_and_visualize_neighbor_count(labels):
    # measure
    touching_neighbor_count_map = cle.touching_neighbor_count_map(labels)
    radius = 25
    proximal_neighbor_count_map = cle.proximal_neighbor_count_map(labels, max_distance=radius)
    
    # visualize
    fix, ax = plt.subplots(1, 3, figsize=[10,10])
    cle.imshow(labels, labels=True, plot=ax[0])
    cle.imshow(touching_neighbor_count_map, 
               plot=ax[1], 
               min_display_intensity=0, 
               max_display_intensity=10,
               colormap='rainbow')
    cle.imshow(proximal_neighbor_count_map, 
               plot=ax[2], 
               min_display_intensity=0, 
               max_display_intensity=10,
               colormap='rainbow')
measure_and_visualize_neighbor_count(labels)
../_images/f63f7e4842d48d29ad2703ed618d31c2d0fb0ec2f7bbfdf7c72a5b7c0da158f9.png

Now we can program the for-loop that varies the sigma.

for sigma in range(0, 20, 1):
    cells = cle.artificial_tissue_2d(
        random_sigma_x=sigma, 
        random_sigma_y=sigma)
    
    measure_and_visualize_neighbor_count(cells)
../_images/42d7078fe6f3df913ef11a20a9a643fdc5ce07697720474f492d932f5d3d9cb1.png ../_images/a8c69491fbfbadf6f687e914b3425e4f55d6240ff5a92898fb12523a2719e47a.png ../_images/796fdd506a96a6e1fa66fab883a741a022b528b9d87d0e0a72128a6180bb5534.png ../_images/5b1b0b3497fd1ea90abf3dbee900123ec3b0910f800dbabf27c0fdc00aa79f68.png ../_images/616283f8f5b3a4b46d20302c0a7c0f55e8c5b3da80a9d79a582cc46cfc9e09f3.png ../_images/e3ae10ed3616e54de72aae5df1a6059a60ec1d852accb7e872cebbdb4726193b.png ../_images/c6b4cae514c1c0371e596589d40fd798c8558eddf01a0ddf392c7115ecd39b01.png ../_images/3081ac5c9593747312bee0a914b84230e4b0f9903e76fb3fc8aca0847bf3083f.png ../_images/60f89909afa21c0fede095383887bda6aaf35df2cc39b74d79b35fbe1d1596c8.png ../_images/a43c2b8410add7317c4e7930cb54eb48781161685af5dfe96742deed0ee10e4a.png ../_images/48795d04782e48d8b2f7768fcf2a3abc09ebd41e950b24f0bc2250005e1aa277.png ../_images/e163625a8e4419a37589bb2e0fce97fbc7be0f6fc60c61654acde360ca4377e3.png ../_images/45983ff62c60036466bbcb7a1d8e926186f9fd648bffe515db961379b4ca37a1.png ../_images/8dda34e8c881087a6f1aff4eaf02c2b644f69ec823f34aa02138dbcec51add8d.png ../_images/f10a96e027a7d6a01c104787cf441760b096853603ca248b02ca123c1a3d9327.png ../_images/f469139000f72252628cd96baa190dd08ae85b15a06f3c4a9b4767cf96401683.png ../_images/021344575388f7ba83cacb65a57b73a324231168cbfb46d5036c69158b8b6846.png ../_images/f43d220401732f6bb2a3018e6cd87640800861405bb1973b89f6430abaf42abc.png ../_images/33c966f1e486fc8689d362be64b9f4b44214540030bf26629eecc8d7b50191d4.png ../_images/0ae316d31f6dbf25ebfefe2e9e30b7a91a5a65b17813ab3914900b769d0c3efb.png

This visualization can be interpreted as follows:

  • The touching neighbor count appears to sense more neighbors touching, even though the number of cells per area tissue remains constant.

  • The proximal neighbor count has local fluctuations in the neighbor count which on average might cancel out.

We can further investiate these statements by quantitatively measuring the neighbor count and the standard deviation. We again write a helper function similar to the one above but this time it measures the average neighbor count. Additionally, we will count the total number of cells as additional quality check. The helper function will return our measurements as dictionary so that we can conveniently collect them in a pandas DataFrame.

def measure_neighbor_count(labels):
    # measure
    touching_neighbor_count_map = cle.touching_neighbor_count_map(labels)
    radius = 25
    proximal_neighbor_count_map = cle.proximal_neighbor_count_map(labels, max_distance=radius)
    
    # we should not measure number of neighbors in 
    # cells that touch the image border.
    labels_excluding_border = cle.exclude_labels_on_edges(labels)

    touching_neighbor_count_list = cle.read_intensities_from_map(labels_excluding_border, touching_neighbor_count_map)
    proximal_neighbor_count_list = cle.read_intensities_from_map(labels_excluding_border, proximal_neighbor_count_map)

    return {
        "number_of_cells" : [labels_excluding_border.max()],
        "touching_neighbor_count_mean" : [np.mean(touching_neighbor_count_list)],
        "touching_neighbor_count_std" : [np.std(touching_neighbor_count_list)],
        "proximal_neighbor_count_mean" : [np.mean(proximal_neighbor_count_list)],
        "proximal_neighbor_count_std" : [np.std(proximal_neighbor_count_list)]
    }
pd.DataFrame(measure_neighbor_count(labels)).T
0
number_of_cells 130.000000
touching_neighbor_count_mean 5.908397
touching_neighbor_count_std 0.823791
proximal_neighbor_count_mean 5.190840
proximal_neighbor_count_std 0.782582

We can now run our for-loop again, collect those values and plot them.

statistics_over_sigma = None

for sigma in range(0, 20, 1):
    cells = cle.artificial_tissue_2d(
        random_sigma_x=sigma, 
        random_sigma_y=sigma)
    
    measurements = measure_neighbor_count(cells)
    measurements["sigma"] = sigma
    statistics = pd.DataFrame(measurements)
    
    if statistics_over_sigma is None:
        statistics_over_sigma = pd.DataFrame(statistics)
    else:
        statistics_over_sigma = pd.concat([statistics_over_sigma, statistics])

statistics_over_sigma
number_of_cells touching_neighbor_count_mean touching_neighbor_count_std proximal_neighbor_count_mean proximal_neighbor_count_std sigma
0 126.0 5.984252 0.176771 5.984252 0.176771 0
0 129.0 5.961538 0.228838 5.692307 0.552563 1
0 129.0 5.961538 0.228838 5.453846 0.621556 2
0 128.0 5.891473 0.759848 5.201550 0.892517 3
0 126.0 5.968504 0.762683 4.976378 0.846154 4
0 130.0 5.786260 1.229458 4.404580 1.075605 5
0 125.0 5.928571 1.229134 4.317461 0.939865 6
0 127.0 5.703125 1.480917 4.429688 1.315078 7
0 129.0 5.761539 1.507947 4.230769 1.224986 8
0 121.0 5.754098 1.784778 3.975410 1.217786 9
0 132.0 5.631579 1.749215 4.075188 1.509987 10
0 130.0 5.557252 1.769807 4.022901 1.570071 11
0 132.0 5.631579 1.642818 4.210526 1.387661 12
0 131.0 5.772727 1.607704 4.515152 1.607204 13
0 132.0 5.368421 1.986699 3.879699 1.631631 14
0 129.0 5.684616 1.696307 4.423077 1.718073 15
0 131.0 5.598485 1.623034 4.234848 1.926135 16
0 136.0 5.532847 1.868100 4.343066 1.658589 17
0 130.0 5.282443 2.151778 3.946565 1.899429 18
0 129.0 5.438461 1.976803 4.007692 1.610046 19

First, let’s plot the neighbor count together with the standard error as error bar.

plt.errorbar(x=statistics_over_sigma["sigma"], 
             y=statistics_over_sigma["touching_neighbor_count_mean"],
             yerr=statistics_over_sigma["touching_neighbor_count_std"] / 
                  np.sqrt(statistics_over_sigma["number_of_cells"])
            )
plt.errorbar(x=statistics_over_sigma["sigma"], 
             y=statistics_over_sigma["proximal_neighbor_count_mean"],
             yerr=statistics_over_sigma["proximal_neighbor_count_std"] / 
                  np.sqrt(statistics_over_sigma["number_of_cells"])
            )
plt.legend(["Touching neighbor count", "Proximal_neighbor_count"])
plt.xlabel("sigma")
plt.ylabel("neighbor count")
Text(0, 0.5, 'neighbor count')
../_images/0b54e0535e3ae7770801baaa75395e92f669653f2f5c106bde5990797e5c3c3f.png

Using this visualization, we could argue that the number of touching neighbors is the more stable measurement in the presence of varying cell positions. Also note that the tissues shown above might not be the best representation of biological tissues.

Last but not least a sanity check if the number of cells in the image was more or less independent from the sigma.

plt.scatter(x=statistics_over_sigma["sigma"],
            y=statistics_over_sigma["number_of_cells"])
plt.xlabel("sigma")
plt.ylabel("Cell count")
Text(0, 0.5, 'Cell count')
../_images/6db543b6a818ce7f76eec13d51fbd759fe5173cae6612d7141a9e6968998ba77.png
np.mean(statistics_over_sigma["number_of_cells"]), np.std(statistics_over_sigma["number_of_cells"])
(129.1, 3.0805843)

Exercise#

Repeat the experiment shown above with cells that are elongated along x. If the conclusion is different, can you explain why?