Counting nuclei in tiles#
In this notebook we will process a big dataset that has been saved in zarr format to count cells in individual tiles. For every tile, we will write a pixel in an output image. Hence, we are producing a cell-count image that is smaller than the original image by a factor that corresponds to the tile size.
import zarr
import dask.array as da
import numpy as np
from skimage.io import imread
import pyclesperanto_prototype as cle
from pyclesperanto_prototype import imshow
from numcodecs import Blosc
For demonstration purposes, we use a dataset that is provided by Theresa Suckert, OncoRay, University Hospital Carl Gustav Carus, TU Dresden. The dataset is licensed License: CC-BY 4.0. We are using a cropped version here that was resaved a 8-bit image to be able to provide it with the notebook. You find the full size 16-bit image in CZI file format online.
image = imread('../../data/P1_H_C3H_M004_17-cropped.tif')[1]
# for testing purposes, we crop the image even more.
# comment out the following line to run on the whole 5000x2000 pixels
image = image[1000:1500, 1000:1500]
#compress AND change the numpy array into a zarr array
compressor = Blosc(cname='zstd', clevel=3, shuffle=Blosc.BITSHUFFLE)
# Convert image into zarr array
chunk_size = (100, 100)
zarray = zarr.array(image, chunks=chunk_size, compressor=compressor)
# save zarr to disk
zarr_filename = '../../data/P1_H_C3H_M004_17-cropped.zarr'
zarr.convenience.save(zarr_filename, zarray)
Loading the zarr-backed image#
Dask brings built-in support for the zarr file format. We can create dask arrays directly from a zarr file.
zarr_image = da.from_zarr(zarr_filename)
zarr_image
|
We can apply image processing to this tiled dataset directly.
Counting nuclei#
For counting the nuclei, we setup a simple image processing workflow that applies Voronoi-Otsu-Labeling to the dataset. Afterwards, we count the segmented objects. As nuclei might be counted twice which touch the tile border, we have to correct the count for every tile. Technically, we could remove the objects which touch one of the vertical or horizontal tile borders. However, there is a simpler way for correcting for this error: We count the number of nuclei after segmentation. Then, we remove all nuclei which touch any image border and count the remaining nuclei again. We can then assume that half of the removed nuclei should be counted. Hence, we add the two counts, before and after edge-removal, and compute the average of these two measurements. Especially on large tiles with many nuclei, the remaining error should be negligible. It is not recommended to apply such an estimating cell counting method when each tile contains only few nuclei.
def count_nuclei(image):
"""
Label objects in a binary image and produce a pixel-count-map image.
"""
print("Processing image of size", image.shape)
# Count nuclei including those which touch the image border
labels = cle.voronoi_otsu_labeling(image, spot_sigma=3.5)
label_intensity_map = cle.mean_intensity_map(image, labels)
high_intensity_labels = cle.exclude_labels_with_map_values_within_range(label_intensity_map, labels, maximum_value_range=20)
nuclei_count = high_intensity_labels.max()
# Count nuclei excluding those which touch the image border
labels_without_borders = cle.exclude_labels_on_edges(high_intensity_labels)
nuclei_count_excluding_borders = labels_without_borders.max()
# Both nuclei-count including and excluding nuclei at image borders
# are no good approximation. We should exclude the nuclei only on
# half of the borders to get a good estimate.
# Alternatively, we just take the average of both counts.
result = np.asarray([[(nuclei_count + nuclei_count_excluding_borders) / 2]])
print(result.shape)
return result
Before we can start the computation, we need to deactivate asynchronous execution of operations in pyclesperanto. See also related issue.
cle.set_wait_for_kernel_finish(True)
This time, we do not use tile overlap, because we are not measuring properties of the nuclei and thus, don’t need a prefect segmentation of them.
tile_map = da.map_blocks(count_nuclei, zarr_image)
tile_map
Processing image of size (0, 0)
Processing image of size (1, 1)
(1, 1)
Processing image of size (0, 0)
|
As the result image is much smaller then the original, we can compute the whole result map.
result = tile_map.compute()
Processing image of size (100, 100)
Processing image of sizeProcessing image of size (100, 100)
Processing image of size (100, 100)
(100, 100)
Processing image of size (100, 100)
Processing image of size (100, 100)
Processing image of sizeProcessing image of size (100, 100)
Processing image of size(100, 100)
(100, 100)
Processing image of size (100, 100)
(1, 1)
(1, 1)
Processing image of size (100, 100)
(1, 1)
Processing image of size (100, 100)
Processing image of size (100, 100)
(1, 1)(1, 1)
Processing image of size (100, 100)
Processing image of size(1, 1)
(100, 100)
Processing image of size (100, 100)
(1, 1)
(1, 1)
Processing image of size (100, 100)
Processing image of size (100, 100)
(1, 1)
Processing image of size (100, 100)
(1, 1)
Processing image of size (100, 100)
(1, 1)
Processing image of size (100, 100)
(1, 1)
(1, 1)
Processing image of sizeProcessing image of size (100, 100)
(100, 100)
(1, 1)(1, 1)
(1, 1)
Processing image of size Processing image of size(100, 100) (1, 1)
(100, 100)
(1, 1)
(1, 1)
(1, 1)
(1, 1)
(1, 1)
(1, 1)
(1, 1)
(1, 1)
result.shape
(5, 5)
Again, as the result map is small, we can just visualize it.
cle.imshow(result, colorbar=True)
With a quick visual check in the original image, we can see that indeed in the bottom left corner of the image, there are more cells than in the top right.
cle.imshow(cle.voronoi_otsu_labeling(image, spot_sigma=3.5), labels=True)