{ "cells": [ { "cell_type": "markdown", "id": "500b07b7-5f43-40c0-ba80-bc6cd759f9f4", "metadata": {}, "source": [ "# Map area of objects in tiles\n", "\n", "In this notebook, we will segment nuclei in tiles and measure their area. We will then save the resulting area map, again as tiles, in a zarr file. This strategy can be used to process data that does not fit into computer memory as a whole." ] }, { "cell_type": "code", "execution_count": 1, "id": "e6a9300d-1f11-4a3b-94bb-a136ba69f09d", "metadata": {}, "outputs": [], "source": [ "import zarr\n", "import dask.array as da\n", "import numpy as np\n", "from skimage.io import imread\n", "import pyclesperanto_prototype as cle\n", "from pyclesperanto_prototype import imshow\n", "from numcodecs import Blosc" ] }, { "cell_type": "markdown", "id": "8959f8d4-a6d6-4a2d-b4b7-9378d2ceec01", "metadata": {}, "source": [ "For demonstration purposes, we use a dataset that is provided by Theresa Suckert, OncoRay, University Hospital Carl Gustav Carus, TU Dresden. The dataset is licensed under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). We are using a cropped version here that was resaved as an 8-bit image so that it can be provided with the notebook. You can find the full-size 16-bit image in CZI file format [online](https://zenodo.org/record/4276076#.YX1F-55BxaQ)." 
] }, { "cell_type": "code", "execution_count": 2, "id": "cc2eeeb8-eb5e-49fc-8569-cdff5e143e5e", "metadata": {}, "outputs": [], "source": [ "image = imread('../../data/P1_H_C3H_M004_17-cropped.tif')[1]\n", "\n", "# For testing purposes, we crop the image even more.\n", "# Comment out the following line to process the whole 5000x2000 pixel image.\n", "image = image[1000:1500, 1000:1500]\n", "\n", "# Set up the compressor used for the zarr array (zstd, level 3, bit-shuffle)\n", "compressor = Blosc(cname='zstd', clevel=3, shuffle=Blosc.BITSHUFFLE)\n", "\n", "# Convert the numpy image into a chunked, compressed zarr array\n", "chunk_size = (100, 100)\n", "zarray = zarr.array(image, chunks=chunk_size, compressor=compressor)\n", "\n", "# Save the zarr array to disk\n", "zarr_filename = '../../data/P1_H_C3H_M004_17-cropped.zarr'\n", "zarr.convenience.save(zarr_filename, zarray)" ] }, { "cell_type": "markdown", "id": "d76246fe-7358-4e0c-8112-1f1fd0af4108", "metadata": {}, "source": [ "## Object area maps in tiles\n", "Dask has built-in support for the zarr file format. We can create dask arrays directly from a zarr file." ] }, { "cell_type": "code", "execution_count": 3, "id": "2132d10e-1ec5-43eb-9c3c-a4d9358919cc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", "
<table>\n", " <thead>\n", " <tr><td></td><th>Array</th><th>Chunk</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>Bytes</th><td>244.14 kiB</td><td>9.77 kiB</td></tr>\n", " <tr><th>Shape</th><td>(500, 500)</td><td>(100, 100)</td></tr>\n", " <tr><th>Count</th><td>26 Tasks</td><td>25 Chunks</td></tr>\n", " <tr><th>Type</th><td>uint8</td><td>numpy.ndarray</td></tr>\n", " </tbody>\n", "</table>\n", "
" ], "text/plain": [ "dask.array<from-zarr, shape=(500, 500), dtype=uint8, chunksize=(100, 100), chunktype=numpy.ndarray>" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zarr_image = da.from_zarr(zarr_filename)\n", "zarr_image" ] }, { "cell_type": "markdown", "id": "c2721aa7-947e-4855-9325-c3e2b4746226", "metadata": {}, "source": [ "We can apply image processing to this tiled dataset directly." ] }, { "cell_type": "code", "execution_count": 4, "id": "cba0b2e9-c7ac-43dc-b5b3-b0aadb43b425", "metadata": {}, "outputs": [], "source": [ "def area_map(image):\n", "    \"\"\"\n", "    Label objects in an intensity image and produce a pixel-count-map image.\n", "    \"\"\"\n", "    print(\"Processing image of size\", image.shape)\n", "    \n", "    # segment the nuclei\n", "    labels = cle.voronoi_otsu_labeling(image, spot_sigma=3.5)\n", "    # replace each label with the pixel count (area) of its object\n", "    result = cle.pixel_count_map(labels)\n", "    \n", "    print(result.shape)\n", "    \n", "    return np.asarray(result)" ] }, { "cell_type": "markdown", "id": "40c716df-6dbf-4b94-a1e6-a090d142a395", "metadata": {}, "source": [ "## Testing tiled image processing\n", "We should first test our area mapping algorithm on a single tile. In a real scenario, the image processing workflow is typically developed on individual tiles, e.g. in a notebook like this one. As soon as we are sure that the algorithm works, we can apply it to all tiles." ] }, { "cell_type": "code", "execution_count": 5, "id": "697eb25f-e546-48e2-a744-a63e8db189bb", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Processing image of size (100, 100)\n", "(100, 100)\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "test_image = image[100:200, 100:200]\n", "\n", "imshow(test_image)\n", "\n", "test_result = area_map(test_image)\n", "\n", "imshow(test_result, colorbar=True)" ] }, { "cell_type": "markdown", "id": "06b8f44e-d521-4d74-9f57-465c447b6d20", "metadata": {}, "source": [ "## Applying the tiled image processing to a zarr-backed dataset\n", "\n", "Applying the function to our zarr dataset will also result in a dask array. The computation is lazy: at this point, dask only calls the function on tiny dummy arrays to infer the output type, which explains the sizes `(0, 0)` and `(1, 1)` printed below. The tiles are extended by `overlap_width` pixels on each side so that objects near tile borders are processed together with their surroundings; the overlap is trimmed from the result afterwards." ] }, { "cell_type": "code", "execution_count": 6, "id": "84842017-8acd-466b-b344-5c5b0dc2082c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Processing image of size (0, 0)\n", "Processing image of size (1, 1)\n", "(1, 1)\n", "Processing image of size (0, 0)\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", "
<table>\n", " <thead>\n", " <tr><td></td><th>Array</th><th>Chunk</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>Bytes</th><td>0.95 MiB</td><td>39.06 kiB</td></tr>\n", " <tr><th>Shape</th><td>(500, 500)</td><td>(100, 100)</td></tr>\n", " <tr><th>Count</th><td>632 Tasks</td><td>25 Chunks</td></tr>\n", " <tr><th>Type</th><td>float32</td><td>numpy.ndarray</td></tr>\n", " </tbody>\n", "</table>\n", "
" ], "text/plain": [ "dask.array<_trim, shape=(500, 500), dtype=float32, chunksize=(100, 100), chunktype=numpy.ndarray>" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "overlap_width = 30\n", "\n", "tile_map = da.map_overlap(area_map, zarr_image, depth=overlap_width, boundary=0)\n", "\n", "tile_map" ] }, { "cell_type": "markdown", "id": "2743263e-87d7-4f1e-98e4-1d91001a26db", "metadata": {}, "source": [ "Before we can start the computation, we need to deactivate asynchronous execution of operations in pyclesperanto. [See also related issue](https://github.com/clEsperanto/pyclesperanto_prototype/issues/163)." ] }, { "cell_type": "code", "execution_count": 7, "id": "62da18c5-4023-45ea-8c5f-9ec47588e3ef", "metadata": {}, "outputs": [], "source": [ "cle.set_wait_for_kernel_finish(True)" ] }, { "cell_type": "markdown", "id": "95b44846-7b72-4ed2-96cc-11c03e16c800", "metadata": {}, "source": [ "When we invoke saving the results to disk, the processing will happen on individual tiles." 
] }, { "cell_type": "code", "execution_count": 8, "id": "4166a08a-b7be-4bf5-b861-c166779f796f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size Processing image of size (160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "Processing image of size Processing image of size (160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size(160, 160) \n", "(160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)Processing image of sizeProcessing image of size(160, 160)\n", " (160, 160)\n", " \n", "(160, 160)\n", "Processing image of size (160, 160)\n", "Processing image of size (160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n", "(160, 160)\n" ] } ], "source": [ "processed_zarr_filename = '../../data/P1_H_C3H_M004_17-processed.zarr'\n", "\n", "tile_map.to_zarr(processed_zarr_filename, overwrite=True)" ] }, { "cell_type": "markdown", "id": "0d2ef602-739d-4496-8279-29ad87587b54", "metadata": {}, "source": [ "## Loading zarr\n", "Just for demonstration purposes, we will load the zarr backed tiled image and visualize it. 
When working with big data, this step might not be possible because the entire image has to fit into memory." ] }, { "cell_type": "code", "execution_count": 9, "id": "a3cc5730-0ce5-4b62-95cc-e75bdd4f334e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", "
<table>\n", " <thead>\n", " <tr><td></td><th>Array</th><th>Chunk</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>Bytes</th><td>0.95 MiB</td><td>39.06 kiB</td></tr>\n", " <tr><th>Shape</th><td>(500, 500)</td><td>(100, 100)</td></tr>\n", " <tr><th>Count</th><td>26 Tasks</td><td>25 Chunks</td></tr>\n", " <tr><th>Type</th><td>float32</td><td>numpy.ndarray</td></tr>\n", " </tbody>\n", "</table>\n", "
" ], "text/plain": [ "dask.array<from-zarr, shape=(500, 500), dtype=float32, chunksize=(100, 100), chunktype=numpy.ndarray>" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zarr_result = da.from_zarr(processed_zarr_filename)\n", "zarr_result" ] }, { "cell_type": "code", "execution_count": 10, "id": "d46694cc-4f11-4960-a1cc-d0248916e567", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "result = zarr_result.compute()\n", "\n", "cle.imshow(result)" ] }, { "cell_type": "code", "execution_count": null, "id": "b567b6ad-7307-42d2-924d-54caf1ec4396", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 5 }