{ "cells": [ { "cell_type": "markdown", "id": "500b07b7-5f43-40c0-ba80-bc6cd759f9f4", "metadata": {}, "source": [ "# Tiled image processing, a quick run-through\n", "\n", "In this notebook we process a big dataset that has been saved in zarr format and count cells in individual tiles using [dask](https://docs.dask.org/en/stable/) and [zarr](https://zarr.readthedocs.io/en/stable/). The underlying principles are explained in the next sections." ] }, { "cell_type": "code", "execution_count": 1, "id": "e6a9300d-1f11-4a3b-94bb-a136ba69f09d", "metadata": {}, "outputs": [], "source": [ "import zarr\n", "import dask.array as da\n", "import numpy as np\n", "from skimage.io import imread\n", "import pyclesperanto_prototype as cle\n", "from pyclesperanto_prototype import imshow\n", "from numcodecs import Blosc" ] }, { "cell_type": "markdown", "id": "8959f8d4-a6d6-4a2d-b4b7-9378d2ceec01", "metadata": {}, "source": [ "For demonstration purposes, we use a dataset provided by Theresa Suckert, OncoRay, University Hospital Carl Gustav Carus, TU Dresden. The dataset is licensed under [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). We use a cropped version here that was resaved as an 8-bit image so that it can be provided with the notebook. You can find the full-size 16-bit image in CZI file format [online](https://zenodo.org/record/4276076#.YX1F-55BxaQ). The biological background is explained in [Suckert et al. 2020](https://www.sciencedirect.com/science/article/abs/pii/S0167814020301043), where we also applied a similar workflow.\n", "\n", "When working with big data, you will likely have an image stored in the right format to begin with. For demonstration purposes, we resave a test image in the zarr format here, which is commonly used for handling big image data." 
] }, { "cell_type": "code", "execution_count": 2, "id": "cc2eeeb8-eb5e-49fc-8569-cdff5e143e5e", "metadata": {}, "outputs": [], "source": [ "# Resave a test image into tiled zarr format\n", "input_filename = '../../data/P1_H_C3H_M004_17-cropped.tif'\n", "zarr_filename = '../../data/P1_H_C3H_M004_17-cropped.zarr'\n", "# Keep only the entry at index 1 along the first axis of the loaded image\n", "image = imread(input_filename)[1]\n", "# Compress each chunk with Blosc (zstd, level 3, bit-shuffle)\n", "compressor = Blosc(cname='zstd', clevel=3, shuffle=Blosc.BITSHUFFLE)\n", "# Split the image into tiles (chunks) of 100 x 100 pixels\n", "zarray = zarr.array(image, chunks=(100, 100), compressor=compressor)\n", "# Write the array to disk\n", "zarr.convenience.save(zarr_filename, zarray)" ] }, { "cell_type": "markdown", "id": "d76246fe-7358-4e0c-8112-1f1fd0af4108", "metadata": {}, "source": [ "## Loading the zarr-backed image\n", "Dask has built-in support for the zarr file format, so we can create dask arrays directly from a zarr file." ] }, { "cell_type": "code", "execution_count": 3, "id": "2132d10e-1ec5-43eb-9c3c-a4d9358919cc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "