Tiling images - the naive approach

Contents

Tiling images - the naive approach#

In tiled image processing, the first step is cutting the image into tiles. While this could be done with numpy, we will use dask because it comes with multiple very useful features in this context.

import dask
import dask.array as da
from skimage.filters import gaussian
from skimage.data import cells3d
from pyclesperanto_prototype import imshow

In the first example we will use an image showing nuclei from fluorescence microscopy and just denoise the image using a Gaussian blur. We will do this tile-by-tile. For that, we first define the procedure that should be applied to all tiles. We build in a print statement into this function to when it’s executed and how large the image is that is being processed.

def procedure(image):
    print("proceduring", image.shape)
    return gaussian(image, sigma=5)
image = cells3d()[30,1]
imshow(image)
../_images/e6d2393f7c605c769fb82b772552ade1566917e2a294f57aa9e4f747b0c78793.png

After loading the image, it can be tiled like that. In dask, tiles are also called chunks.

tiles = da.from_array(image, chunks=(128, 128))
tiles
Array Chunk
Bytes 128.00 kiB 32.00 kiB
Shape (256, 256) (128, 128)
Count 4 Tasks 4 Chunks
Type uint16 numpy.ndarray
256 256

Next, we tell dask what to do with our tiles. We want to map the function procedure on all individual tiles. Note, this does not process the whole image yet.

tile_map = da.map_blocks(procedure, tiles)
proceduring (0, 0)
proceduring (1, 1)

As we can read, the function was executed twice with very small images (0x0 and 1x1 pixels). Dask does that in principle to explore if the function works. Next, we will actually execute our procedure on the tiles of the image.

result = tile_map.compute() # Warning: This loads all image data into memory
proceduringproceduring (128, 128)
 (128, 128)
proceduring (128, 128)
proceduring (128, 128)

The printed output looks a bit chaotic because dask executed the procedure on multiple tiles in parallel. If we inspect the result, we will see it is again an image.

result.shape
(256, 256)
type(result)
numpy.ndarray

Note: The imshow function may not work on big datasets. We are using it here for demonstration purposes.

imshow(result)
../_images/7b0a8ba73291ddde763b05183cfdf2a3d3557bede55f4d5dd6dd35a5022f1c2e.png

Border effects#

When processing images tile-by-tile we always must assume that along the border artifacts appear that result from cutting the image into tiles. As our example image fits in memory, we can apply procedure to it and compare it to the result from the tiled image processing

untiled_result = procedure(image)
imshow(untiled_result)
proceduring (256, 256)
../_images/b31924a9f3cf423d048676a3c5ccca6648eb37f59f08ecc030e068593386819d.png

The differences are not obvious, but we can visualize them.

difference = result - untiled_result
imshow(difference)
../_images/e4126918c50e4b551767159a05ace9b17cb60db4891f34ce4000369b3cfe9c54.png

When applying a Gaussian blur with a small sigma, these effects may be negligible. In case the effects cause severe issues in our image processing workflow, we may want to reduce or even prevent those artifacts.