Opening large CZI files with AICSImageIO#

When working with large microscopy images (e.g. multiple GB to TB per file), the data may exceed your memory capacity, potentially causing the kernel to crash. In this notebook, we will use the AICSImageIO library integrated with Dask to handle large .czi files efficiently.

⚠️ Note: AICSImageIO is no longer actively maintained. If you run into issues while following this notebook, please try its compatible successor BioIO.

Requirements#

Before getting started, ensure that you have the necessary libraries for handling bioimage formats and .czi files installed. If not, please run these two commands in your command line:

pip install aicsimageio  
pip install aicspylibczi 

Downloading Data#

We will use the mouse brain image Demo LISH 4x8 15pct 647.czi by Nicolas Chiaruttini, licensed under CC BY 4.0, and store it in a local directory.

First, let’s confirm that the folder where we want to store the data exists - and create it if it does not.

from aicsimageio import AICSImage
import os
import urllib.request

def ensure_folder_exists(folder_path):
    if not os.path.exists(folder_path):
        os.makedirs(folder_path)

folder_path = "../../data/"
ensure_folder_exists(folder_path)

Next, we will check if the file has already been saved in the directory. If not, we will download it.

def ensure_file_exists(file_path, url):
    if not os.path.isfile(file_path):
        try:
            print(f"Please wait. Downloading file from {url} to {file_path}...")
            urllib.request.urlretrieve(url, file_path)
            print("Download complete.")
        except Exception as e:
            print(f"Download failed: {e}")
    else:
        print("File already exists.")

filename = "Demo LISH 4x8 15pct 647.czi"
file_path = os.path.join(folder_path, filename)
url = "https://zenodo.org/records/8305531/files/Demo%20LISH%204x8%2015pct%20647.czi?download=1"
ensure_file_exists(file_path, url)
Please wait. Downloading file from https://zenodo.org/records/8305531/files/Demo%20LISH%204x8%2015pct%20647.czi?download=1 to ../../data/Demo LISH 4x8 15pct 647.czi...
Download complete.

⚠️ Note: The file is 4.5 GB in size. Downloading can take a while.

Reading large CZI files#

After downloading the image, we will use AICSImage to create an object that gives us access to the metadata without immediately loading the full image into memory. At this stage, we are using the same command we would for smaller files - just like in this notebook.

image = AICSImage(file_path)

Next, we will make use of Dask (a parallel computing library for big data analytics) to inspect the size and structure of the image. Again, no actual image data is loaded into memory yet.

This so-called lazy loading approach delays the reading of a large file until you actually need it (e.g. for calculation, visualization). It allows you to get an overview of the data first.

As you can see below, Dask also enables chunk processing. This means that the data is split into smaller blocks (i.e. chunks), which can be processed one piece at a time. This independent and parallel processing is useful when your data is larger than your memory.
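To see how chunking behaves independently of the CZI file, here is a minimal self-contained Dask sketch. It builds a lazy zero-filled stand-in array with the same TCZYX shape and chunking that AICSImageIO reports for this file; the shape and chunk values are taken from the output further down, not computed from the file itself.

```python
import dask.array as da

# Lazy stand-in array mirroring image.dask_data (TCZYX); no pixel data
# is allocated in memory at this point.
arr = da.zeros((1, 1, 56, 6254, 4969), dtype="uint16",
               chunks=(1, 1, 56, 1094, 1094))

print(f"{arr.nbytes / 1024**3:.2f} GiB")  # logical size of the full array
print(arr.chunksize)                      # shape of a single chunk
print(arr.npartitions)                    # number of independent chunks

# Slicing stays lazy, too - nothing is read until .compute() is called.
subset = arr[0, 0, ::5, ::5, ::5]
```

Each of the 30 chunks can be loaded and processed on its own, which is what makes it possible to work with arrays larger than your memory.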

image.dask_data
            Array                     Chunk
Bytes       3.24 GiB                  127.84 MiB
Shape       (1, 1, 56, 6254, 4969)    (1, 1, 56, 1094, 1094)
Dask graph  30 chunks in 1795 graph layers
Data type   uint16 numpy.ndarray
image.dims
<Dimensions [T: 1, C: 1, Z: 56, Y: 6254, X: 4969]>

Now we can specify which dimensions of the image we would like to work with and in which order the dimensions should be arranged. Remember:

  • T: time points

  • C: channel

  • Z: depth

  • Y: height

  • X: width

In this example, we will use all dimensions in the order "CZYX". In addition, we will reduce the image resolution by selecting every 5th pixel along the X, Y, and Z axes. This helps reduce memory usage and processing time.

Make sure to use get_image_dask_data instead of get_image_data - only then will Dask automatically handle the lazy loading and chunked processing for us.

image_reduced = image.get_image_dask_data(
    "CZYX",
    T=0,
    X=slice(0, -1, 5),
    Y=slice(0, -1, 5),
    Z=slice(0, -1, 5))

Now, we will call .compute() to load the selected part of the data into memory.

brain_image = image_reduced.compute()

⚠️ Note: The last command will likely take a little longer, as the image data is now actually moved into memory.

In case you experience a kernel crash, try the following options:

  • Further reduce your data (e.g. select every 50th pixel instead of every 5th pixel, select only one z-plane, etc.)

  • Free up memory (see the last section of this notebook).
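As an example of the first option, here is a hedged sketch of a heavier reduction. It again uses a zero-filled Dask stand-in with the shape of image.dask_data so it runs without the file; with the real image you would pass the same z-index and step to get_image_dask_data, as shown in the comment (the z-index 28 is an arbitrary example).

```python
import dask.array as da

# Stand-in for image.dask_data (TCZYX), so this sketch runs without the file
data = da.zeros((1, 1, 56, 6254, 4969), dtype="uint16",
                chunks=(1, 1, 56, 1094, 1094))

# One z-plane and every 50th pixel in Y and X instead of every 5th.
# With the real image, roughly:
#   image.get_image_dask_data("YX", T=0, C=0, Z=28,
#                             Y=slice(0, -1, 50), X=slice(0, -1, 50))
plane = data[0, 0, 28, ::50, ::50]
print(plane.shape)
small_image = plane.compute()  # only a few kilobytes end up in memory
```

Instead of gigabytes, this selection materializes only about 25 KB, which should fit comfortably into memory on any machine.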

Visualizing#

Let’s display our image with stackview. You can also use napari if you prefer a more interactive viewer, as shown here.

import stackview 
stackview.insight(brain_image)
shape  (1, 11, 1251, 994)
dtype  uint16
size   26.1 MB
min    164
max    65534

Freeing up memory#

Before loading new data into memory, it can be helpful to free up space by deleting variables you no longer need with del. You can also use Python's garbage collection module gc to manually trigger a cleanup of unreachable objects. This reduces the risk of memory overload.

⚠️ Note: Be cautious when deleting variables - once deleted, you will need to re-run the earlier code to generate them again.

del brain_image
import gc
gc.collect()
8908