Measuring features in a time-lapse dataset#
In this notebook, we will develop a workflow on a single timepoint of a time-lapse dataset and then learn how to apply it to the whole time-lapse. This involves the following steps: preprocessing, segmentation and feature extraction.
from skimage.io import imread
import pyclesperanto_prototype as cle
from skimage.measure import regionprops_table
import numpy as np
import pandas as pd
Developing a workflow#
First, we read in the image, scale it and convert it to a numpy array:
# reading in the image
image_stack = imread("../../data/CalibZAPWfixed_000154_max.tif")[::10,:200,:200]
# scaling the image and converting it to a np-array
image_stack = np.asarray(cle.scale(image_stack, factor_z=1, factor_y=6, factor_x=6, auto_size=True))
image_stack.shape
(10, 1200, 1200)
Next, we define a function `segment_single_image` which performs background subtraction using `top_hat_box` and segmentation using `voronoi_otsu_labeling`. As input, it needs:

- an image
- `bg_radius` (= background radius)
- `spot_sigma`
- `outline_sigma`

It returns the labels as a numpy array:
def segment_single_image(image, bg_radius=50, spot_sigma=10, outline_sigma=3):
    # subtract uneven background using a top-hat filter
    background_subtracted = cle.top_hat_box(image, radius_x=bg_radius, radius_y=bg_radius)
    # segment and label objects using Voronoi-Otsu-labeling
    labels = cle.voronoi_otsu_labeling(background_subtracted, spot_sigma=spot_sigma, outline_sigma=outline_sigma)
    # return the label image as a numpy array
    return np.asarray(labels)
We try the function on the 6th timepoint (index 5, as we count from 0):
tp_6 = segment_single_image(image_stack[5])
tp_6
array([[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0],
...,
[ 0, 0, 0, ..., 177, 177, 177],
[ 0, 0, 0, ..., 177, 177, 177],
[ 0, 0, 0, ..., 177, 177, 177]], dtype=uint32)
Visualizing images and image stacks#
We can visualize our numpy-array like this:
cle.imshow(tp_6, labels=True)
Now we can run our function in a loop over the whole time-lapse dataset:
segmented_slices = [segment_single_image(image) for image in image_stack]
segmented_stack = np.asarray(segmented_slices)
Our `segmented_stack` should keep the same shape as the original `image_stack`:
segmented_stack.shape
(10, 1200, 1200)
It does! Now, we can use `stackview` to visualize our whole `image_stack`. The function `curtain` provides a slider to visualize the label image on top of the original image.
import stackview
stackview.curtain(image_stack, segmented_stack, zoom_factor=0.3)
If you move the `Slice` slider, you can see the segmentation result of the individual timepoints. As you can see, objects do not keep the same label number over several frames. We are not doing tracking here, just segmentation timepoint by timepoint.
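To see this in numbers, we could, for example, count the objects per frame. This is a minimal sketch (not part of the original workflow), assuming the labels in each frame are numbered consecutively starting at 1, so that the maximum label value equals the number of objects:

# count objects per frame (assumes consecutive labels starting at 1)
object_counts = [int(label_image.max()) for label_image in segmented_stack]
object_counts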
Measuring features on an image and a label image#
Our goal is to extract features from this time-lapse dataset.
Therefore, we define another function `analyze_mean_intensity_single_timepoint`. The name says it all: it measures the mean intensity of every label at a single timepoint.
def analyze_mean_intensity_single_timepoint(image, labels, frame):
    # measure the mean intensity of every label in the given frame
    df = pd.DataFrame(regionprops_table(labels[frame], intensity_image=image[frame], properties=['mean_intensity']))
    # keep track of which frame the measurements belong to
    df['frame'] = frame
    return df
Now, we try it out on frame 5 (the 6th timepoint) of our time-lapse dataset:
df = analyze_mean_intensity_single_timepoint(image_stack, segmented_stack, 5)
df
| | mean_intensity | frame |
|---|---|---|
| 0 | 193.721710 | 5 |
| 1 | 154.925858 | 5 |
| 2 | 142.510788 | 5 |
| 3 | 102.849998 | 5 |
| 4 | 88.570175 | 5 |
| ... | ... | ... |
| 172 | 154.884216 | 5 |
| 173 | 147.652908 | 5 |
| 174 | 126.156464 | 5 |
| 175 | 89.161293 | 5 |
| 176 | 135.688721 | 5 |

177 rows × 2 columns
As we can see, the function returns a table with the mean intensities of the labels in a particular frame.
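If more measurements per label are needed, the same pattern extends naturally. The following is a sketch of a hypothetical variant (the function name `analyze_features_single_timepoint` is made up here) that additionally requests the `label` and `area` properties offered by scikit-image's `regionprops_table`:

def analyze_features_single_timepoint(image, labels, frame,
                                      properties=('label', 'area', 'mean_intensity')):
    # measure the requested properties of every label in the given frame
    df = pd.DataFrame(regionprops_table(labels[frame],
                                        intensity_image=image[frame],
                                        properties=list(properties)))
    df['frame'] = frame
    return df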
Measuring features on the whole time-lapse dataset#
Next, we want to apply our function to the whole stack. We use a loop over all frames and concatenate the resulting tables:
num_frames = image_stack.shape[0]
all_frames_df = pd.concat([analyze_mean_intensity_single_timepoint(image_stack, segmented_stack, t) for t in range(num_frames)])
all_frames_df
| | mean_intensity | frame |
|---|---|---|
| 0 | 223.347107 | 0 |
| 1 | 93.042557 | 0 |
| 2 | 112.046814 | 0 |
| 3 | 192.766434 | 0 |
| 4 | 162.704330 | 0 |
| ... | ... | ... |
| 335 | 135.748642 | 9 |
| 336 | 116.306427 | 9 |
| 337 | 121.075317 | 9 |
| 338 | 98.173592 | 9 |
| 339 | 82.390472 | 9 |

1702 rows × 2 columns
Now, we want to measure how long it takes to execute this code snippet. For this, we use `timeit`.
import timeit
%%timeit
pd.concat([analyze_mean_intensity_single_timepoint(image_stack, segmented_stack, t) for t in range(num_frames)])
115 ms ± 2.77 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
It tells us the mean execution time per run of the cell, i.e. for analyzing the whole time-lapse dataset.
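If you prefer a single measurement instead of timeit's repeated runs, the standard library's time module can be used as well. This is just a sketch, not part of the original workflow:

import time

# time one full pass over the time-lapse dataset
start = time.perf_counter()
pd.concat([analyze_mean_intensity_single_timepoint(image_stack, segmented_stack, t) for t in range(num_frames)])
duration = time.perf_counter() - start
print(f"{duration:.3f} s in total, {duration / num_frames:.3f} s per frame")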
Short side note: What is a good way to write a function?#
When we define functions, each function should do only one thing. Such a function can then be called by another function that adds the next piece of functionality. We call this modularization. Have a look at the two functions `get_intensity_for_timepoint` and `get_intensity`:
def get_intensity_for_timepoint(intensity_image, label_layer):
    # measure the mean intensity of every label at a single timepoint
    stats = regionprops_table(label_layer, intensity_image=intensity_image, properties=['mean_intensity'])
    return stats['mean_intensity']

def get_intensity(intensity_image_stack, labels_layer_stack):
    # apply the single-timepoint measurement to every frame of the stack
    result = []
    for intensity_image, label_layer in zip(intensity_image_stack, labels_layer_stack):
        result.append(get_intensity_for_timepoint(intensity_image, label_layer))
    return result
The first function measures the mean intensity at a single timepoint and the second function applies it to a whole time-lapse dataset. Thus, both functions have one main task, and the first function is called inside the second one.
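A possible way of using these functions could look like the following sketch; `get_intensity` returns a list with one array of mean intensities per timepoint, which we turn into a table similar to the one above:

# apply the modular functions to the whole time-lapse dataset
intensities_per_frame = get_intensity(image_stack, segmented_stack)

# build a long-format table: one row per label and frame
df_modular = pd.concat([
    pd.DataFrame({'mean_intensity': values, 'frame': t})
    for t, values in enumerate(intensities_per_frame)
])
df_modular.head()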
Libraries to extract features of all timepoints#
Now, we want to measure different intensity-based parameters on all timepoints. Therefore, we use `regionprops_table_all_frames` from `napari_skimage_regionprops`.
from napari_skimage_regionprops import regionprops_table_all_frames
stats = regionprops_table_all_frames(image_stack[:, np.newaxis, :, :],
                                     segmented_stack[:, np.newaxis, :, :],
                                     size=False,
                                     intensity=True)
pd.DataFrame(stats)
analyzing frame 0
analyzing frame 1
analyzing frame 2
analyzing frame 3
analyzing frame 4
analyzing frame 5
analyzing frame 6
analyzing frame 7
analyzing frame 8
analyzing frame 9
| | label | max_intensity | mean_intensity | min_intensity | standard_deviation_intensity | frame |
|---|---|---|---|---|---|---|
| 0 | 1 | 255.0 | 223.347107 | 91.0 | 40.516113 | 0 |
| 1 | 2 | 97.0 | 93.042557 | 70.0 | 7.657447 | 0 |
| 2 | 3 | 154.0 | 112.046814 | 68.0 | 16.508696 | 0 |
| 3 | 4 | 245.0 | 192.766434 | 85.0 | 39.013126 | 0 |
| 4 | 5 | 255.0 | 162.704330 | 67.0 | 42.194302 | 0 |
| ... | ... | ... | ... | ... | ... | ... |
| 1697 | 336 | 189.0 | 135.748642 | 61.0 | 30.539036 | 9 |
| 1698 | 337 | 173.0 | 116.306427 | 38.0 | 35.460144 | 9 |
| 1699 | 338 | 183.0 | 121.075317 | 39.0 | 38.362331 | 9 |
| 1700 | 339 | 133.0 | 98.173592 | 55.0 | 22.237257 | 9 |
| 1701 | 340 | 110.0 | 82.390472 | 48.0 | 16.561571 | 9 |

1702 rows × 6 columns
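Because the resulting table contains a frame column, it can be summarized per timepoint, for example with pandas' groupby. A small sketch:

df_stats = pd.DataFrame(stats)

# average mean intensity of all objects per timepoint
df_stats.groupby('frame')['mean_intensity'].mean()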
You could also use `label_statistics_in_all_frames` from `napari-simpleitk-image-processing`, which provides some additional intensity-based parameters:
from napari_simpleitk_image_processing._simpleitk_image_processing import label_statistics_in_all_frames
stats = label_statistics_in_all_frames(image_stack[:, np.newaxis, :, :],
                                       segmented_stack[:, np.newaxis, :, :],
                                       size=False,
                                       intensity=True)
pd.DataFrame(stats)
analyzing frame 0
analyzing frame 1
analyzing frame 2
analyzing frame 3
analyzing frame 4
analyzing frame 5
analyzing frame 6
analyzing frame 7
analyzing frame 8
analyzing frame 9
| | label | maximum | mean | median | minimum | sigma | sum | variance | frame |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 255.0 | 223.347113 | 248.525391 | 91.0 | 40.523332 | 626712.0 | 1642.140432 | 0 |
| 1 | 2 | 97.0 | 93.042553 | 97.119141 | 70.0 | 7.740232 | 4373.0 | 59.911193 | 0 |
| 2 | 3 | 154.0 | 112.046817 | 110.068359 | 68.0 | 16.513038 | 213001.0 | 272.680439 | 0 |
| 3 | 4 | 245.0 | 192.766438 | 201.708984 | 85.0 | 39.019508 | 589287.0 | 1522.522001 | 0 |
| 4 | 5 | 255.0 | 162.704329 | 162.861328 | 67.0 | 42.199321 | 684009.0 | 1780.782651 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1697 | 336 | 189.0 | 135.748642 | 140.314453 | 61.0 | 30.544961 | 349960.0 | 932.994614 | 9 |
| 1698 | 337 | 173.0 | 116.306430 | 115.048828 | 38.0 | 35.472278 | 170040.0 | 1258.282491 | 9 |
| 1699 | 338 | 183.0 | 121.075317 | 123.169922 | 39.0 | 38.376643 | 162362.0 | 1472.766711 | 9 |
| 1700 | 339 | 133.0 | 98.173594 | 97.904297 | 55.0 | 22.264490 | 40153.0 | 495.707536 | 9 |
| 1701 | 340 | 110.0 | 82.390476 | 84.369141 | 48.0 | 16.601144 | 17302.0 | 275.597995 | 9 |

1702 rows × 9 columns
If you are interested in the differences between the libraries, have a look at the feature_extraction notebook.