zip for Processing Paired Folders
In this notebook, we will use the Python built-in function zip() to iterate over paired folders of images and label masks. Specifically, we will process images and their corresponding masks from the following directories:
data/BBBC007/images
data/BBBC007/masks
We’ll calculate the average intensity of labeled objects and the number of objects in each pair of image and mask files, and store the results in a pandas DataFrame.
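As a minimal sketch of what zip() does before it is applied to real folders: it takes two (or more) sequences and yields one tuple per position. The file names below are made up for illustration and are not part of the dataset.

```python
# Hypothetical file names, for illustration only
example_images = ["img_1.tif", "img_2.tif", "img_3.tif"]
example_masks = ["mask_1.tif", "mask_2.tif", "mask_3.tif"]

# zip() pairs the elements by position: first with first, second with second, ...
pairs = list(zip(example_images, example_masks))

for image_name, mask_name in pairs:
    print(image_name, "<->", mask_name)
```

Pairing is purely positional, so both sequences need to be in the same order for the pairs to be meaningful.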
import os
import pandas as pd
from skimage import io, measure
import numpy as np
# Define paths
image_folder = '../../data/BBBC007/images'
mask_folder = '../../data/BBBC007/masks'
Before starting, we list the contents of both folders to check that the files are indeed paired.
image_files = sorted(os.listdir(image_folder))
image_files
['A9 p5d (cropped 1).tif',
'A9 p5d (cropped 2).tif',
'A9 p5d (cropped 3).tif',
'A9 p5d (cropped 4).tif']
mask_files = sorted(os.listdir(mask_folder))
mask_files
['A9 p5d (cropped 1).tif',
'A9 p5d (cropped 2).tif',
'A9 p5d (cropped 3).tif',
'A9 p5d (cropped 4).tif']
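Comparing the two listings by eye works for four files, but for larger folders a programmatic check may be more reliable. The following is a sketch under the assumption that paired files share exactly the same file name in both folders, as they do here. The helper find_unpaired and the example names are hypothetical.

```python
def find_unpaired(image_names, mask_names):
    """Return (images without a mask, masks without an image), both sorted."""
    image_set, mask_set = set(image_names), set(mask_names)
    return sorted(image_set - mask_set), sorted(mask_set - image_set)

# Made-up listings for illustration; in this notebook one would pass
# image_files and mask_files instead
example_images = ["cell_1.tif", "cell_2.tif", "cell_3.tif"]
example_masks = ["cell_1.tif", "cell_3.tif", "notes.txt"]

images_without_mask, masks_without_image = find_unpaired(example_images, example_masks)
print("Images without mask:", images_without_mask)
print("Masks without image:", masks_without_image)
```

If both returned lists are empty, every image has a mask with the same name and vice versa.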
df = pd.DataFrame(columns=['Image', 'Average Intensity', 'Number of Objects'])
To demonstrate how zip() allows us to iterate over image and mask files in parallel, we first print the file paths in a short for-loop:
for image_file, mask_file in zip(image_files, mask_files):
    image_path = os.path.join(image_folder, image_file)
    mask_path = os.path.join(mask_folder, mask_file)
    print(image_path, mask_path, "\n\n")
../../data/BBBC007/images\A9 p5d (cropped 1).tif ../../data/BBBC007/masks\A9 p5d (cropped 1).tif
../../data/BBBC007/images\A9 p5d (cropped 2).tif ../../data/BBBC007/masks\A9 p5d (cropped 2).tif
../../data/BBBC007/images\A9 p5d (cropped 3).tif ../../data/BBBC007/masks\A9 p5d (cropped 3).tif
../../data/BBBC007/images\A9 p5d (cropped 4).tif ../../data/BBBC007/masks\A9 p5d (cropped 4).tif
The same loop structure can be used to go through both folders in parallel and analyse each intensity image together with its corresponding label image.
for image_file, mask_file in zip(image_files, mask_files):
    image_path = os.path.join(image_folder, image_file)
    mask_path = os.path.join(mask_folder, mask_file)
    # Read the image and its mask
    image = io.imread(image_path)
    mask = io.imread(mask_path).astype(np.uint32)
    # Measure labeled regions
    labeled_regions = measure.regionprops(mask, intensity_image=image)
    # Calculate average intensity and number of objects
    num_objects = len(labeled_regions)
    avg_intensity = sum(region.mean_intensity for region in labeled_regions) / num_objects
    # Append results for the current pair
    df.loc[len(df)] = {
        'Image': image_file,
        'Average Intensity': avg_intensity,
        'Number of Objects': num_objects
    }
# Display the result
df
| | Image | Average Intensity | Number of Objects |
|---|---|---|---|
| 0 | A9 p5d (cropped 1).tif | 26.269523 | 2 |
| 1 | A9 p5d (cropped 2).tif | 16.698528 | 2 |
| 2 | A9 p5d (cropped 3).tif | 34.847166 | 2 |
| 3 | A9 p5d (cropped 4).tif | 28.707185 | 2 |
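The averaging step in the loop above divides by the number of objects, which would fail with a ZeroDivisionError for a mask that contains no labeled objects. A possible variation, sketched here with made-up numbers, guards against that case and collects one dictionary per file in a list. The helper summarize_regions and the example values are hypothetical and stand in for the per-region mean intensities returned by regionprops.

```python
import math

def summarize_regions(mean_intensities):
    """Return (average of per-object mean intensities, object count).

    An empty list (a mask without labeled objects) yields NaN instead
    of raising ZeroDivisionError.
    """
    num_objects = len(mean_intensities)
    if num_objects == 0:
        return math.nan, 0
    return sum(mean_intensities) / num_objects, num_objects

# Made-up values standing in for
# [region.mean_intensity for region in labeled_regions]
example_measurements = {
    "example_1.tif": [20.0, 30.0],
    "example_2.tif": [],  # mask with no objects
}

# Collect one row per file
rows = []
for file_name, intensities in example_measurements.items():
    avg_intensity, num_objects = summarize_regions(intensities)
    rows.append({
        "Image": file_name,
        "Average Intensity": avg_intensity,
        "Number of Objects": num_objects,
    })
```

A list of dictionaries like rows can be turned into a table in one step after the loop with pd.DataFrame(rows), as an alternative to appending to the DataFrame row by row.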