zip for Processing Paired Folders

In this notebook, we will use the Python built-in function zip to iterate over paired folders of images and label masks. Specifically, we will process images and their corresponding masks from the following directories:

  • data/BBBC007/images

  • data/BBBC007/masks

We’ll calculate the average intensity of labeled objects and the number of objects in each pair of image and mask files, and store the results in a pandas DataFrame.
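
As a quick reminder of what zip does, here is a minimal sketch with made-up lists (the file names are hypothetical and not part of the dataset). zip pairs the elements at matching positions and stops as soon as the shorter input is exhausted.

```python
# Hypothetical file names, for illustration only
images = ["a.tif", "b.tif", "c.tif"]
masks = ["a_mask.tif", "b_mask.tif"]

# zip yields tuples of the elements at matching positions
pairs = list(zip(images, masks))
print(pairs)

# Note: "c.tif" has no partner and is silently dropped
```

Because the unmatched element is dropped without any warning, it is worth checking that both folders really contain matching files before relying on zip.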

import os
import pandas as pd
from skimage import io, measure
import numpy as np
# Define paths
image_folder = '../../data/BBBC007/images'
mask_folder = '../../data/BBBC007/masks'

Before starting, we look at the contents of both folders to check that the files are indeed paired.

image_files = sorted(os.listdir(image_folder))
image_files
['A9 p5d (cropped 1).tif',
 'A9 p5d (cropped 2).tif',
 'A9 p5d (cropped 3).tif',
 'A9 p5d (cropped 4).tif']
mask_files = sorted(os.listdir(mask_folder))
mask_files
['A9 p5d (cropped 1).tif',
 'A9 p5d (cropped 2).tif',
 'A9 p5d (cropped 3).tif',
 'A9 p5d (cropped 4).tif']
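
Comparing the two listings by eye works for four files. For larger folders, a programmatic check may be safer. The following is a minimal sketch, assuming that paired files share identical names (as they do here). `check_pairing` is a hypothetical helper written for this illustration, not a library function, and the example listings are made up.

```python
def check_pairing(image_files, mask_files):
    """Return the names that appear in only one of the two listings."""
    images, masks = set(image_files), set(mask_files)
    images_without_mask = sorted(images - masks)
    masks_without_image = sorted(masks - images)
    return images_without_mask, masks_without_image

# Hypothetical listings; the second one lacks a mask for "img 2.tif"
example_images = ["img 1.tif", "img 2.tif", "img 3.tif"]
example_masks = ["img 1.tif", "img 3.tif"]

missing_masks, missing_images = check_pairing(example_images, example_masks)
print("Images without mask:", missing_masks)
print("Masks without image:", missing_images)
```

On Python 3.10 and later, `zip(image_files, mask_files, strict=True)` additionally raises a ValueError if the two lists differ in length, although it does not compare the names themselves.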
df = pd.DataFrame(columns=['Image', 'Average Intensity', 'Number of Objects'])

To demonstrate how zip() lets us iterate over the image and mask files in parallel, we print the paired file paths in a short for-loop:

for image_file, mask_file in zip(image_files, mask_files):
    image_path = os.path.join(image_folder, image_file)
    mask_path = os.path.join(mask_folder, mask_file)
    
    print(image_path, mask_path, "\n\n")
../../data/BBBC007/images\A9 p5d (cropped 1).tif ../../data/BBBC007/masks\A9 p5d (cropped 1).tif 


../../data/BBBC007/images\A9 p5d (cropped 2).tif ../../data/BBBC007/masks\A9 p5d (cropped 2).tif 


../../data/BBBC007/images\A9 p5d (cropped 3).tif ../../data/BBBC007/masks\A9 p5d (cropped 3).tif 


../../data/BBBC007/images\A9 p5d (cropped 4).tif ../../data/BBBC007/masks\A9 p5d (cropped 4).tif 

The same code can be used to go through both folders in parallel and analyse intensity images paired with given label images.

for image_file, mask_file in zip(image_files, mask_files):
    image_path = os.path.join(image_folder, image_file)
    mask_path = os.path.join(mask_folder, mask_file)
    
    # Read the image and its mask
    image = io.imread(image_path)
    mask = io.imread(mask_path).astype(np.uint32)

    # Measure labeled regions
    labeled_regions = measure.regionprops(mask, intensity_image=image)

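    # Calculate number of objects and the mean of the per-object mean intensities
    # (NaN if the mask contains no objects, to avoid dividing by zero)
    num_objects = len(labeled_regions)
    avg_intensity = (
        sum(region.mean_intensity for region in labeled_regions) / num_objects
        if num_objects > 0 else float('nan')
    )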

    # Append results for the current pair
    df.loc[len(df)] = {
        'Image': image_file,
        'Average Intensity': avg_intensity,
        'Number of Objects': num_objects
    }

# Display the result
df
                    Image  Average Intensity  Number of Objects
0  A9 p5d (cropped 1).tif          26.269523                  2
1  A9 p5d (cropped 2).tif          16.698528                  2
2  A9 p5d (cropped 3).tif          34.847166                  2
3  A9 p5d (cropped 4).tif          28.707185                  2
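
To make explicit what the "Average Intensity" column holds, here is a plain-Python sketch of the same calculation on a tiny, made-up image and label mask. It is an illustration only and not how scikit-image implements regionprops. The value is the mean of the per-object mean intensities, so every object counts equally regardless of its size.

```python
# Tiny hypothetical intensity image and label mask (0 = background)
image = [
    [10, 10, 0, 0],
    [10, 10, 0, 40],
    [0, 0, 0, 20],
]
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]

# Collect the pixel intensities belonging to each label, skipping background
pixels_per_label = {}
for image_row, mask_row in zip(image, mask):
    for intensity, label in zip(image_row, mask_row):
        if label != 0:
            pixels_per_label.setdefault(label, []).append(intensity)

# Mean intensity of each object, then the mean of those means
object_means = {
    label: sum(values) / len(values)
    for label, values in pixels_per_label.items()
}
num_objects = len(object_means)
avg_intensity = sum(object_means.values()) / num_objects

print(num_objects, object_means, avg_intensity)
```

In this example the two objects have mean intensities of 10.0 and 30.0, giving an average of 20.0, whereas averaging over all six labeled pixels directly would give roughly 16.7. Which of the two is appropriate depends on the question being asked.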