Statistics using SimpleITK#
We can use SimpleITK for extracting features from label images. For convenience reasons we use the napari-simpleitk-image-processing library.
import numpy as np
import pandas as pd
from skimage.io import imread
from pyclesperanto_prototype import imshow
from skimage import measure
import pyclesperanto_prototype as cle
from skimage import filters
from napari_simpleitk_image_processing import label_statistics
# load image
image = imread("../../data/blobs.tif")
# denoising
blurred_image = filters.gaussian(image, sigma=1)
# binarization
threshold = filters.threshold_otsu(blurred_image)
thresholded_image = blurred_image >= threshold
# labeling
label_image = measure.label(thresholded_image)
# visualization
imshow(label_image, labels=True)
Measurements/ region properties#
We are now using the very handy function label_statistics
which provides a table of features. Let us check first what we need to provide for this function:
label_statistics?
Signature:
label_statistics(
intensity_image: 'napari.types.ImageData',
label_image: 'napari.types.LabelsData',
size: bool = True,
intensity: bool = True,
perimeter: bool = False,
shape: bool = False,
position: bool = False,
moments: bool = False,
napari_viewer: 'napari.Viewer' = None,
) -> 'pandas.DataFrame'
Docstring:
Measure intensity/shape/... statistics per label
Parameters
----------
intensity_image: ndarray, optional
Can be None
label_image: ndarray
Must be subsequently labeled
size: bool, optional
intensity: bool, optional
perimeter: bool, optional
shape: bool, optional
position: bool, optional
moments: bool, optional
napari_viewer: napari.Viewer, optional
Returns
-------
pandas DataFrame, in case napari_viewr is None, otherwise the DataFrame will be added to
the passed label_image's layer.features
See Also
--------
..[0] https://simpleitk.org/doxygen/latest/html/classitk_1_1simple_1_1LabelShapeStatisticsImageFilter
..[1] http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/Python_html/35_Segmentation_Shape_Analysis.html
File: c:\users\maral\mambaforge\envs\feature_blogpost\lib\site-packages\napari_simpleitk_image_processing\_simpleitk_image_processing.py
Type: function
Feature categories which are set to True
are measured by default. In this case, the categories are size
and intensity
. But the rest might be also interesting to investigate. So we need to set them to True
as well:
df = pd.DataFrame(label_statistics(image, label_image,
shape=True,
perimeter=True,
position=True,
moments=True))
df
label | maximum | mean | median | minimum | sigma | sum | variance | bbox_0 | bbox_1 | ... | number_of_pixels_on_border | perimeter | perimeter_on_border | perimeter_on_border_ratio | principal_axes0 | principal_axes1 | principal_axes2 | principal_axes3 | principal_moments0 | principal_moments1 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 232.0 | 191.440559 | 200.0 | 128.0 | 29.827923 | 82128.0 | 889.704987 | 10 | 0 | ... | 16 | 87.070368 | 16.0 | 0.183759 | 0.905569 | 0.424199 | -0.424199 | 0.905569 | 17.336255 | 75.599678 |
1 | 2 | 224.0 | 179.846995 | 184.0 | 128.0 | 21.328889 | 32912.0 | 454.921516 | 53 | 0 | ... | 21 | 53.456120 | 21.0 | 0.392846 | -0.042759 | -0.999085 | 0.999085 | -0.042759 | 8.637199 | 27.432794 |
2 | 3 | 248.0 | 205.604863 | 208.0 | 120.0 | 29.414615 | 135288.0 | 865.219581 | 95 | 0 | ... | 23 | 93.409370 | 23.0 | 0.246228 | 0.989650 | 0.143505 | -0.143505 | 0.989650 | 49.994764 | 56.996778 |
3 | 4 | 248.0 | 217.515012 | 232.0 | 120.0 | 35.893817 | 94184.0 | 1288.366094 | 144 | 0 | ... | 20 | 76.114262 | 20.0 | 0.262763 | 0.902854 | 0.429947 | -0.429947 | 0.902854 | 33.290649 | 37.542552 |
4 | 5 | 248.0 | 213.033898 | 224.0 | 128.0 | 28.771575 | 100552.0 | 827.803519 | 237 | 0 | ... | 39 | 82.127941 | 40.0 | 0.487045 | 0.999090 | 0.042642 | -0.042642 | 0.999090 | 24.209327 | 60.391416 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
57 | 58 | 224.0 | 184.525822 | 192.0 | 120.0 | 28.322029 | 39304.0 | 802.137302 | 39 | 232 | ... | 0 | 52.250114 | 0.0 | 0.000000 | 0.976281 | 0.216509 | -0.216509 | 0.976281 | 13.084485 | 21.981750 |
58 | 59 | 248.0 | 184.810127 | 184.0 | 128.0 | 33.955505 | 14600.0 | 1152.976306 | 170 | 248 | ... | 18 | 39.953250 | 18.0 | 0.450527 | -0.012197 | -0.999926 | 0.999926 | -0.012197 | 2.075392 | 20.902016 |
59 | 60 | 216.0 | 182.727273 | 184.0 | 128.0 | 24.557101 | 16080.0 | 603.051202 | 117 | 249 | ... | 22 | 46.196967 | 22.0 | 0.476222 | -0.014920 | -0.999889 | 0.999889 | -0.014920 | 1.815666 | 29.359308 |
60 | 61 | 248.0 | 189.538462 | 192.0 | 128.0 | 38.236858 | 9856.0 | 1462.057315 | 227 | 249 | ... | 15 | 32.924135 | 15.0 | 0.455593 | -0.013675 | -0.999906 | 0.999906 | -0.013675 | 1.592570 | 12.843450 |
61 | 62 | 224.0 | 173.833333 | 176.0 | 128.0 | 28.283770 | 8344.0 | 799.971631 | 66 | 250 | ... | 17 | 35.375614 | 17.0 | 0.480557 | 0.030675 | -0.999529 | 0.999529 | 0.030675 | 0.917610 | 17.904873 |
62 rows × 33 columns
These are all columns that are available:
print(df.keys())
Index(['label', 'maximum', 'mean', 'median', 'minimum', 'sigma', 'sum',
'variance', 'bbox_0', 'bbox_1', 'bbox_2', 'bbox_3', 'centroid_0',
'centroid_1', 'elongation', 'feret_diameter', 'flatness', 'roundness',
'equivalent_ellipsoid_diameter_0', 'equivalent_ellipsoid_diameter_1',
'equivalent_spherical_perimeter', 'equivalent_spherical_radius',
'number_of_pixels', 'number_of_pixels_on_border', 'perimeter',
'perimeter_on_border', 'perimeter_on_border_ratio', 'principal_axes0',
'principal_axes1', 'principal_axes2', 'principal_axes3',
'principal_moments0', 'principal_moments1'],
dtype='object')
df.describe()
label | maximum | mean | median | minimum | sigma | sum | variance | bbox_0 | bbox_1 | ... | number_of_pixels_on_border | perimeter | perimeter_on_border | perimeter_on_border_ratio | principal_axes0 | principal_axes1 | principal_axes2 | principal_axes3 | principal_moments0 | principal_moments1 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | ... | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 | 62.000000 |
mean | 31.500000 | 233.548387 | 190.429888 | 196.258065 | 125.161290 | 28.767558 | 69534.451613 | 864.079980 | 121.532258 | 112.629032 | ... | 5.564516 | 67.071263 | 5.580645 | 0.102864 | 0.703539 | -0.114112 | 0.114112 | 0.703539 | 20.742486 | 43.535735 |
std | 18.041619 | 19.371838 | 15.382559 | 19.527144 | 4.602898 | 6.091478 | 42911.182492 | 311.860596 | 80.574925 | 76.921129 | ... | 9.667625 | 23.507575 | 9.724986 | 0.173858 | 0.459158 | 0.537821 | 0.537821 | 0.459158 | 13.092549 | 32.666912 |
min | 1.000000 | 152.000000 | 146.285714 | 144.000000 | 112.000000 | 6.047432 | 1024.000000 | 36.571429 | 0.000000 | 0.000000 | ... | 0.000000 | 9.155272 | 0.000000 | 0.000000 | -0.651619 | -0.999926 | -0.660963 | -0.651619 | 0.489796 | 0.571429 |
25% | 16.250000 | 232.000000 | 182.969505 | 186.000000 | 120.000000 | 26.561523 | 36010.000000 | 705.524752 | 53.000000 | 43.250000 | ... | 0.000000 | 52.551616 | 0.000000 | 0.000000 | 0.752633 | -0.624188 | -0.318631 | 0.752633 | 11.068721 | 22.027192 |
50% | 31.500000 | 240.000000 | 190.749492 | 200.000000 | 128.000000 | 29.089645 | 71148.000000 | 846.217207 | 121.000000 | 110.500000 | ... | 0.000000 | 68.204464 | 0.000000 | 0.000000 | 0.937869 | 0.040942 | -0.040942 | 0.937869 | 18.744728 | 35.367902 |
75% | 46.750000 | 248.000000 | 199.725305 | 208.000000 | 128.000000 | 32.583571 | 99962.000000 | 1061.691674 | 197.250000 | 172.250000 | ... | 12.500000 | 84.307520 | 12.500000 | 0.184963 | 0.994807 | 0.318631 | 0.624188 | 0.994807 | 29.519182 | 56.938641 |
max | 62.000000 | 248.000000 | 220.026144 | 240.000000 | 136.000000 | 38.374793 | 177944.000000 | 1472.624704 | 251.000000 | 250.000000 | ... | 39.000000 | 125.912897 | 40.000000 | 0.487045 | 1.000000 | 0.660963 | 0.999926 | 1.000000 | 49.994764 | 186.225041 |
8 rows × 33 columns
Specific measures#
SimpleITK offers some non-standard measurements which deserve additional documentation
number_of_pixels_on_border
#
First, we check its range on the above example image.
number_of_pixels_on_border = df['number_of_pixels_on_border'].tolist()
np.min(number_of_pixels_on_border), np.max(number_of_pixels_on_border)
(0, 39)
Next, we visualize the measurement in space.
number_of_pixels_on_border_map = cle.replace_intensities(label_image, [0] + number_of_pixels_on_border)
number_of_pixels_on_border_map
cle._ image
|
The visualization suggests that number_of_pixels_on_border
is the count of pixels within a label that is located at the image border.
perimeter_on_border
#
perimeter_on_border = df['perimeter_on_border'].tolist()
np.min(perimeter_on_border), np.max(perimeter_on_border)
perimeter_on_border_map = cle.replace_intensities(label_image, [0] + perimeter_on_border)
perimeter_on_border_map
cle._ image
|
The visualization suggests that perimeter_on_border
is the perimeter of labels that are located at the image border.
perimeter_on_border_ratio
#
In this context, the SimpleITK documentation points to a publication which mentiones “‘SizeOnBorder’ is the number of pixels in the objects which are on the border of the image.” While the documentation is wage, the perimeter_on_border_ratio
may be related.
perimeter_on_border_ratio = df['perimeter_on_border_ratio'].tolist()
np.min(perimeter_on_border_ratio), np.max(perimeter_on_border_ratio)
perimeter_on_border_ratio_map = cle.replace_intensities(label_image, [0] + perimeter_on_border_ratio)
perimeter_on_border_ratio_map
cle._ image
|
The visualization reinforces our assumption that the perimeter_on_border_ratio
indeed is related to the amount the object touches the image border.
principal axes
#
In our example, we have principal_axes0
- principal_axes3
principal_axes_0 = df['principal_axes0'].tolist()
np.min(principal_axes_0), np.max(principal_axes_0)
principal_axes_0_map = cle.replace_intensities(label_image, [0] + principal_axes_0)
principal_axes_0_map
cle._ image
|
(principal_axes3
looks the same)
principal_axes_1 = df['principal_axes1'].tolist()
np.min(principal_axes_1), np.max(principal_axes_1)
principal_axes_1_map = cle.replace_intensities(label_image, [0] + principal_axes_1)
principal_axes_1_map
cle._ image
|
principal_axes_2 = df['principal_axes2'].tolist()
np.min(principal_axes_2), np.max(principal_axes_2)
principal_axes_2_map = cle.replace_intensities(label_image, [0] + principal_axes_2)
principal_axes_2_map
cle._ image
|
principal moments
#
principal_moments_0 = df['principal_moments0'].tolist()
np.min(principal_moments_0), np.max(principal_moments_0)
principal_moments_0_map = cle.replace_intensities(label_image, [0] + principal_moments_0)
principal_moments_0_map
cle._ image
|
principal_moments_1 = df['principal_moments1'].tolist()
np.min(principal_moments_1), np.max(principal_moments_1)
principal_moments_1_map = cle.replace_intensities(label_image, [0] + principal_moments_1)
principal_moments_1_map
cle._ image
|