Feature correlation

When inspecting feature extraction results, it is often important to take relationships between features into account. A feature correlation matrix is a useful tool for this, and displaying it with a color gradient makes strongly correlated features easy to spot.
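To make concrete what such a matrix contains, the following minimal sketch computes Pearson correlation coefficients for three made-up features. The feature names and values are invented for illustration only and are not taken from the dataset used in this notebook.

```python
import numpy as np

# Made-up measurements for five hypothetical objects (illustration only)
area = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
perimeter = np.array([12.0, 19.0, 31.0, 42.0, 48.0])  # grows with area
intensity = np.array([5.0, 4.0, 3.0, 2.0, 1.0])       # decreases linearly with area

# np.corrcoef treats each row as one variable and returns the matrix of
# pairwise Pearson correlation coefficients
features = np.stack([area, perimeter, intensity])
correlation_matrix = np.corrcoef(features)

print(np.round(correlation_matrix, 2))
```

The diagonal of the result is 1 because every feature correlates perfectly with itself, and the matrix is symmetric. Values close to 1 or -1 indicate a strong positive or negative linear relationship, and values near 0 indicate little linear relationship.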

from napari_simpleitk_image_processing import label_statistics
import numpy as np
import seaborn
import pyclesperanto_prototype as cle
import matplotlib.pyplot as plt

Load data

We first load the image data, segment the nuclei, and extract features from the resulting label image.

# Load data (cle.imread is deprecated; its warning suggests skimage.io.imread instead)
image = cle.imread("../../data/Lund-25MB.tif")

# Segment nuclei
background_subtracted = cle.top_hat_box(image, radius_x=5, radius_y=5)
labels = cle.voronoi_otsu_labeling(background_subtracted, spot_sigma=1)

# Feature extraction
nuclei_statistics = label_statistics(image, labels, 
                                     intensity=True, 
                                     size=True, 
                                     shape=True, 
                                     perimeter=True,
                                     moments=True)

# Feature selection
selected_table = nuclei_statistics[
    [
        # likely unrelated features
        'label',

        # intensity-related features
        'maximum', 'mean', 'minimum', 'variance',

        # shape-related features
        'elongation', 'feret_diameter', 'flatness', 'roundness',

        # size-related features
        'equivalent_spherical_radius', 'number_of_pixels', 'perimeter'
    ]
]
selected_table
C:\Users\haase\miniconda3\envs\bio311\Lib\site-packages\pyclesperanto_prototype\_tier9\_imread.py:5: UserWarning: cle.imread is deprecated, use skimage.io.imread instead.
  warnings.warn("cle.imread is deprecated, use skimage.io.imread instead.")
      label  maximum        mean  minimum    variance  elongation  feret_diameter  flatness  roundness  equivalent_spherical_radius  number_of_pixels   perimeter
   0      1    143.0  117.489451     93.0   90.056032    1.228690        8.774964  1.153618   0.965657                     3.839016               237  191.790349
   1      2    113.0   83.052219     65.0   94.086271    1.325096       13.152946  1.215572   0.818905                     4.505089               383  311.446414
   2      3    130.0  108.930403     92.0   57.109109    1.565911       12.884099  1.434476   0.807173                     4.024309               273  252.130963
   3      4    129.0   94.576991     70.0  130.716136    1.227027       14.352700  1.397276   0.833006                     5.128456               565  396.766310
   4      5    149.0  119.454545     89.0  144.431321    1.429829       10.723805  1.269121   0.871680                     4.034113               275  234.611278
 ...    ...      ...         ...      ...         ...         ...             ...       ...        ...                          ...               ...         ...
1195   1196     60.0   42.118257     29.0   50.270809    1.107046       11.090537  1.307962   0.976306                     4.863917               482  304.506355
1196   1197     83.0   47.673267     29.0  159.330772    1.046951       12.409674  1.236147   0.962602                     5.526416               707  398.703613
1197   1198     53.0   41.502890     30.0   28.123180    1.042599        9.643651  1.330995   0.965376                     4.355077               346  246.890816
1198   1199     72.0   45.091570     29.0  106.316202    1.114285       12.961481  1.269182   0.962037                     5.476460               688  391.758021
1199   1200     66.0   44.232682     29.0   67.484909    1.203239       12.206556  1.381601   0.961668                     5.122397               563  342.871234

1200 rows × 12 columns

Correlation matrix

We compute the pairwise correlation between the selected features to understand their relationships. The pandas `corr()` method uses the Pearson correlation coefficient by default.

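# Correlation between all pairs of selected features
df = selected_table.corr()

def colorize(styler):
    # axis=None applies one shared color scale to the whole table
    styler.background_gradient(axis=None, cmap="coolwarm")
    return styler

df.style.pipe(colorize)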
                                 label    maximum       mean    minimum   variance  elongation  feret_diameter   flatness  roundness  equivalent_spherical_radius  number_of_pixels  perimeter
label                         1.000000  -0.605035  -0.651268  -0.581233  -0.134539   -0.014857        0.105859  -0.066384   0.381267                     0.251968          0.246869   0.190365
maximum                      -0.605035   1.000000   0.824653   0.577706   0.563160   -0.028076        0.144944   0.025563  -0.485114                    -0.011892         -0.035078   0.068570
mean                         -0.651268   0.824653   1.000000   0.918750   0.052848    0.122359       -0.173872   0.112322  -0.644827                    -0.451865         -0.478943  -0.362011
minimum                      -0.581233   0.577706   0.918750   1.000000  -0.273489    0.217240       -0.311868   0.148296  -0.600965                    -0.615060         -0.604247  -0.521081
variance                     -0.134539   0.563160   0.052848  -0.273489   1.000000   -0.191963        0.370870  -0.084841   0.069065                     0.485770          0.490167   0.500228
elongation                   -0.014857  -0.028076   0.122359   0.217240  -0.191963    1.000000        0.184445   0.091196  -0.418459                    -0.152117         -0.125144  -0.083723
feret_diameter                0.105859   0.144944  -0.173872  -0.311868   0.370870    0.184445        1.000000   0.083095  -0.201787                     0.854090          0.785360   0.896780
flatness                     -0.066384   0.025563   0.112322   0.148296  -0.084841    0.091196        0.083095   1.000000  -0.438565                    -0.111196         -0.137907  -0.085824
roundness                     0.381267  -0.485114  -0.644827  -0.600965   0.069065   -0.418459       -0.201787  -0.438565   1.000000                     0.225241          0.330182   0.137811
equivalent_spherical_radius   0.251968  -0.011892  -0.451865  -0.615060   0.485770   -0.152117        0.854090  -0.111196   0.225241                     1.000000          0.948357   0.976222
number_of_pixels              0.246869  -0.035078  -0.478943  -0.604247   0.490167   -0.125144        0.785360  -0.137907   0.330182                     0.948357          1.000000   0.964439
perimeter                     0.190365   0.068570  -0.362011  -0.521081   0.500228   -0.083723        0.896780  -0.085824   0.137811                     0.976222          0.964439   1.000000
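
In the matrix above, the size-related features (equivalent_spherical_radius, number_of_pixels, perimeter) correlate strongly with each other, with coefficients between roughly 0.95 and 0.98. For larger feature sets, reading such pairs off a colored table becomes tedious. As a minimal sketch, the helper below lists all feature pairs whose absolute correlation exceeds a threshold. The helper name `strongly_correlated_pairs`, the threshold of 0.9, and the small demo table are assumptions made for illustration; they are not part of the notebook above.

```python
import numpy as np
import pandas as pd


def strongly_correlated_pairs(table, threshold=0.9):
    """List feature pairs with |Pearson r| above threshold (hypothetical helper)."""
    corr = table.corr()
    # Indices of the upper triangle without the diagonal (k=1), so each pair
    # appears once and self-correlations are skipped
    rows, cols = np.triu_indices(len(corr), k=1)
    pairs = pd.Series(
        corr.to_numpy()[rows, cols],
        index=pd.MultiIndex.from_arrays([corr.index[rows], corr.columns[cols]]),
    )
    strong = pairs[pairs.abs() > threshold]
    # Sort by absolute value, strongest correlation first
    return strong.sort_values(key=lambda s: s.abs(), ascending=False)


# Made-up demo table (illustration only)
demo = pd.DataFrame({
    "size_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "size_b": [1.1, 2.0, 2.9, 4.2, 5.0],
    "other": [3.0, 1.0, 4.0, 1.0, 5.0],
})
print(strongly_correlated_pairs(demo))
```

Applied to the table from this notebook, the call would be `strongly_correlated_pairs(selected_table)`. Pairs found this way are candidates for redundancy: features that carry nearly the same information.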