{
"cells": [
{
"cell_type": "markdown",
"id": "86c146e0-c557-43f5-b372-4899c856f299",
"metadata": {},
"source": [
"# Jaccard-Index versus Accuracy\n",
"\n",
"Depending on the use-case some metrics are sub-optimal for determining segmentation quality. We demonstrate this by comparing segmentation results on differently cropped images.\n",
"\n",
"See also:\n",
"* [Maier-Hein, Reinke et al. (Arxiv 2023). Metrics reloaded: Pitfalls and recommendations for image analysis validation\n",
"](https://arxiv.org/abs/2206.01653)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "abb6988a-077c-474a-9255-8d23b5aeb48c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from skimage.data import human_mitosis\n",
"from the_segmentation_game import metrics\n",
"import napari_segment_blobs_and_things_with_membranes as nsbatwm\n",
"import stackview"
]
},
{
"cell_type": "markdown",
"id": "425d990b-d660-4676-b076-261255eefd71",
"metadata": {},
"source": [
"We use the `human_mitosis` example dataset from scikit-image."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2ff66847-93cf-4551-befe-2f0da40f21e2",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"\n",
" \n",
" | \n",
"\n",
"\n",
"\n",
"shape | (70, 70) | \n",
"dtype | uint8 | \n",
"size | 4.8 kB | \n",
"min | 8 | max | 79 | \n",
" \n",
" \n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"StackViewNDArray([[10, 11, 9, ..., 11, 11, 10],\n",
" [10, 10, 11, ..., 12, 12, 11],\n",
" [ 9, 9, 10, ..., 12, 11, 11],\n",
" ...,\n",
" [10, 9, 9, ..., 11, 12, 11],\n",
" [10, 10, 10, ..., 13, 12, 12],\n",
" [10, 10, 10, ..., 13, 13, 13]], dtype=uint8)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"image = human_mitosis()[95:165, 384:454]\n",
"\n",
"stackview.insight(image)"
]
},
{
"cell_type": "markdown",
"id": "5a499359-ffed-449a-9311-fa51f2a474f8",
"metadata": {},
"source": [
"Let's assume this is a reference annotation performed by an expert."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8682b2a0-c471-415c-a08b-a33ae0c272a4",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
" \n",
" | \n",
"\n",
"nsbatwm made image \n",
"\n",
"shape | (70, 70) | \n",
"dtype | int32 | \n",
"size | 19.1 kB | \n",
"min | 0 | max | 3 | \n",
" \n",
"\n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"StackViewNDArray([[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ...,\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reference_labels = nsbatwm.voronoi_otsu_labeling(image)\n",
"reference_labels"
]
},
{
"cell_type": "markdown",
"id": "4d037de8-23a1-4e5d-90c3-b2fec0fcff3f",
"metadata": {},
"source": [
"Furthermore, this create a segmentation result we would like to determine the quality of."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "08a19089-92ea-4920-aa44-b019faf6ae5b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
" \n",
" | \n",
"\n",
"nsbatwm made image \n",
"\n",
"shape | (70, 70) | \n",
"dtype | int32 | \n",
"size | 19.1 kB | \n",
"min | 0 | max | 3 | \n",
" \n",
"\n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"StackViewNDArray([[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ...,\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test_labels = nsbatwm.gauss_otsu_labeling(image, outline_sigma=3)\n",
"\n",
"test_labels"
]
},
{
"cell_type": "markdown",
"id": "53229a19-b2bb-4910-9f9d-682387975ce3",
"metadata": {},
"source": [
"## Quality measurement\n",
"There are plenty of quality metrics for measuring how well the two label images fit to each other. In the following we use [accuracy and jaccard index as implemented in The Segmentation Game](https://github.com/haesleinhuepf/the-segmentation-game#metrics), a napari-plugin for measuring quality metrics of segmentation results."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9a3ff87a-4653-417f-9f60-1d0a551f58cc",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"0.9744898"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.roc_accuracy_binary(reference_labels, test_labels)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e990bce5-d0e9-483a-a90c-78b7d43cc549",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"0.7274754206261056"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.jaccard_index_sparse(reference_labels, test_labels)"
]
},
{
"cell_type": "markdown",
"id": "98f221fb-113e-4efa-9ffc-4fd349a4bcf3",
"metadata": {},
"source": [
"We will now apply the same metrics to the label image again, but crop the label image by removing some of the zero-value pixels in the top and left of the label image."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a824e910-e224-4ae6-ba45-bf4ded55b895",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"0.95"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.roc_accuracy_binary(reference_labels[20:,20:], test_labels[20:,20:])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "02d9bee7-834d-4675-8344-31cafbec1dde",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"0.7274754206261056"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metrics.jaccard_index_sparse(reference_labels[20:,20:], test_labels[20:,20:])"
]
},
{
"cell_type": "markdown",
"id": "a252ee6f-809f-47a0-beb9-221ad23605ce",
"metadata": {},
"source": [
"As you can see, the accuracy metric changes, while the Jaccard Index does not. Obviously the accuracy metric depends on the amount of zero-value pixels in the label image. We just visualize the cropped images:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "dde2cf25-3480-43aa-92a5-d1de6cf5169b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
" \n",
" | \n",
"\n",
"nsbatwm made made image \n",
"\n",
"shape | (50, 50) | \n",
"dtype | int32 | \n",
"size | 9.8 kB | \n",
"min | 0 | max | 3 | \n",
" \n",
"\n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"StackViewNDArray([[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ...,\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reference_labels[20:,20:]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "3505144b-d036-4d12-a903-15c8358f8ace",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"\n",
" \n",
" | \n",
"\n",
"nsbatwm made made image \n",
"\n",
"shape | (50, 50) | \n",
"dtype | int32 | \n",
"size | 9.8 kB | \n",
"min | 0 | max | 3 | \n",
" \n",
"\n",
" | \n",
"
\n",
"
"
],
"text/plain": [
"StackViewNDArray([[0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" ...,\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0],\n",
" [0, 0, 0, ..., 0, 0, 0]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test_labels[20:,20:]"
]
},
{
"cell_type": "markdown",
"id": "217cf4bd-25f2-4135-9539-59f895d71f38",
"metadata": {},
"source": [
"## Explanation\n",
"When comparing the equations of accuracy $A$ and Jaccard index $J$, it is obvious that both do the same kind-of, but only accuracy includes the number of zero-value pixels in both label images. These pixels are the true-negatives $TN$.\n",
"\n",
"$$\n",
" A =\\frac{TP + TN}{FN + FP + TP + TN}\n",
"$$\n",
"\n",
"$$\n",
" J =\\frac{TP}{FN + FP + TP}\n",
"$$"
]
},
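{
"cell_type": "markdown",
"id": "c3f0a1b2-7d4e-4f6a-9b8c-1e2d3f4a5b6c",
"metadata": {},
"source": [
"The effect of cropping can be checked by hand from the values reported above. This is a sketch under two assumptions that the notebook does not verify: that `roc_accuracy_binary` counts pixels after binarizing both label images, and that the crop removed only true negatives. With $N = FN + FP + TP + TN$ as the total number of pixels, the accuracy equation rearranges to\n",
"\n",
"$$\n",
" A = 1 - (FN + FP) / N\n",
"$$\n",
"\n",
"so the number of mismatched pixels is $FN + FP = (1 - A) N$. For the full images ($N = 70 × 70 = 4900$) and the cropped images ($N = 50 × 50 = 2500$) this gives\n",
"\n",
"$$\n",
" (1 - 0.9744898) × 4900 ≈ 125\n",
"$$\n",
"\n",
"$$\n",
" (1 - 0.95) × 2500 = 125\n",
"$$\n",
"\n",
"Under these assumptions, both versions contain the same number of mismatched pixels, 125. The crop removed $4900 - 2500 = 2400$ pixels from $TN$ only, which shrinks the denominator of $A$ and lowers it from about 0.974 to 0.95. Because $TN$ does not appear in $J$, the Jaccard index stays the same."
]
},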
{
"cell_type": "code",
"execution_count": null,
"id": "58a5636b-467a-4865-9609-7735a1b2d98a",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}