Segmentation
This page summarizes a deep learning approach that classifies each MRI pixel into background, vertebrae, spinal canal, or intervertebral discs. The key idea is the combination of careful preprocessing, a modified U-Net, and a Combined Loss.
This section keeps the main paper, the ScienceDirect page, and the SPIDER dataset paper in one place for presentation preparation.
The paper is important because it turns lumbar MRI segmentation into a cleaner four-class learning problem and reports very high performance on high-resolution T2 SPACE scans.
Number of patients in the SPIDER lumbar MRI dataset.
MRI series including T1, T2, and T2 SPACE sequences.
Background, vertebrae, spinal canal, and intervertebral discs.
Best reported performance on T2 SPACE images.
The original SPIDER data is stored as 3D MHA volumes. The paper converts it into 2D slices and merges detailed labels into four clinically useful classes.
This reduces GPU memory cost and makes the data suitable for a 2D U-Net pipeline.
Individual vertebra and disc IDs are converted into broader anatomical categories.
Slices dominated by background or missing key structures are removed to stabilize training.
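The two preprocessing steps above (label merging and slice filtering) can be sketched in numpy. The specific label ID ranges below are assumptions for illustration, not the actual SPIDER label convention:

```python
import numpy as np

# Hypothetical ID convention (assumption): vertebra IDs 1-25, spinal canal
# ID 100, disc IDs 201-225; everything else counts as background.
BACKGROUND, VERTEBRAE, CANAL, DISC = 0, 1, 2, 3

def merge_labels(mask):
    """Map detailed per-structure IDs to the four broad anatomical classes."""
    merged = np.full_like(mask, BACKGROUND)
    merged[(mask >= 1) & (mask <= 25)] = VERTEBRAE
    merged[mask == 100] = CANAL
    merged[(mask >= 201) & (mask <= 225)] = DISC
    return merged

def keep_slice(merged, min_foreground=0.01):
    """Drop slices that are almost entirely background."""
    return (merged != BACKGROUND).mean() >= min_foreground
```

The foreground threshold is likewise an illustrative choice; the paper's exact filtering rule may differ.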
T2 SPACE achieves the best result because its higher resolution makes anatomical boundaries clearer.
The key point is that the model predicts clinically useful anatomical categories, not individual anatomical IDs.
The proposed model is based on U-Net. It compresses image features in the encoder and reconstructs a pixel-level segmentation map in the decoder.
Convolution, Batch Normalization, Leaky ReLU, and Max Pooling extract increasingly abstract features.
A 512-channel layer captures complex anatomical patterns and boundary information.
Fine spatial information is passed from encoder to decoder to preserve boundaries.
Transposed convolution restores resolution and reconstructs class-specific masks.
Each pixel is assigned probabilities over the four output classes.
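The per-pixel class probabilities come from a softmax over the four output channels; a minimal numpy sketch (function name is my own):

```python
import numpy as np

def pixel_probabilities(logits):
    """Softmax over the class axis: logits (C, H, W) -> per-pixel probabilities."""
    z = logits - logits.max(axis=0, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

logits = np.zeros((4, 2, 2))
logits[1, 0, 0] = 5.0                    # strong "vertebrae" evidence at pixel (0, 0)
probs = pixel_probabilities(logits)
pred = probs.argmax(axis=0)              # final per-pixel class map
```

Taking the argmax over the class axis turns the probability map into the discrete four-class segmentation mask.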
Combined Loss = 0.6 × Focal Loss + 0.4 × Dice Loss
Keeps a small gradient for negative inputs and reduces the risk of inactive neurons.
Stabilizes the starting weight distribution and helps gradients flow through deeper layers.
Balances hard-pixel learning with direct optimization of mask overlap.
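The weighted combination above can be sketched for a single foreground class in numpy. This is an illustrative binary version with assumed defaults (gamma = 2, small epsilon), not the paper's exact implementation:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; p = predicted foreground probability per pixel."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)            # probability of the true class
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt)))

def dice_loss(p, y, eps=1e-7):
    """1 - soft Dice overlap between prediction and ground truth."""
    inter = np.sum(p * y)
    return float(1 - (2 * inter + eps) / (np.sum(p) + np.sum(y) + eps))

def combined_loss(p, y):
    # Weights follow the paper: 0.6 * Focal + 0.4 * Dice
    return 0.6 * focal_loss(p, y) + 0.4 * dice_loss(p, y)
```

The focal term down-weights easy pixels via the (1 - pt)^gamma factor, while the Dice term directly rewards mask overlap, which is why the combination handles class imbalance well.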
Dice is the main metric. A value closer to 1 means stronger overlap between the predicted mask and the ground-truth annotation.
| Structure | Dice | IoU | Meaning |
|---|---|---|---|
| Intervertebral Discs | 0.9688 | 0.9476 | Thin disc structures are segmented with high overlap. |
| Vertebrae | 0.9712 | 0.9461 | Large bony structures are segmented consistently. |
| Spinal Canal | 0.9671 | 0.9501 | The long canal-like structure remains accurate despite its shape. |
Measures overlap between prediction and ground truth. It is the easiest main metric to explain.
Intersection divided by union. For the same prediction it is never higher than Dice, so it is the stricter metric.
Boundary-distance metrics used to evaluate how far predicted surfaces deviate from the annotation.
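Dice and IoU can be computed from binary masks in a few lines of numpy (helper names are my own, not from the paper):

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two boolean masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum())

def iou(pred, gt):
    """Intersection over Union between two boolean masks: |A∩B| / |A∪B|."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

pred = np.array([1, 1, 0, 0], dtype=bool)
gt = np.array([1, 0, 1, 0], dtype=bool)
# Here Dice = 2*1/(2+2) = 0.5 and IoU = 1/3, illustrating that IoU is stricter.
```

Boundary-distance metrics such as Hausdorff distance are usually computed with dedicated library routines rather than a few lines like these.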
Presentation note: Dice around 0.97 is very high, but the reported result should be discussed carefully because test-set details and preprocessing choices affect generalization.
Use this section to quickly explain the technical terms during an English presentation.
A pixel-level classification task. In this project, each pixel is assigned to background, vertebrae, spinal canal, or discs.
A widely used medical image segmentation architecture with an encoder, decoder, and skip connections.
A high-resolution 3D T2-weighted MRI sequence. It produces clearer anatomical boundaries.
A measure of overlap between prediction and ground truth. Higher is better, with 1 meaning perfect overlap.
Intersection over Union. It divides the overlapping region by the total combined region.
A loss function that gives more weight to hard-to-classify pixels.
A loss function that directly optimizes the overlap between predicted and true masks.
An activation function that keeps a small gradient for negative values.
A weight initialization method designed to keep training stable in deep networks.
A situation where some classes, such as background, dominate the image and can bias learning.
For the project, the first goal is reproduction. The second goal is improvement through model, loss, augmentation, and error-analysis experiments.
Implement the SPIDER preprocessing pipeline, four-class labels, Modified U-Net, and Combined Loss.
Compare Attention U-Net, U-Net++, Boundary Loss, Tversky Loss, and stronger augmentation.
Overlay predictions on MRI images and identify which structures or sequences fail most often.
Explain the clinical motivation, technical method, reproduction result, limitations, and proposed improvements.