Automated Dose-Based Anatomic Region Classification of Radiotherapy Treatment for Big Data Applications

This paper presents a scalable, automated deep-learning software solution that accurately classifies radiotherapy treatment sites into six anatomic regions by analyzing dose-volume overlaps with segmented organs, thereby overcoming metadata inconsistencies to enable reliable curation of large-scale, multi-institutional radiotherapy datasets.

Justin Hink, Yasin Abdulkadir, Jack Neylon, James Lamb

Published 2026-03-02

Imagine you have a massive library containing over 100,000 books (radiotherapy treatment plans). Each book tells the story of how a doctor treated a patient's cancer. However, there's a huge problem: the books are all labeled with messy, inconsistent handwriting. One doctor might write "Lung Cancer," another might write "Chest Tumor," and a third might just write "Plan 402."

If you wanted to find every book about lung cancer to study them, you'd have to read every single one manually. That would take years. This is the "Big Data" problem in radiation oncology: the data is there, but it's too messy to use.

This paper introduces a smart, automated librarian that solves this problem without ever reading the messy text labels.

The Problem: Why "Reading the Labels" Fails

Usually, computers try to sort these plans by reading the text names (like "Thorax" or "Pelvis"). But in the real world, doctors use different naming styles, abbreviations, or sometimes generic names like "New Structure." It's like trying to sort a library by asking the books to shout their own titles; sometimes they shout the right thing, but often they are silent or shouting nonsense.

The Solution: The "X-Ray Vision" Librarian

Instead of reading the text, the authors built software that looks at the actual picture of the treatment.

Think of a radiation treatment plan as a 3D map. It shows:

  1. The Patient's Body: A digital CT scan (like a high-tech X-ray).
  2. The "Laser" Beam: The planned dose of radiation (the "paint" the doctor intends to spray on the tumor).

The new software uses Deep Learning (a type of AI that learns by looking at thousands of examples) to act like a super-fast anatomist. It automatically draws outlines around 118 different body parts—organs, bones, glands—just by looking at the CT scan. It doesn't need a human to draw these lines; it does it in seconds.

How It Works: The "Paint and Overlap" Game

Once the AI has drawn the outlines of the body parts, it plays a game of "Where did the paint land?"

  1. The Paint: The software looks at the "high-dose" area (where the radiation is strongest, like the center of a target).
  2. The Overlap: It checks which body parts are covered by this "paint."
    • If the high-dose paint covers the liver and stomach, the AI says, "This is an Abdomen plan."
    • If it covers the lungs and heart, it says, "This is a Thorax (Chest) plan."
    • If it covers the brain, it says, "This is a Cranial plan."

It doesn't care what the doctor called the plan. It only cares about where the radiation actually went. It's like sorting mail not by the handwritten label on the envelope, but by opening the package and looking at what's actually inside.
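The "paint and overlap" idea can be sketched in a few lines of Python. This is an illustrative simplification, not the authors' implementation: the organ-to-region mapping, the 90% high-dose threshold, and the voxel-count ranking are all assumptions made for the example.

```python
import numpy as np

# Hypothetical organ -> region mapping (illustrative only; the paper's
# software segments 118 structures and uses six anatomic regions).
ORGAN_TO_REGION = {
    "brain": "cranial",
    "lung_left": "thorax",
    "lung_right": "thorax",
    "heart": "thorax",
    "liver": "abdomen",
    "stomach": "abdomen",
    "bladder": "pelvis",
}

def classify_by_dose_overlap(dose, organ_masks, high_dose_fraction=0.9):
    """Label a plan by which organs the high-dose 'paint' lands on.

    dose        -- 3D numpy array of planned dose
    organ_masks -- dict of organ name -> boolean 3D mask, same shape as dose
    Returns region labels ranked by how much high-dose volume they contain.
    """
    # The "paint": voxels receiving at least 90% of the maximum dose
    high_dose = dose >= high_dose_fraction * dose.max()
    region_overlap = {}
    for organ, mask in organ_masks.items():
        # The "overlap": count high-dose voxels inside this organ
        overlap = int(np.logical_and(high_dose, mask).sum())
        region = ORGAN_TO_REGION.get(organ)
        if overlap and region:
            region_overlap[region] = region_overlap.get(region, 0) + overlap
    # Most-painted region first
    return sorted(region_overlap, key=region_overlap.get, reverse=True)
```

If the hottest dose sits inside the lung masks, the function returns `["thorax"]` first, regardless of whatever text label the plan carries.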

The "Decision Tree" Logic

The software is smart enough to handle tricky situations. Imagine a treatment that covers the lower neck and the upper chest.

  • Step 1: It checks the "hot spot" (the most intense radiation). If it's mostly in the neck, it labels it "Head and Neck."
  • Step 2: If the radiation is spread out or weak, it looks at the "warm zone" (a slightly larger area) to see what else it touches.
  • Step 3: If it's still unclear, it uses a "tie-breaker" rule, like checking which bone is closest to the center of the radiation.
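The three-step fallback above amounts to a small decision cascade. A minimal sketch, assuming each step produces a ranked list of candidate regions (the paper's actual thresholds and tie-breaking rules are not reproduced here):

```python
def resolve_region(hot_spot_regions, warm_zone_regions, nearest_bone_region):
    """Tiered decision logic for ambiguous plans (illustrative sketch).

    hot_spot_regions    -- regions ranked by overlap with the hottest dose zone
    warm_zone_regions   -- regions ranked by overlap with a larger, lower-dose zone
    nearest_bone_region -- region of the bone closest to the dose center
    """
    # Step 1: trust the hot spot when it lands clearly in one region
    if hot_spot_regions:
        return hot_spot_regions[0]
    # Step 2: otherwise widen out to the "warm zone"
    if warm_zone_regions:
        return warm_zone_regions[0]
    # Step 3: tie-breaker -- nearest bone to the center of the radiation
    return nearest_bone_region
```

For the neck-and-chest example in the text: if the hot spot sits mostly in the neck, step 1 immediately returns "Head and Neck"; only diffuse or weak dose distributions fall through to the later steps.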

The Results: How Good Is It?

The team tested this "robot librarian" on 100 real patient plans and compared its labels to those given by human experts.

  • 95% of the time, the robot got the primary location (the most important one) exactly right.
  • 91% of the time, it got the entire list of locations and their order exactly right.

The few times it got it "wrong," it wasn't because the robot was confused. It was usually because the case was genuinely ambiguous (e.g., a tumor right on the border between the pelvis and the leg). In fact, sometimes the robot was arguably more accurate because it saw the radiation touching a body part that the human expert decided to ignore based on a strict rule.

Why This Matters

This is a game-changer for medical research.

  • Before: Researchers had to hire armies of humans to manually sort through databases, which was slow, expensive, and prone to error.
  • Now: This software can automatically sort 100,000 plans in a matter of hours. It creates a clean, organized database where researchers can instantly find "all the lung cancer cases" or "all the prostate cases" to study treatment outcomes.

The Bottom Line

The authors built a tool that ignores the messy text labels and instead uses visual evidence (where the radiation actually hits the body) to sort medical data. It's like teaching a computer to understand a map by looking at the terrain, rather than reading the street signs. This makes "Big Data" in cancer treatment finally usable, reliable, and ready to help save lives.
