End-to-End PET/CT Interpretation and Quantification with an LLM-Orchestrated AI Agent: A Real-World Pilot Study

This pilot study demonstrates that an LLM-orchestrated AI agent can automate the end-to-end workflow of PET/CT interpretation, from raw DICOM data to structured reporting, in 170 lung cancer patients. The agent achieved perfect primary tumor detection but showed systematic limitations in nodal and metastatic assessment, so continued expert oversight remains necessary.

Choi, H., Bae, S., Na, K. J.

Published 2026-02-25

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are a master chef (the radiologist) trying to cook a complex meal (diagnosing a patient) using a massive, chaotic pantry of ingredients (the raw medical images). Usually, you have to spend hours sorting through boxes, measuring spices, chopping vegetables, and then finally cooking the dish.

This paper introduces a new AI Sous-Chef that doesn't just chop vegetables; it manages the entire kitchen workflow from start to finish.

Here is the breakdown of their "AI Agent" in simple terms:

1. The Problem: The Chaotic Pantry

In a hospital, PET/CT scans (which show where cancer is glowing in the body) come in all different shapes and sizes. Some are labeled weirdly, some have missing data, and the machines that take the pictures are all different.

  • Old AI: Previous AI tools were like specialized robots that could only do one thing perfectly, like "slice the onions" (find a tumor) or "measure the salt" (calculate a number). But they couldn't talk to each other, and they couldn't handle the mess of the real world.
  • The Goal: The researchers wanted an AI that could walk into the messy pantry, figure out which ingredients are good, chop them, cook them, and write the recipe card (the medical report) all by itself.

2. The Solution: The "Brain" and the "Hands"

The researchers built a system with three layers, which they call an LLM-Orchestrated Agent. Think of it like a Conductor of an Orchestra:

  • The Conductor (The "Brain"): This is a Large Language Model (like a super-smart text AI). It doesn't look at the pictures directly. Instead, it reads the "sheet music" (the patient's data and the doctor's request) and tells the other musicians what to do. It decides: "Okay, we need to find the lung tumor first. Let's ask Tool A to slice the image, then ask Tool B to measure the glow, and finally ask Tool C to write the summary."
  • The Musicians (The "Hands"): These are specialized AI tools that are already good at specific jobs.
    • The Slicer: Finds the tumors.
    • The Measurer: Calculates how "hot" (active) the tumors are.
    • The Painter: Draws the outlines on the images.
  • The Process: The Conductor grabs the raw, messy data, tells the Slicer to work, checks if the result looks right, and if it fails, it says, "Okay, try a different method," and keeps going until it has a full report.
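The conductor-and-musicians pattern above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: every function name, tool, and value below is hypothetical, the "plan" is hard-coded where the real system would have the LLM choose each step, and real tools would operate on actual image volumes rather than toy dictionaries.

```python
from typing import Callable

# --- The Musicians: specialized single-task tools (all hypothetical) ------
def segment_lesions(scan: dict) -> dict:
    """The Slicer: outline candidate lesions on the PET/CT volume."""
    if "pet" not in scan:
        raise ValueError("missing PET series")  # messy real-world data
    return {**scan, "lesions": ["primary_tumor"]}

def measure_uptake(scan: dict) -> dict:
    """The Measurer: quantify how metabolically 'hot' each lesion is."""
    return {**scan, "suv_max": 8.4}  # illustrative value only

def draft_report(scan: dict) -> dict:
    """The Writer: turn measurements into a structured draft report."""
    return {**scan, "report": f"Lesions: {scan['lesions']}, SUVmax {scan['suv_max']}"}

TOOLS: dict[str, Callable[[dict], dict]] = {
    "segment": segment_lesions,
    "measure": measure_uptake,
    "report": draft_report,
}

# --- The Conductor: runs the plan and retries failed steps ----------------
def run_agent(scan: dict, plan: list[str], max_retries: int = 2) -> dict:
    state = scan
    for step in plan:
        for attempt in range(max_retries + 1):
            try:
                state = TOOLS[step](state)
                break  # step succeeded, move on to the next one
            except Exception:
                if attempt == max_retries:
                    raise  # give up and flag the case for human review
    return state

result = run_agent({"pet": "...", "ct": "..."}, ["segment", "measure", "report"])
print(result["report"])
```

The key design point is the separation of concerns: the conductor never touches pixels, it only sequences tools and handles failures, which is what lets the system cope with messy, heterogeneous inputs.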

3. The Test: The "Stress Test"

They tested this AI on 170 real patients with lung cancer. They gave the AI the raw, messy data and asked it to do the whole job: find the cancer, check if it spread to lymph nodes, check if it spread to other organs, and write a draft report.

The Results:

  • The Main Course (Primary Tumor): The AI was perfect. It found the main lung tumor in 100% of the cases. It was like a master chef who never misses the main ingredient.
  • The Side Dishes (Lymph Nodes): The AI was good at finding them (85% success) but got a bit paranoid. It often thought normal, healthy lymph nodes were cancerous (false alarms). It's like a sous-chef who thinks a speck of dust is a rock and tries to remove it.
  • The Dessert (Distant Metastasis): When checking if cancer spread to other organs (like the liver or bones), the AI was okay (about 70% success). It sometimes missed tiny, hidden spots (false negatives) or got confused by normal body processes (like digestion) that looked like cancer (false positives).

4. The Verdict: A Helpful Assistant, Not a Replacement

The researchers conclude that this AI is not ready to replace the doctor.

  • Why? Because while it's great at the heavy lifting (sorting data, measuring, drawing), it still gets confused by tricky, borderline cases.
  • The Real Value: It acts as a super-efficient assistant. It does all the boring, repetitive math and data sorting in seconds, giving the doctor a "draft report" to review. The doctor can then focus on the tricky decisions, like "Is this weird spot actually cancer, or just inflammation?"

The Big Picture Analogy

Think of this AI as a GPS for a road trip.

  • The GPS (the AI Agent) can instantly plot the route, calculate the fuel, and tell you the traffic conditions (the quantitative data and draft report).
  • However, if there is a sudden landslide or a confusing detour (a complex medical case), the Driver (the human doctor) still needs to take the wheel and make the final decision.

In short: This paper proves that we can build an AI that manages the whole hospital workflow, not just one tiny task. It's a huge step toward making medical imaging faster and more consistent, but for now, it works best when it sits next to a human expert, not in their place.
