VesselFusion: Diffusion Models for Vessel Centerline Extraction from 3D CT Images

This paper introduces VesselFusion, a diffusion model-based approach that utilizes coarse-to-fine representation and voting-based aggregation to achieve more accurate and natural vessel centerline extraction from 3D CT images compared to conventional deterministic methods.

Soichi Mita, Shumpei Takezaki, Ryoma Bise

Published 2026-03-10

Imagine you are looking at a 3D CT scan of a human body. Inside, there is a complex, branching network of blood vessels, like a dense forest of tiny, winding rivers. Doctors need to map out the exact path of these rivers (the "centerlines") to plan surgeries or diagnose diseases.

The problem? Drawing these paths by hand is incredibly tedious, and drawing them automatically with old computer programs is messy. Old programs are like rigid robots: they follow strict rules and often get confused by the foggy, blurry edges of the vessels, resulting in broken lines, dead ends, or weird loops that don't exist in real life.

Enter VesselFusion. Think of this new method not as a rigid robot, but as a team of expert artists working together to recreate a map of a forest.

Here is how it works, broken down into simple steps:

1. The "Sketch First, Detail Later" Approach (Coarse-to-Fine)

Imagine trying to draw a complex tree. If you try to draw every single leaf and twig at full size immediately, you'll likely mess up the proportions. It's better to start with a rough sketch of the main branches, then zoom in to add the details.

VesselFusion does exactly this. Instead of trying to guess the exact millimeter-perfect coordinate of every point in the vessel all at once, it breaks the job into two parts:

  • The Grid (The Sketch): It first figures out which "neighborhood" or grid block the vessel is in.
  • The Offset (The Detail): Once it knows the neighborhood, it calculates the tiny, precise distance from the center of that block to the actual vessel.

This two-step process makes it much easier for the AI to learn the shape without getting overwhelmed by the sheer amount of data.
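To make the grid-plus-offset idea concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' code: the function names, the cell size of 4 voxels, and the example point are all assumptions made for the demo.

```python
import math

def encode_point(point, cell_size):
    """Split a 3D coordinate into a coarse grid index and a fine offset.

    The offset is measured from the center of the grid cell, so each
    component always lies within half a cell of zero.
    """
    index = tuple(math.floor(c / cell_size) for c in point)
    center = tuple((i + 0.5) * cell_size for i in index)
    offset = tuple(c - m for c, m in zip(point, center))
    return index, offset

def decode_point(index, offset, cell_size):
    """Rebuild the exact coordinate from the grid index and the offset."""
    return tuple((i + 0.5) * cell_size + o for i, o in zip(index, offset))

# Hypothetical centerline point (in voxel units) and an assumed cell size.
point = (13.2, 7.9, 21.5)
index, offset = encode_point(point, cell_size=4.0)
restored = decode_point(index, offset, cell_size=4.0)

print(index)     # the "neighborhood": (3, 1, 5)
print(offset)    # small correction inside that neighborhood
print(restored)  # matches the original point up to rounding error
```

The coarse index answers "which block?" and the offset answers "where inside the block?", so a model only ever has to predict a small, bounded correction rather than a raw coordinate.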

2. The "Dreaming" Process (Diffusion Models)

Conventional deterministic models are like a student taking a test: they look at the question (the CT scan) and give one single answer. If that answer is wrong, there is nothing else to fall back on.

VesselFusion uses a Diffusion Model, which is more like an artist refining a sketch.

  • Imagine starting with a piece of paper covered in static noise (like TV snow).
  • The AI slowly "denoises" this picture, step-by-step, guided by the CT scan image.
  • With each step, the noise turns into a clearer picture of the vessel.
  • Because the AI has "learned" what healthy blood vessels look like from thousands of examples, it knows to avoid creating impossible shapes (like a vessel suddenly turning into a square or a loop that shouldn't be there). It captures the variability of nature, understanding that vessels can look slightly different in every person.
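The step-by-step denoising described above can be sketched with a generic DDPM-style sampling loop. This is a toy under heavy assumptions, not the paper's sampler: the noise schedule, the step count, and every name are invented for illustration, and the "network" is replaced by an oracle that already knows the answer. In the real method, a trained network guided by the CT scan would have to predict the noise instead.

```python
import math
import random

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed, commonly used default)."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def oracle_noise_predictor(x_t, t, target, alpha_bars):
    """Stand-in for a trained denoising network.

    It cheats by using the known target to compute the exact noise
    contained in x_t. A real model would estimate this from the CT image.
    """
    a = alpha_bars[t]
    return [(x - math.sqrt(a) * x0) / math.sqrt(1.0 - a)
            for x, x0 in zip(x_t, target)]

def sample(target, num_steps=50, seed=0):
    """Start from pure noise and denoise one step at a time."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # the "TV snow"
    for t in reversed(range(num_steps)):
        eps = oracle_noise_predictor(x, t, target, alpha_bars)
        scale = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - scale * ei) / math.sqrt(1.0 - betas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            # Re-inject a little fresh noise on every step except the last.
            x = [m + math.sqrt(betas[t]) * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x

# Hypothetical "true" offsets the sampler should recover.
target_offsets = [0.3, -1.2, 0.8]
result = sample(target_offsets)
print(result)
```

Because the oracle is exact, the loop lands on the target values. With a learned network the result would vary from one random starting point to the next, and that variation is what the voting step below is meant to handle.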

3. The "Council of Experts" (Voting-Based Aggregation)

Here is the catch: because each run starts from random noise, a single attempt might still produce a flawed result, such as a tiny gap in the line or a stray loop.

To fix this, VesselFusion doesn't just ask for one answer. It asks 100 different "versions" of itself to draw the map, each starting with a slightly different random noise.

  • The Analogy: Imagine asking 100 different cartographers to draw the same river.
  • The Voting: Some might draw a loop that doesn't exist; others might miss a small branch. But the real river will appear in almost all 100 drawings.
  • The Result: The system looks at all 100 maps and only keeps the parts where the experts agree (the "voting"). This filters out the one-off mistakes and leaves a cleaner, more stable, and natural-looking vessel map.
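The voting idea can be sketched in a few lines of Python. The details here are assumptions rather than the paper's exact procedure: each "drawing" is represented as a set of grid cells, a cell is kept if at least half of the drawings contain it, and the simulated drawings (10% chance of missing a cell, one stray cell each) are made up for the demo.

```python
import random
from collections import Counter

def aggregate_by_voting(samples, min_fraction=0.5):
    """Keep only the grid cells that appear in enough of the samples."""
    votes = Counter(cell for sample in samples for cell in set(sample))
    needed = min_fraction * len(samples)
    return {cell for cell, count in votes.items() if count >= needed}

rng = random.Random(0)

# Hypothetical ground truth: a straight vessel running along the x axis.
true_path = {(x, 5, 5) for x in range(10)}

# Simulate 100 imperfect drawings of the same vessel.
samples = []
for _ in range(100):
    # Each true cell is missed about 10% of the time.
    drawn = {cell for cell in true_path if rng.random() > 0.1}
    # Each drawing also contains one random stray cell.
    drawn.add((rng.randrange(10), rng.randrange(10), rng.randrange(10)))
    samples.append(drawn)

consensus = aggregate_by_voting(samples)
print(sorted(consensus))
```

Stray cells land in different places each time and never collect enough votes, while the true path shows up in roughly 90 of the 100 drawings and survives the cut. The threshold is a trade-off: set it too high and faint branches are voted out, too low and the mistakes creep back in.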

Why is this a big deal?

  • Old Methods: Like a GPS that gets stuck in a loop or drops you in a field because the signal was fuzzy.
  • VesselFusion: Like a team of experienced hikers who know the terrain. Even if one hiker takes a wrong turn, the group consensus ensures you end up on the right path.

The Bottom Line:
VesselFusion is the first tool to use this "generative" and "voting" approach to map blood vessels. It produces maps that are not only more accurate (hitting the right coordinates) but also more natural-looking, avoiding the broken or impossible shapes that plague older technologies. This means doctors can trust the computer's map more, saving time and potentially saving lives.