Diff2DGS: Reliable Reconstruction of Occluded Surgical Scenes via 2D Gaussian Splatting

Diff2DGS is a novel two-stage framework that combines diffusion-based video inpainting with a learnable deformation model adapted for 2D Gaussian Splatting. Together, the two stages achieve high-fidelity, real-time 3D reconstruction of occluded, deformable surgical scenes with improved depth accuracy.

Tianyi Song, Danail Stoyanov, Evangelos Mazomenos, Francisco Vasconcelos

Published 2026-02-23

Imagine you are trying to build a perfect, 3D holographic map of a delicate surgery happening inside a patient's body. You have a video camera (the robot's eye) recording the scene. But there's a huge problem: the surgeon's tools (scissors, clamps, etc.) keep blocking the view, hiding the soft, squishy tissues underneath.

If you try to build a 3D map while those tools are in the way, your map will have giant holes or "glitches" where the tools are. It's like trying to draw a picture of a landscape, but someone keeps holding a giant black sign in front of your face. You can't see the mountains behind the sign, so you just leave a black hole in your drawing.

Diff2DGS is a new, clever system designed to fix this problem. Think of it as a two-step "magic repair kit" for surgical videos.

Step 1: The "Time-Traveling Art Restorer" (The Inpainting Stage)

First, the system looks at the video and finds all the parts covered by surgical tools. It doesn't just guess what's underneath; it uses a special kind of AI called a Diffusion Model.

Imagine you are looking at an old, damaged painting where a piece is missing. A normal AI might just guess a random color to fill the hole. But this Diffusion Model is like a master art restorer who has seen thousands of similar paintings. It looks at the frames before and after the tool moved. It understands how the tissue moves and flows over time.

It essentially says, "Okay, I know what this tissue looked like a second ago, and I know how it's stretching right now. I can 'paint' over the tool with a perfect, realistic version of the tissue that should be there." This creates a clean video where the tools have vanished, and the hidden tissue is revealed with high consistency.
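
The paper uses a diffusion model for this stage; as a rough intuition for the temporal idea only, here is a toy (non-diffusion) sketch that fills tool-masked pixels by borrowing from the nearest frame in time where that pixel is visible. The function name and setup are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def temporal_inpaint(frames, masks):
    """Toy temporal fill: replace each occluded pixel with its value from
    the nearest frame (in time) where that pixel is NOT occluded.
    frames: (T, H, W) float array; masks: (T, H, W) bool, True = tool pixel.
    A real diffusion inpainter would synthesize plausible, moving tissue
    instead of copying static values."""
    T = len(frames)
    out = frames.copy()
    for t in range(T):
        occluded = masks[t].copy()
        if not occluded.any():
            continue
        # Search outward in time for an unoccluded view of each pixel.
        for dt in range(1, T):
            for s in (t - dt, t + dt):
                if 0 <= s < T:
                    fill = occluded & ~masks[s]
                    out[t][fill] = frames[s][fill]
                    occluded &= ~fill
            if not occluded.any():
                break
    return out
```

The real system is far more sophisticated (it models how tissue deforms between frames rather than copying pixels), but the core input/output is the same: a masked video goes in, a tool-free video comes out.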

Step 2: The "Stretchy Clay Sculptor" (The 2D Gaussian Splatting Stage)

Now that we have a clean video, we need to turn it into a 3D model. Traditional methods are like trying to build a statue out of rigid, hard clay. If the tissue moves or stretches (which it does a lot in surgery), the hard clay cracks or looks fake.

Diff2DGS uses a technique called 2D Gaussian Splatting. Imagine instead of hard clay, you are using thousands of tiny, flat, stretchy stickers (or "splats") that can float in 3D space.

  • The Magic Trick: The system adds a special "Learnable Deformation Model." Think of this as giving the stickers a memory of how they stretch and twist. When the tissue moves, these stickers don't just break; they stretch and slide smoothly, just like real skin and muscle.
  • The Result: You get a 3D model that looks incredibly real and moves naturally, even when the tissue is being pulled or pushed.
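
To make the "memory of how they stretch" idea concrete, here is a minimal sketch of a learnable deformation field: a tiny network that maps a splat's canonical center plus a timestamp to per-splat offsets. All layer sizes, names, and output dimensions here are illustrative assumptions; the paper's actual deformation model will differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative deformation network: input (x, y, z, t), output 7 deltas
# (3 position + 3 rotation + 1 scale). In training these weights would be
# optimized so rendered frames match the inpainted video.
W1 = rng.normal(0, 0.1, (4, 32))
W2 = rng.normal(0, 0.1, (32, 7))

def deform(centers, t):
    """Move canonical splat centers to their positions at time t."""
    x = np.concatenate([centers, np.full((len(centers), 1), t)], axis=1)
    h = np.tanh(x @ W1)          # one hidden layer for the sketch
    delta = h @ W2
    return centers + delta[:, :3]  # apply only the positional offset here

canonical = rng.normal(0, 1.0, (100, 3))  # canonical splat centers
moved = deform(canonical, t=0.5)
```

The key design choice this sketch illustrates: the splats live in one fixed "canonical" configuration, and a single shared network warps all of them to each moment in time, so they bend smoothly together instead of cracking apart.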

Why is this a big deal?

Most previous methods had two major flaws:

  1. They ignored the holes: They tried to build the 3D map even while the tools were blocking the view, leading to blurry, glitchy spots.
  2. They cared only about the picture, not the depth: They made the image look pretty (like a high-quality photo), but if you looked at the 3D shape from a different angle, the depth was wrong. It was like a flat painting that looked 3D from the front but collapsed when you walked to the side.

Diff2DGS fixes both:

  • It "erases" the tools first, so the 3D map has no holes.
  • It uses a special "depth loss" training method. Imagine a teacher who doesn't just grade your drawing on how colorful it is, but also checks if the mountains are the right height. This ensures the 3D shape is accurate, not just pretty.
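
A hedged sketch of what such a combined objective could look like: a photometric (color) term plus a weighted depth term, so training penalizes wrong geometry as well as wrong pixels. The exact terms and weighting here are assumptions, not the paper's actual loss.

```python
import numpy as np

def training_loss(rendered_rgb, target_rgb, rendered_depth, target_depth,
                  lambda_depth=0.1):
    """Sketch of photometric + depth supervision (L1 on both).
    lambda_depth balances "looks right" against "is the right shape";
    its value here is illustrative."""
    photo = np.abs(rendered_rgb - target_rgb).mean()      # color accuracy
    depth = np.abs(rendered_depth - target_depth).mean()  # geometric accuracy
    return photo + lambda_depth * depth
```

Without the depth term, the optimizer can "cheat" by painting convincing colors onto wrong geometry, which is exactly the flat-painting failure mode described above.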

The Bottom Line

The researchers tested this on real surgical robot videos. The results were impressive:

  • The 3D models were sharper and more accurate than those from prior methods on the datasets tested.
  • The "hidden" areas behind the tools were reconstructed convincingly, rather than left as holes or blur.
  • The system is fast enough to potentially work in real-time, which is crucial for helping surgeons navigate or for training robots to do surgery autonomously.

In short, Diff2DGS is like giving a surgeon a pair of X-ray glasses that can see through the tools, combined with a sculptor who can instantly mold a perfect, moving 3D map of the inside of the body, ensuring nothing is hidden and everything is measured correctly.
