Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification

This paper introduces Ctrl-GenAug, a generative augmentation framework that addresses the scarcity of medical datasets. It enables controllable, semantically and sequentially coherent synthesis of diagnosis-promotive samples and filters out unreliable synthetic data, which significantly improves medical sequence classification performance.

Xinrui Zhou, Yuhao Huang, Haoran Dou, Shijing Chen, Ao Chang, Jia Liu, Weiran Long, Jian Zheng, Erjiao Xu, Jie Ren, Alejandro F. Frangi, Ruobing Huang, Jun Cheng, Xiaomeng Li, Wufeng Xue, Dong Ni

Published 2026-02-19

Imagine you are trying to teach a student (an AI) how to be a doctor. To do this, you need to show them thousands of examples of healthy and sick patients. But here's the problem: in the real world, serious diseases are rare, and getting a doctor to label (annotate) every single medical scan is slow, expensive, and exhausting. It's like trying to teach someone to recognize a rare type of bird when you only have three photos of it, and the rest of your photo album is full of sparrows.

This is where Ctrl-GenAug comes in. Think of it as a super-smart, controllable "Imagination Machine" that creates fake but realistic medical videos to fill in the gaps.

Here is a simple breakdown of how it works, using some everyday analogies:

1. The Problem: The "Empty Classroom"

Medical AI models are like students who need to study hard to pass exams. But the "textbooks" (medical datasets) are often:

  • Too small: Not enough examples of rare diseases.
  • Unbalanced: Too many examples of mild cases, too few of severe ones.
  • Fragile: If the student studies only in one specific hospital, they might fail when they see a patient from a different hospital with slightly different equipment.

2. The Solution: The "Imagination Machine" (The Generator)

Instead of just copying existing pictures, Ctrl-GenAug uses a Diffusion Model. Think of this like a master artist who starts with a blank canvas covered in static (noise) and slowly paints a clear picture.
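To make the "static to picture" idea concrete, here is a toy sketch of the noising and denoising steps behind a generic diffusion model, in plain Python. It is not the paper's model. The linear noise schedule, the function names (make_alpha_bars, add_noise, denoise), and the "oracle" noise predictor that already knows the clean answer are all assumptions made for illustration. A real system would replace the oracle with a trained neural network.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal that survives after each noising step."""
    alpha_bars = [1.0]  # index 0 means "no noise added yet"
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        alpha_bars.append(alpha_bars[-1] * (1.0 - beta))
    return alpha_bars

def add_noise(x0, eps, alpha_bar):
    """Forward process: blend the clean signal with Gaussian noise."""
    return [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
            for x, e in zip(x0, eps)]

def denoise(x_t, predict_noise, alpha_bars):
    """Deterministic (DDIM-style) reverse process: walk from static back to a clean signal."""
    x = list(x_t)
    for t in range(len(alpha_bars) - 1, 0, -1):
        a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
        eps_hat = predict_noise(x, t)
        # Estimate the clean signal, then re-noise it to the (lower) level of step t - 1.
        x0_hat = [(xi - math.sqrt(1.0 - a_t) * ei) / math.sqrt(a_t)
                  for xi, ei in zip(x, eps_hat)]
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * ei
             for x0i, ei in zip(x0_hat, eps_hat)]
    return x

random.seed(0)
clean = [0.2, 0.8, 0.5, 0.1]  # stand-in for a tiny "image"
alpha_bars = make_alpha_bars(1000)
noise = [random.gauss(0.0, 1.0) for _ in clean]
noisy = add_noise(clean, noise, alpha_bars[-1])  # almost pure static

def oracle(x, t):
    """Assumed stand-in for a trained network: it cheats by knowing `clean`."""
    a = alpha_bars[t]
    return [(xi - math.sqrt(a) * ci) / math.sqrt(1.0 - a) for xi, ci in zip(x, clean)]

recovered = denoise(noisy, oracle, alpha_bars)
```

Because the oracle predicts the noise exactly, the loop recovers the clean values up to rounding error. A trained network only approximates that prediction, which is where the "artist's" skill comes from.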

But a normal artist might paint a weird monster when you ask for a "mild heart condition." Ctrl-GenAug is special because it has four specific instruction manuals (conditions) to ensure the painting is exactly what the doctor needs:

  • The Label: "This is a severe case." (Like telling the artist the genre of the movie).
  • The Description: "The nodule has a smooth edge." (Like giving a detailed script).
  • The Reference Photo: A real scan of a similar patient to copy the style. (Like a mood board).
  • The Motion Map: A guide showing how the heart should beat or how blood flows. (Like a storyboard for movement).

By combining these, the machine doesn't just make any synthetic video; it makes a customized, high-quality video that looks and moves like footage of a real patient with a specific disease.
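One way to picture the four "instruction manuals" is as a single request object handed to the generator. The sketch below is purely illustrative: the class name GenerationConditions, its field names, and the idea of storing motion as one strength value per frame-to-frame step are assumptions, not the paper's actual interface.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GenerationConditions:
    """Hypothetical bundle of the four controls described above."""
    class_label: str              # the label, e.g. "severe"
    description: str              # the free-text description
    reference_frame: List[float]  # flattened pixels of one real frame
    motion_per_step: List[float]  # assumed motion strength between consecutive frames

    def num_frames(self) -> int:
        # N motion steps connect N + 1 frames.
        return len(self.motion_per_step) + 1

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the request is usable."""
        issues = []
        if not self.class_label:
            issues.append("missing class label")
        if not self.description.strip():
            issues.append("missing text description")
        if not self.reference_frame:
            issues.append("missing reference frame")
        if not self.motion_per_step:
            issues.append("missing motion guide")
        elif any(m < 0 for m in self.motion_per_step):
            issues.append("motion strength must be non-negative")
        return issues

request = GenerationConditions(
    class_label="severe",
    description="The nodule has a smooth edge.",
    reference_frame=[0.1, 0.4, 0.4, 0.1],
    motion_per_step=[0.02, 0.03, 0.02],
)
incomplete = GenerationConditions("severe", "", [], [])
```

Keeping all four controls in one object makes it easy to spot a request that is missing a piece before any generation time is spent on it.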

3. The Safety Net: The "Quality Control Inspector" (The Filter)

Here is the tricky part: even the best AI sometimes hallucinates. It might generate a video that looks cool but is medical nonsense (e.g., a heart beating backward or a tumor in the wrong place). If you feed these "bad fake videos" to the student doctor, they will get confused and fail the exam.

Ctrl-GenAug has a built-in Inspector (the Noisy Synthetic Data Filter).

  • The Semantic Check: The Inspector asks, "Does this video actually look like the disease label we gave it?" If the AI said "Severe" but the video looks "Mild," the Inspector throws it in the trash.
  • The Motion Check: The Inspector checks if the movement is smooth. If the video flickers or jumps weirdly, it gets rejected.
  • The Variety Check: The Inspector makes sure we aren't just generating the exact same fake video 1,000 times. It ensures we have a diverse library of examples.
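The three checks can be sketched as a small filtering loop. This is a simplified illustration under stated assumptions, not the paper's filter: the thresholds, the brightness-based toy_classify stand-in for a real classifier, and the pixel-difference measures of motion and similarity are all hypothetical choices made to keep the example self-contained.

```python
from typing import Callable, Dict, List, Tuple

Frame = List[float]   # flattened pixel intensities in [0, 1]
Clip = List[Frame]    # a short sequence of frames

def mean_abs_diff(a: Frame, b: Frame) -> float:
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def max_frame_jump(clip: Clip) -> float:
    """Largest change between consecutive frames (a crude flicker measure)."""
    if len(clip) < 2:
        return 0.0
    return max(mean_abs_diff(clip[i], clip[i + 1]) for i in range(len(clip) - 1))

def clip_distance(a: Clip, b: Clip) -> float:
    """Average frame-by-frame difference between two clips."""
    return sum(mean_abs_diff(fa, fb) for fa, fb in zip(a, b)) / len(a)

def filter_synthetic(
    samples: List[Tuple[Clip, str]],
    classify: Callable[[Clip], Dict[str, float]],
    min_confidence: float = 0.7,   # assumed threshold
    max_jump: float = 0.3,         # assumed threshold
    min_distance: float = 0.05,    # assumed threshold
) -> List[Tuple[Clip, str]]:
    kept: List[Tuple[Clip, str]] = []
    for clip, label in samples:
        # Semantic check: does a classifier agree with the requested label?
        if classify(clip).get(label, 0.0) < min_confidence:
            continue
        # Motion check: reject clips that flicker or jump between frames.
        if max_frame_jump(clip) > max_jump:
            continue
        # Variety check: reject near-copies of clips we already kept.
        if any(clip_distance(clip, other) < min_distance for other, _ in kept):
            continue
        kept.append((clip, label))
    return kept

def toy_classify(clip: Clip) -> Dict[str, float]:
    """Stand-in classifier (assumption): brighter clips count as more 'severe'."""
    pixels = [p for frame in clip for p in frame]
    m = sum(pixels) / len(pixels)
    return {"severe": m, "mild": 1.0 - m}

good_severe = [[0.8, 0.8], [0.82, 0.82], [0.85, 0.85]]
mislabeled = [[0.2, 0.2], [0.2, 0.2], [0.2, 0.2]]
flicker = [[1.0, 1.0], [0.6, 0.6], [1.0, 1.0]]
duplicate = [list(frame) for frame in good_severe]
good_mild = [[0.1, 0.1], [0.12, 0.12], [0.15, 0.15]]

kept = filter_synthetic(
    [(good_severe, "severe"), (mislabeled, "severe"), (flicker, "severe"),
     (duplicate, "severe"), (good_mild, "mild")],
    toy_classify,
)
kept_labels = [label for _, label in kept]
```

In this toy run, the mislabeled clip fails the semantic check, the flickering clip fails the motion check, and the exact copy fails the variety check, so only the two good clips survive.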

4. The Result: A Super-Student

Once the "Imagination Machine" creates thousands of custom videos and the "Inspector" filters out the bad ones, the AI student gets a massive, diverse, and high-quality library to study.

The paper tested this on 5 different medical datasets (heart, lungs, thyroid, knee, and carotid arteries) and compared it against 15 other methods.

  • The Outcome: The AI students trained with Ctrl-GenAug became significantly better at diagnosing diseases.
  • The Special Win: They were especially good at spotting rare, high-risk diseases, which usually get overlooked because there are so few examples. They also held up when tested on out-of-domain data from different hospitals.

The Big Picture

Think of Ctrl-GenAug as a bridge. It takes the limited, expensive real-world medical data and builds a massive, safe, and diverse training ground on the other side. This allows AI to learn faster, spot rare diseases better, and become a more reliable assistant for real doctors, all while saving time and money on manual labeling.

In short: It's a tool that teaches AI to imagine rare medical scenarios realistically, checks its work to ensure it's not lying, and uses those imaginary scenarios to make real-world medical diagnosis more accurate.
