Weakly Supervised Concept Learning with Class-Level Priors for Interpretable Medical Diagnosis

This paper introduces Prior-guided Concept Predictor (PCP), a weakly supervised framework that leverages class-level concept priors and regularization to enable reliable, interpretable medical diagnosis without costly concept annotations, significantly outperforming zero-shot baselines while matching fully supervised models.

Md Nahiduzzaman, Steven Korevaar, Alireza Bab-Hadiashar, Ruwan Tennakoon

Published 2026-03-05

Imagine you are a doctor trying to diagnose a patient. You look at an X-ray or a skin mole and see a bunch of tiny details: "irregular edges," "dark spots," "white patches." In the medical world, these details are called concepts.

For years, AI has been great at looking at these images and saying, "That's cancer!" or "That's healthy!" But it's a black box. It gives you the answer, but it can't tell you why. It's like a student who gets the right answer on a math test but can't show their work. Doctors don't trust students who can't show their work, so they don't trust the AI either.

To fix this, researchers built "interpretable" AI. These are like students who must list the steps they took before giving the answer. But there's a huge problem: to teach the AI these steps, you need a human expert to label every single image with every single concept.

  • "This mole has an irregular border."
  • "This blood cell has a weird shape."
  • "This X-ray shows fluid."

Doing this for thousands of images is like asking a librarian to read every book in the library and write a summary for every single page. It takes too long, costs too much money, and experts get tired.
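The "show your work" design described above is usually called a concept bottleneck: the model first scores a list of human-readable concepts, and the final diagnosis is a simple, inspectable function of those scores. Here is a minimal sketch of that idea; the concept names, weights, and threshold are purely illustrative, not taken from the paper.

```python
import numpy as np

# A concept-bottleneck pipeline in miniature: image -> concept scores -> diagnosis.
# The final step is deliberately simple (a weighted vote), so a doctor can see
# exactly which concepts drove the decision. All numbers here are made up.
concepts = ["irregular_border", "dark_spots", "white_patches"]

def diagnose(concept_scores: np.ndarray) -> str:
    """Transparent final step: a weighted vote over concept scores."""
    weights = np.array([1.5, 1.0, 0.8])  # hypothetical per-concept importance
    risk = float(concept_scores @ weights)
    return "suspicious" if risk > 1.5 else "likely benign"

# Concept scores would come from an image encoder; here they are hand-set.
scores = np.array([0.9, 0.8, 0.2])
print(diagnose(scores))  # prints "suspicious"
```

Because the bottleneck exposes the concept scores, the model's "reasoning" ("I see an irregular border") is readable, but training that first stage is exactly what normally requires the expensive per-image concept labels.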

The New Idea: The "Gut Feeling" AI (PCP)

This paper introduces a new method called PCP (Prior-guided Concept Predictor). Think of it as teaching the AI using general rules instead of specific homework.

Here is the analogy:

1. The Old Way (Full Supervision)

Imagine you are training a new chef. You give them a photo of a specific pizza and say: "This pizza has pepperoni, extra cheese, and burnt crust." You do this for 1,000 different pizzas. The chef learns perfectly but takes forever to train.

2. The "Zero-Shot" Way (The Current Trend)

Some researchers tried to skip the training entirely. They gave the AI a giant encyclopedia (a huge language model) and said, "Just guess what's on the pizza based on your general knowledge."

  • The Problem: The AI knows what a "pepperoni pizza" looks like in general, but it lacks the fine-grained, domain-specific knowledge that medical concepts demand. In a skin image, the medical equivalent is mistaking one subtle feature for another. Its zero-shot guesses are too vague and make mistakes.

3. The PCP Way (Weak Supervision with Priors)

This is the paper's breakthrough. Instead of labeling every single image, you give the AI a rulebook (the "Priors").

  • The Rulebook: "If a patient has Melanoma (skin cancer), there is a 90% chance they have an 'irregular border' and a 70% chance they have 'blue-white patches'."
  • The Training: You don't tell the AI exactly what is in this specific photo. You just say, "Based on the fact that this photo looks like a Melanoma case, check if it has those features."

The AI looks at the image, guesses the features, and then checks its own guess against the rulebook.
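Concretely, the "rulebook" can be pictured as a small table of per-class concept probabilities. An image's class label then indexes into that table, and the matching row becomes the soft training target for the concept predictor; no per-image concept annotation is needed. The classes, concepts, and probabilities below are illustrative, not the paper's actual priors.

```python
import numpy as np

# Hypothetical class-level concept priors ("the rulebook"):
# rows = diagnosis classes, columns = concepts.
# Entry [c, k] = probability that concept k is present given class c.
concepts = ["irregular_border", "blue_white_patches", "dark_spots"]
classes = ["melanoma", "benign_nevus"]
prior = np.array([
    [0.90, 0.70, 0.60],   # melanoma
    [0.15, 0.05, 0.30],   # benign nevus
])

def weak_concept_targets(class_label: int) -> np.ndarray:
    """Look up soft concept targets for an image from its class label alone.

    This is the weak-supervision trick: the class label (cheap) stands in
    for per-image concept labels (expensive) via the prior table.
    """
    return prior[class_label]

# An image labeled "melanoma" gets soft concept targets from row 0.
print(weak_concept_targets(0))  # prints [0.9 0.7 0.6]
```

The expert effort collapses to filling in one small table per dataset, instead of annotating every concept in every image.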

  • The "Refinement" (The Magic Sauce): The AI has a "self-correcting" mechanism.
    • KL Divergence: This is like a "Reality Check." If the AI thinks a cancer case never has a specific feature, but the rulebook says it usually does, the AI gets a gentle nudge to adjust its thinking.
    • Entropy Regularization: This is like a "Focus Filter." It stops the AI from saying, "Maybe it's a little bit of everything." It forces the AI to be decisive: "It's definitely this feature, and definitely not that one."
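The two regularizers above can be sketched as terms in a loss function. This is a simplified reading of the idea, treating each concept as an independent Bernoulli variable: a KL term pulls the model's predicted concept probabilities toward the class prior (the "reality check"), and an entropy term pushes each prediction toward 0 or 1 (the "focus filter"). The function name, weights, and exact formulation are assumptions, not the paper's equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pcp_style_loss(concept_logits, prior_row, kl_weight=1.0, ent_weight=0.1):
    """Sketch of a prior-guided concept loss with two regularizers.

    concept_logits: raw model outputs, one Bernoulli logit per concept.
    prior_row: class-level prior probabilities for this image's class.
    Weights and form are illustrative, not the paper's exact loss.
    """
    eps = 1e-7
    p = np.clip(sigmoid(concept_logits), eps, 1 - eps)  # predicted probs
    q = np.clip(prior_row, eps, 1 - eps)                # prior probs

    # "Reality check": per-concept Bernoulli KL(prior || prediction).
    # Zero when predictions match the rulebook exactly.
    kl = np.mean(q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p)))

    # "Focus filter": binary entropy of the predictions; minimizing it
    # pushes each concept probability toward a decisive 0 or 1.
    entropy = -np.mean(p * np.log(p) + (1 - p) * np.log(1 - p))

    return kl_weight * kl + ent_weight * entropy

logits = np.array([2.0, 0.5, -1.0])
prior_row = np.array([0.90, 0.70, 0.60])
loss = pcp_style_loss(logits, prior_row)  # non-negative scalar
```

Both terms are non-negative, and the KL term vanishes when the model's average concept predictions agree with the prior, which is exactly the "gentle nudge" behavior described above.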

Why This is a Big Deal

  1. It's Cheaper and Faster: You don't need to hire experts to label thousands of images. You just need them to write down the general rules (the "Priors") once. It's like writing a recipe once instead of cooking the dish 1,000 times to prove it works.
  2. It's Trustworthy: The AI still explains its reasoning ("I think this is cancer because I see an irregular border"), but it learned to see those borders without being explicitly told to look for them in every single photo.
  3. It Works Better Than Guessing: The researchers tested this on skin images (dermoscopy) and blood cell images. PCP's concept predictions were more than 33% more accurate than those of the zero-shot AI that relied only on general knowledge.

The Bottom Line

The authors built a system that teaches AI to be a medical detective without needing a human to point out every clue in every single case. Instead, they gave the AI a map of the territory (the class-level priors) and let it learn to find the clues on its own, while gently correcting itself to stay on track.

This means we can get AI that doctors can trust, explain, and use in real hospitals, without waiting years to collect millions of expensive, hand-labeled images. It's a smarter, faster way to teach machines how to "think" like a doctor.