ExDD: Explicit Dual Distribution Learning for Surface Defect Detection via Diffusion Synthesis

Imagine you are a quality control inspector at a factory that makes copper pipes and marble tiles. Your job is to spot tiny scratches, dents, or weird spots on the products before they get shipped out.

The Problem: The "Perfect" Inspector Who Only Knows "Normal"

Traditionally, computers were taught to do this job using a method called One-Class Anomaly Detection. Think of this like training a security guard who has only ever seen perfect, flawless products.

The guard memorizes what a "perfect" pipe looks like. If they see something that doesn't look exactly like the perfect pipe, they scream, "DEFECT!"

The Flaw: This works okay for random noise, but industrial defects are often specific. A scratch looks like a scratch; a dent looks like a dent. They aren't just "random weirdness"; they have their own specific shapes and patterns.
The Data Scarcity: The real problem is that in a factory, defects are rare. You might have 1,000 perfect pipes but only 5 broken ones. Five examples are far too few to teach the guard what a "scratch" looks like.

The Solution: ExDD (Explicit Dual Distribution)

The authors of this paper, Muhammad Aqeel and his team, created a new system called ExDD. Instead of just memorizing "perfect," they teach the computer to understand two distinct worlds: Normal and Defective.

Here is how they did it, using some fun analogies:

1. The Two Filing Cabinets (Dual Memory Banks)

Instead of one big brain, ExDD uses two separate filing cabinets (Memory Banks):

The "Normal" Cabinet: Filled with photos of perfect pipes and tiles.
The "Defect" Cabinet: Filled with photos of scratches, dents, and spots.

Why is this cool? Old methods only had the "Normal" cabinet. If a faint scratch showed up, the guard might miss it because it didn't look "different enough" from a normal surface. By having a "Defect" cabinet, the computer can say, "Hey, this looks a lot like the scratches in the Defect cabinet, and very little like anything in the Normal cabinet. It's a defect!"

2. The Magic Art Generator (Diffusion Synthesis)

Here is the tricky part: The "Defect" cabinet is nearly empty at the start because real defective samples are rare. You can't fill a cabinet with only 5 photos.

To solve this, the team used a Latent Diffusion Model (think of it as a super-smart AI artist, like DALL-E or Midjourney).

The Trick: They give the AI a picture of a perfect pipe and a text prompt like "copper metal scratch" or "white mark on the wall."
The Result: The AI generates new, fake images of scratches that look incredibly real and fit perfectly into the factory context.
The Benefit: They can now fill the "Defect" cabinet with hundreds of high-quality examples, even though they only had a few real ones to start with. It's like having a photocopier that can create infinite variations of a scratch so the inspector learns every possible way a scratch can look.

3. The "Ratio" Score (The Final Decision)

When the system checks a new product, it doesn't just ask, "Is this weird?" It asks two questions and compares the answers:

How far is this from "Normal"? (Distance to the Normal Cabinet)
How close is this to "Defect"? (Distance to the Defect Cabinet)

The system calculates a Ratio:

If the item is FAR from Normal AND CLOSE to Defect, it's a definite defect!

This is much smarter than just looking for "weirdness." It's like a detective who doesn't just look for suspects who act strangely, but specifically looks for people who match the profile of the criminal and don't match the profile of an innocent bystander.

The Results

They tested this on a real industrial dataset (KSDD2).

The Old Way: Good at spotting big problems, but missed subtle scratches.
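ExDD (with the AI-generated defects): Scored 94.2% on telling defective products from good ones and 97.7% on pinpointing where the defects were on the product (both measured as AUROC, a ranking score, not a count of defects caught).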

The Takeaway

The paper teaches us that in the world of industrial quality control, you don't need to wait for a million broken products to learn what they look like. Instead, you can use AI to imagine the broken ones, teach the computer to recognize the specific patterns of "broken" versus "perfect," and build a system that is far more accurate and reliable.

It's the difference between a guard who only knows what a "good" day looks like, and a detective who knows exactly what a "bad" day looks like, too.

1. Problem Statement

Industrial surface defect detection faces three primary challenges:

Data Scarcity: Defective samples are rare in production lines, making supervised learning difficult. Consequently, the field has shifted toward one-class anomaly detection, which trains exclusively on normal data.
Flawed Assumptions: Traditional one-class methods assume that anomalies are uniformly distributed outliers in the complement space of normal data. However, real-world industrial defects (e.g., scratches, cracks) often form distinct, structured feature distributions rather than random noise.
Synthetic Data Limitations: Existing synthetic data generation methods (e.g., GANs, random noise) often produce out-of-distribution artifacts that do not align with real defect patterns, leading to poor feature learning and suboptimal detection.

2. Methodology: The ExDD Framework

The authors propose ExDD (Explicit Dual Distribution), a unified framework that moves beyond one-class paradigms by explicitly modeling both normal and anomalous feature distributions. The framework consists of three core components:

A. Dual Memory Bank Architecture

Instead of a single memory bank for normal features, ExDD utilizes two parallel memory banks:

Negative Memory Bank ( $M_N$ ): Stores patch-level features extracted from nominal (defect-free) images.
Positive Memory Bank ( $M_P$ ): Stores patch-level features extracted from anomalous images.
- Key Innovation: The positive bank is populated using diffusion-synthesized defects rather than relying solely on scarce real defects. This allows the model to learn the specific statistical properties of the defect distribution.
- Feature Extraction: Uses a pre-trained ResNet backbone (WideResNet50) to extract features from layers 2 and 3. Features are aggregated using locally aware patch descriptors (3x3 windows) and concatenated.
- Dimensionality Reduction: High-dimensional features (1536 channels) are reduced to 128 dimensions via random projection (justified by the Johnson-Lindenstrauss lemma) and then compressed using greedy coreset subsampling (2% for the normal bank, 10% for the defect bank to preserve diversity).
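
The reduction step above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the function names `random_projection` and `greedy_coreset` are hypothetical, the patch features are random stand-ins for real backbone outputs, and the coreset uses a plain k-center greedy rule as an assumed form of "greedy coreset subsampling".

```python
import numpy as np


def random_projection(features, out_dim=128, seed=0):
    """Project (n, d) features to (n, out_dim) with a Gaussian random matrix."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    # Scaling by 1/sqrt(out_dim) preserves squared distances in expectation.
    proj = rng.standard_normal((d, out_dim)) / np.sqrt(out_dim)
    return features @ proj


def greedy_coreset(features, ratio, seed=0):
    """Return indices of a k-center greedy coreset of `features`."""
    n = features.shape[0]
    k = max(1, int(round(n * ratio)))
    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))
    selected = [first]
    # min_dist[i] = distance from point i to its nearest selected center.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    while len(selected) < k:
        idx = int(np.argmax(min_dist))  # point farthest from all centers
        selected.append(idx)
        new_dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(selected)


# Toy stand-ins for 1536-channel patch features (assumption: real features
# would come from the pre-trained backbone).
rng = np.random.default_rng(42)
normal_patches = rng.standard_normal((1000, 1536))
defect_patches = rng.standard_normal((200, 1536)) + 0.5

# The same seed gives the same projection matrix for both banks, so their
# distances stay comparable.
normal_low = random_projection(normal_patches)
defect_low = random_projection(defect_patches)

negative_bank = normal_low[greedy_coreset(normal_low, ratio=0.02)]
positive_bank = defect_low[greedy_coreset(defect_low, ratio=0.10)]
print(negative_bank.shape, positive_bank.shape)  # (20, 128) (20, 128)
```

Keeping a larger fraction of the defect features reflects the point made above: the defect set is smaller to begin with, so aggressive subsampling would discard too much of its variety.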

B. Diffusion-Based Synthetic Anomaly Generation

To overcome data scarcity and ensure the positive memory bank contains realistic defects:

Model: Utilizes Latent Diffusion Models (LDMs), specifically Stable Diffusion XL.
Process: The system employs text-conditional inpainting. Given a normal image, a binary mask (derived from real defect locations or generated), and a domain-specific text prompt (e.g., "copper metal scratches"), the model generates synthetic defects that preserve the industrial context and geometric fidelity.
Integration: These synthetic images are treated as ground-truth anomalies, allowing the system to populate the Positive Memory Bank with diverse, in-distribution defect patterns.
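
A rough sketch of the generation loop follows. It assumes `pipe` is any callable that follows the common inpainting convention `pipe(prompt=..., image=..., mask_image=...)` and returns an object with an `.images` list (the inpainting pipelines in Hugging Face `diffusers`, such as `AutoPipelineForInpainting`, work this way). The helpers `random_box_mask` and `synthesize_defects` are hypothetical names, the rectangular masks are an assumed stand-in for "generated" masks, and `EchoPipe` is a placeholder so the example runs without downloading a model.

```python
from types import SimpleNamespace

import numpy as np


def random_box_mask(height, width, rng, min_frac=0.05, max_frac=0.25):
    """Return a uint8 mask (255 = region to inpaint) with one random rectangle."""
    def side(length):
        low = max(1, int(length * min_frac))
        high = max(low + 1, int(length * max_frac))
        return int(rng.integers(low, high))  # upper bound is exclusive

    box_h, box_w = side(height), side(width)
    top = int(rng.integers(0, height - box_h + 1))
    left = int(rng.integers(0, width - box_w + 1))
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[top:top + box_h, left:left + box_w] = 255
    return mask


def synthesize_defects(pipe, normal_image, masks, prompts, per_prompt):
    """Run text-conditional inpainting `per_prompt` times for each prompt."""
    samples = []
    for prompt in prompts:
        for i in range(per_prompt):
            mask = masks[i % len(masks)]  # cycle through the available masks
            result = pipe(prompt=prompt, image=normal_image, mask_image=mask)
            samples.append({"prompt": prompt, "mask": mask, "image": result.images[0]})
    return samples


class EchoPipe:
    """Placeholder pipeline: returns the input image unchanged."""

    def __call__(self, prompt, image, mask_image):
        return SimpleNamespace(images=[image])


rng = np.random.default_rng(0)
normal_image = np.zeros((64, 64, 3), dtype=np.uint8)  # stand-in for a real photo
masks = [random_box_mask(64, 64, rng) for _ in range(10)]
prompts = ["copper metal scratches", "white mark on the wall"]

samples = synthesize_defects(EchoPipe(), normal_image, masks, prompts, per_prompt=50)
print(len(samples))  # 100
```

With a real pipeline, the image and mask would need to be converted to whatever input type that pipeline expects, and each returned image together with its mask would supply the patches for the Positive Memory Bank.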

C. Neighborhood-Aware Ratio Scoring

The detection mechanism fuses information from both memory banks to create a robust decision boundary:

Distance Metrics:
- $s_N^*$ : Minimum Euclidean distance from a test patch to the Negative memory bank (deviation from normality).
- $s_P^*$ : Minimum Euclidean distance from a test patch to the Positive memory bank (similarity to known defects).
Weighting: A neighborhood-aware weighting mechanism adjusts scores based on local density. If a test patch is far from normal neighbors but close to defect neighbors, the signal is amplified.
Ratio Score: The final anomaly score is calculated as a ratio:
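$s_{ratio} = \frac{s_N}{s_P + \epsilon}$
where $s_N$ and $s_P$ are the neighborhood-weighted versions of $s_N^*$ and $s_P^*$, and $\epsilon$ is a small constant that avoids division by zero.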
- Logic: A high ratio indicates the patch is dissimilar to normal patterns (high numerator) and similar to defect patterns (low denominator), effectively suppressing false positives caused by normal texture variations.
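
The ratio logic can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: it uses the raw minimum distances and leaves out the neighborhood-aware weighting, whose exact form is not given here, and the function names and the two-dimensional toy banks are hypothetical.

```python
import numpy as np


def min_distances(patches, bank):
    """Euclidean distance from each patch to its nearest neighbor in `bank`."""
    # Pairwise differences, shape (n_patches, n_bank, dim).
    diffs = patches[:, None, :] - bank[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


def ratio_score(patches, negative_bank, positive_bank, eps=1e-6):
    """Unweighted ratio score: distance to normal over distance to defect."""
    s_n = min_distances(patches, negative_bank)  # deviation from normality
    s_p = min_distances(patches, positive_bank)  # closeness to known defects
    return s_n / (s_p + eps)


# Toy 2-D memory banks (real banks would hold 128-dimensional patch features).
negative_bank = np.array([[0.0, 0.0], [1.0, 0.0]])
positive_bank = np.array([[5.0, 5.0]])

test_patches = np.array([
    [0.1, 0.0],  # sits next to a normal entry
    [4.9, 5.0],  # sits next to the defect entry
])
scores = ratio_score(test_patches, negative_bank, positive_bank)
print(scores)  # the second score is far larger than the first
```

A patch that is merely unusual, far from both banks, gets a moderate ratio, which is how the second distance helps suppress false positives from normal texture variation.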

3. Key Contributions

Dual Distribution Learning: Formalizes defect detection as a separation problem between two explicit feature distributions (normal vs. defect) rather than a one-class outlier problem.
Diffusion-Augmented Training: Introduces a text-conditional LDM pipeline that generates in-distribution synthetic defects, bridging the gap between generation and detection.
Ratio Scoring Mechanism: Proposes a novel scoring metric that leverages the dual memory structure to amplify true anomaly signals while suppressing false positives.

4. Experimental Results

The framework was evaluated on the KSDD2 dataset (industrial surface defects like scratches and spots).

Performance Metrics:
- Image-level AUROC (I-AUROC): 94.2% (State-of-the-art).
- Pixel-level AUROC (P-AUROC): 97.7% (State-of-the-art).
Comparisons:
- Outperforms PatchCore (91.2% I-AUROC, 95.8% P-AUROC) by 3.0 and 1.9 percentage points, respectively.
- Significantly surpasses DRAEM and DSR, particularly in pixel-wise localization (DRAEM achieved only 42.4% P-AUROC).
- Matches or exceeds IRP in image-level detection and additionally provides pixel-level localization, which IRP lacks.
Ablation Study (Augmentation):
- Performance peaks with 100 synthetic samples (50 per prompt).
- Increasing samples beyond 100 yields diminishing returns or slight performance drops, indicating an optimal balance between diversity and distribution shift.
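
For readers unfamiliar with the metrics above, AUROC is a ranking measure: the probability that a randomly chosen defective item receives a higher anomaly score than a randomly chosen normal one. It is not the fraction of defects caught. The sketch below computes it from pairwise comparisons; the `auroc` helper and the toy scores are illustrative only. Image-level AUROC applies this to one score per image, pixel-level AUROC to one score per pixel.

```python
import numpy as np


def auroc(scores, labels):
    """AUROC as the fraction of (defect, normal) pairs ranked correctly."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    pos = scores[labels]   # scores of defective items
    neg = scores[~labels]  # scores of normal items
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count as half
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Image level: one anomaly score and one label per image.
image_scores = [0.2, 0.4, 0.9, 0.35]
image_labels = [0, 0, 1, 1]
print(auroc(image_scores, image_labels))  # 0.75

# Pixel level: one score per pixel, compared against the defect mask.
pixel_scores = np.array([[0.1, 0.8], [0.2, 0.9]])
pixel_mask = np.array([[0, 1], [0, 1]])
print(auroc(pixel_scores, pixel_mask))  # 1.0
```

The pairwise formulation is quadratic in the number of items, which is fine for a toy example; on full-resolution anomaly maps a sort-based routine such as scikit-learn's `roc_auc_score` is the practical choice.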

5. Significance

Paradigm Shift: ExDD challenges the prevailing one-class anomaly detection paradigm by showing empirically, on KSDD2, that explicitly modeling the defect distribution leads to better separability.
Practical Utility: By generating realistic, context-aware synthetic defects, the method solves the critical bottleneck of data scarcity in industrial settings without requiring extensive manual annotation of rare defects.
Precision: The ratio scoring mechanism provides highly precise localization, which is crucial for automated quality control systems where false positives can halt production lines unnecessarily.
Future Direction: The work establishes a foundation for adaptive memory dynamics and uncertainty quantification in data-constrained industrial environments.