Geometrically Constrained Outlier Synthesis

This paper introduces Geometrically Constrained Outlier Synthesis (GCOS), a training-time framework that generates virtual outliers in feature space. By respecting the structure of the in-distribution manifold and sampling within conformal shells, GCOS improves the robustness of out-of-distribution detection and provides formal error guarantees.

Daniil Karzanov, Marcin Detyniecki

Published 2026-03-10

Imagine you are teaching a robot to recognize different breeds of dogs. You show it thousands of pictures of Golden Retrievers, Poodles, and Beagles. The robot gets really good at this. But then, you show it a picture of a Wolf.

A standard AI might look at the Wolf, squint, and confidently say, "That's a very fluffy Poodle!" It's overconfident because it was never taught what a "non-dog" looks like. It only knows what is a dog, not what isn't.

This paper introduces a new training method called GCOS (Geometrically Constrained Outlier Synthesis) to fix this. It teaches the AI to say, "Wait a minute, that doesn't look like any dog I know," with much higher accuracy.

Here is how it works, explained with simple analogies:

1. The Problem: The "Safe Zone" Trap

Imagine the robot's brain creates a map of "Dog Land." All the pictures of Golden Retrievers form a big, cozy campfire in the middle of the map. The robot knows that if you are near the campfire, you are a dog.

The problem is that the robot doesn't know where the edge of the map is. If you show it a Wolf standing just outside the campfire, it might still think, "Close enough, that's a dog!" It needs to learn where the "Safe Zone" ends and the "Unknown Zone" begins.

2. The Old Way: Throwing Darts Blindly

Previous methods (like VOS) tried to teach the robot by generating fake "outliers." Rather than editing pictures, they worked inside the robot's brain: they modeled the dog features as a fuzzy cloud and randomly sampled points near its faint edges, hoping to land on something that looks weird.

The Flaw: It's like trying to teach someone the edge of a forest by throwing darts blindly into the sky. Sometimes you hit a tree (a real dog), and sometimes you hit a cloud (something too weird to be useful). The robot gets confused because the fake examples aren't realistic enough, or they are too easy to spot.
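The "darts" problem can be shown in a few lines. This is a toy sketch (not the paper's or VOS's actual code): we perturb an oval-shaped cloud of "dog" features with shape-blind isotropic noise and measure how strange the resulting fakes are. The spread is enormous, so some fakes hit trees and others hit clouds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dog" features: an elongated 2-D Gaussian cloud (an oval, not a circle).
features = rng.multivariate_normal(
    mean=[0.0, 0.0], cov=[[9.0, 0.0], [0.0, 0.25]], size=500
)

# Naive outlier synthesis: add isotropic noise, ignoring the data's shape.
noise = rng.normal(scale=3.0, size=features.shape)
fake_outliers = features + noise

# Mahalanobis distance is our "strangeness meter" relative to the oval.
# Blind noise produces a huge spread: some fakes are indistinguishable
# from real data, others are trivially far away.
cov_inv = np.linalg.inv(np.cov(features.T))
d = np.sqrt(np.einsum("ij,jk,ik->i", fake_outliers, cov_inv, fake_outliers))
print(f"strangeness spread: {d.min():.2f} .. {d.max():.2f}")
```

The wide min-to-max spread is exactly the flaw: there is no control over where the darts land.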

3. The New Way (GCOS): The "Geometric Map" Approach

GCOS is smarter. Instead of guessing, it looks at the shape of the data.

  • Step 1: Finding the "Quiet Corners"
Imagine the "Dog Land" campfire isn't a perfect circle; it's an oval. The robot uses a mathematical tool (Principal Component Analysis) to find the directions in which the data spreads out (the long axes of the oval) and the directions in which it barely varies at all (the quiet, sparse edges).

    • Analogy: Think of a crowded dance floor. The "main directions" are where everyone is dancing. The "quiet corners" are the empty spaces near the walls. GCOS decides to generate fake outliers in those empty corners.
  • Step 2: The "Goldilocks" Shell
    The robot needs to generate fake outliers that are just right.

    • If they are too close to the real dogs, the robot can't tell them apart.
    • If they are too far away, the robot will trivially say, "That's not a dog!" and learn nothing from them.
    • The Solution: GCOS uses a "Conformal Shell." Imagine a protective bubble around the campfire. The robot generates fake dogs inside this bubble, but right near the edge.
    • Analogy: It's like a coach standing right at the edge of the playing field, tossing balls just over the line. The players (the AI) have to learn exactly where the line is, not by guessing, but by practicing right on the boundary.
  • Step 3: The "Strangeness" Test
    The robot uses a special score (like a "strangeness meter") to check these fake outliers. It adjusts the distance until the fake outlier is "strange enough" to be an outlier, but "close enough" to be a challenge. This ensures the robot learns a smooth, precise boundary around the real data.
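The three steps above can be sketched in code. This is an illustrative toy version, not the authors' implementation: PCA finds the quiet (low-variance) direction, the Mahalanobis distance plays the role of the "strangeness meter," and the shell radius is picked from the data itself (the 95th percentile of in-distribution scores, an assumed choice) so fakes land just past the boundary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy in-distribution features: an oval cloud.
features = rng.multivariate_normal([0.0, 0.0], [[9.0, 0.0], [0.0, 0.25]], size=500)

# Step 1: PCA via the eigendecomposition of the covariance matrix.
# The smallest-eigenvalue direction is the "quiet corner."
mean = features.mean(axis=0)
cov = np.cov((features - mean).T)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
quiet_dir = eigvecs[:, 0]                # lowest-variance direction

# Step 2: the "Goldilocks" shell. Strangeness = Mahalanobis distance;
# the shell radius is the 95th percentile of in-distribution strangeness.
cov_inv = np.linalg.inv(cov)

def strangeness(x):
    diff = x - mean
    return np.sqrt(np.einsum("...j,jk,...k->...", diff, cov_inv, diff))

shell_radius = np.quantile(strangeness(features), 0.95)

# Step 3: place fakes along the quiet direction, scaled so their
# strangeness lands just past the shell (here: 1.1x the radius).
target = 1.1 * shell_radius
step = target * np.sqrt(eigvals[0])      # Mahalanobis -> Euclidean scale
signs = rng.choice([-1.0, 1.0], size=100)
outliers = mean + np.outer(signs * step, quiet_dir)

print(f"shell radius: {shell_radius:.2f}")
print(f"outlier strangeness: {strangeness(outliers).mean():.2f}")
```

Unlike the blind-darts approach, every synthetic outlier here sits at a controlled strangeness: just outside the shell, in the sparse directions where real dogs never go.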

4. Why This Matters: The "Near-Miss" Challenge

Most AI tests use "Far-Outliers" (e.g., showing a cat to a dog classifier). That's easy. The hard part is "Near-Outliers" (e.g., showing a Wolf to a dog classifier).

GCOS shines here. Because it builds the boundary based on the actual shape of the dog data, it can tell the difference between a Golden Retriever and a Wolf much better than older methods. It doesn't just memorize; it understands the geometry of what a dog looks like.

5. The "Statistical Guarantee" Bonus

The paper also mentions a cool side feature. Usually, AI says, "I'm 90% sure this is a dog." But what if it's wrong?
GCOS can translate that confidence into a statistical guarantee. It's like a weather forecast that says, "There is a 95% chance of rain, and we promise that if we say 'rain' 100 times, it will actually rain 95 of those times." This makes the AI much more trustworthy for critical jobs, like medical diagnosis.
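The flavor of that guarantee can be shown with a toy split-conformal calibration (the paper's exact procedure may differ). We set a strangeness threshold on held-out in-distribution data so that, on average, at most 5% of genuine in-distribution points get flagged as outliers; the `(n+1)(1-alpha)` quantile is the standard conformal recipe.

```python
import numpy as np

rng = np.random.default_rng(1)

# Held-out calibration scores and fresh test scores,
# both drawn from the same in-distribution "strangeness" distribution.
calib_scores = rng.normal(size=1000)
test_scores = rng.normal(size=10000)

alpha = 0.05
n = len(calib_scores)
# Conformal threshold: the ceil((n+1)(1-alpha))-th smallest calibration score.
k = int(np.ceil((n + 1) * (1 - alpha)))
threshold = np.sort(calib_scores)[k - 1]

# On new in-distribution data, the false-alarm rate is at most alpha
# on average, no matter what the score distribution is.
flag_rate = np.mean(test_scores > threshold)
print(f"false-alarm rate: {flag_rate:.3f}  (target: <= {alpha} on average)")
```

This is the "95 rains out of 100 forecasts" promise from the weather analogy: the guarantee holds on average without assuming anything about the shape of the score distribution.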

Summary

  • Old AI: "I see a dog shape, so I'll guess it's a dog." (Overconfident).
  • GCOS AI: "I know exactly where the 'dog shape' ends. This new thing is just outside that line, so I will flag it as unknown." (Cautious and accurate).

By generating smart, geometrically precise "fake weirdos" during training, GCOS teaches the AI to respect the boundaries of its own knowledge, making it safer and more reliable in the real world.