Provably Safe Generative Sampling with Constricting Barrier Functions

The Big Problem: The "Wild Artist"

Imagine you have a brilliant, hyper-talented artist (the Generative Model, like a Diffusion model). This artist can paint incredibly realistic scenes, design complex molecules, or plan robot movements just by looking at a picture of static noise and slowly refining it into a masterpiece.

However, this artist has a flaw: they don't know the rules.

If you ask them to draw a car, they might draw one with wheels made of jelly (violating physics).
If you ask them to generate a robot's movement, they might tell the robot to spin its arm 360 degrees instantly (breaking the robot's motors).
If you ask for a picture of a bedroom, they might accidentally draw a window where a wall should be.

Current methods to fix this are like giving the artist a gentle suggestion: "Hey, maybe don't use jelly wheels?" The artist listens, but they might still mess up. Other methods are like taking the finished painting and forcibly painting over the mistakes with a thick brush. This fixes the error, but it ruins the texture and makes the painting look fake.

The Solution: The "Constricting Safety Tube"

The authors propose a new way to guide the artist. Instead of shouting instructions at the end or giving vague hints at the start, they build a Safety Tube around the artist's creative process.

Think of the artist's process like baking a cake:

The Start (High Noise): You start with a bowl of chaotic, unidentifiable ingredients (flour, eggs, sugar mixed randomly). At this stage, the cake has no shape.
The Middle: The batter starts to take form.
The End (Low Noise): The cake is fully baked and ready to eat.

The authors' method works like a collapsible mold that fits around the cake batter as it bakes:

At the beginning (The Chaos Phase): The mold is huge and loose. It doesn't care if the batter is messy or in the wrong spot. This is important because the artist needs total freedom to decide the "big picture" (the shape of the cake, the general style). If you forced the batter into a tight shape too early, you'd ruin the texture.
During the process: As the batter settles and the cake starts to rise, the mold slowly shrinks. It gently nudges the batter toward the center.
At the end (The Final Form): The mold has shrunk down to the exact size of the perfect cake. By the time the cake is done, it is guaranteed to be inside the safe zone.

How It Works: The "Gentle Nudge"

The paper uses a mathematical tool called a Control Barrier Function (CBF). In our analogy, this is the smart mold.

It Cooperates, Not Overrides: The mold doesn't force the artist to stop painting. It only steps in when the artist is about to step outside the tube.
It's Cheaper to Fix Early: The paper makes a brilliant observation: It is much easier to fix a mistake when the image is just "noise" (the beginning) than when it is a detailed photo (the end).
- Analogy: If you are drawing a face and you accidentally put the eyes on the forehead, it's easy to erase and move them while the drawing is still a rough sketch. But if you've already colored the whole face and added shading, moving the eyes now would ruin the whole picture.
- The authors' method does the "heavy lifting" of safety enforcement when the image is still just noise (cheap to fix) and lets the artist do the fine details (expensive to fix) on their own.
The Math Magic (The QP): At every tiny step of the drawing process, the computer solves a quick math puzzle (a Quadratic Program). This puzzle asks: "What is the absolute smallest nudge I can give the artist to keep them inside the tube?" Because each nudge is as small as possible, the final image stays as close as it can to what the artist intended while still obeying the safety rules.

Real-World Examples from the Paper

The authors tested this on three very different things:

Physics (The Lorenz System):
- The Task: Generate a path for a chaotic weather system.
- The Problem: The artist's random guesses often broke the laws of physics (e.g., the wind blowing uphill).
- The Result: The "Safety Tube" guided the path so that it followed the laws of physics perfectly, even though the artist started with random noise.
Images (Bedrooms):
- The Task: Generate a bedroom image, but force a specific window to appear in a specific spot.
- The Problem: Old methods would either ignore the window or paint a black rectangle over the whole bottom of the image (ruining the furniture).
- The Result: The "Safety Tube" ensured the window appeared exactly where requested, but the rest of the room (bed, lamps, lighting) looked natural and beautiful.
Robotics (Pushing a Block):
- The Task: Tell a robot arm how to push a block.
- The Problem: The robot's plan was jerky and would have broken its motors (too much speed change).
- The Result: The "Safety Tube" smoothed out the robot's movements. The robot pushed the block successfully without shaking or breaking, all while keeping the original plan's goal.

Why This Matters

This paper is a game-changer because it allows us to use powerful AI models in safety-critical situations (like self-driving cars or medical devices) without having to retrain the AI or make it less smart.

Old Way: "Don't do that!" (The AI ignores you).
Old Way 2: "Fix it after you're done!" (The AI looks broken).
New Way: "I'm holding a safety net that gets tighter as you work, so you can't fall, but you can still fly."

The result is a system whose outputs are mathematically guaranteed to be safe, while staying as faithful as possible to the original AI's style.

1. Problem Statement

Flow-based generative models (e.g., Diffusion Models, Flow Matching) have achieved state-of-the-art performance in learning complex data distributions. However, their deployment in safety-critical domains (robotics, autonomous navigation, medical imaging) is hindered by a lack of formal guarantees that generated samples will satisfy hard constraints.

Limitations of Existing Methods:
- Soft Guidance (Classifier/Reward-based): These act as probabilistic incentives. They bias the model toward desired regions but cannot guarantee that a sample will strictly satisfy constraints (e.g., collision avoidance, physical laws).
- Projection-based Methods: These project samples onto a safe manifold after generation. While they offer safety, they often introduce significant distributional shifts (altering the semantic content of the image or trajectory) and incur high computational overhead.
The Gap: There is a need for a framework that enforces hard constraints (100% satisfaction) while minimizing the distributional shift from the original pre-trained model, without requiring retraining or architectural changes.

2. Methodology

The authors propose a safety filtering framework that acts as an online "shield" for any pre-trained flow-based generative model. The core insight is to cooperate with the generative process rather than overriding it, utilizing Control Barrier Functions (CBFs).

A. Constricting Safety Tube

The method introduces a time-varying "safety tube" $\tilde{C}(t)$ that evolves alongside the sampling process (which runs in reverse time from $t=T$ to $t=0$):

Initial State ($t=T$): The tube is highly relaxed to encompass the initial noise distribution (Gaussian), acknowledging that the raw noise likely violates constraints.
Progression: As sampling proceeds ($t \to 0$), the tube progressively constricts (tightens).
Final State ($t=0$): The tube collapses exactly onto the target safe set $C$.
Mechanism: This mirrors the "coarse-to-fine" structure of flow-based models. Interventions are applied when the model is establishing global structure (high noise), where they are "distributionally cheap," and minimized when the model is refining fine details (low noise).
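One way to picture such a tube is an additive relaxation of a fixed barrier. The sketch below is illustrative only: the unit-ball safe set, the linear schedule `relaxation`, and the constant `delta_max` are assumptions made for this example, not the paper's actual construction.

```python
import numpy as np

def h(x, radius=1.0):
    """Barrier of a toy safe set C = {x : ||x|| <= radius}; h(x) >= 0 inside C."""
    x = np.asarray(x, dtype=float)
    return radius**2 - float(x @ x)

def relaxation(t, T=1.0, delta_max=25.0):
    """Hypothetical schedule: delta_max at t = T (loose tube), 0 at t = 0 (tube equals C)."""
    return delta_max * (t / T)

def h_tilde(x, t, T=1.0, delta_max=25.0):
    """Time-varying barrier; the tube at time t is {x : h_tilde(x, t) >= 0}."""
    return h(x) + relaxation(t, T, delta_max)

def in_tube(x, t, T=1.0, delta_max=25.0):
    return h_tilde(x, t, T, delta_max) >= 0.0

# A point far outside C is still inside the loose tube early in sampling,
# but is excluded once the tube has collapsed onto C at t = 0.
x = np.array([3.0, 0.0])
for t in (1.0, 0.5, 0.0):
    print(t, in_tube(x, t))
```

With these assumed values the point is inside the tube at $t=1$ and $t=0.5$ and outside it at $t=0$, which is the constricting behavior described above.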

B. Control Synthesis via Quadratic Programming (QP)

At each sampling step, the framework synthesizes a feedback control input $u$ to ensure the trajectory remains within the safety tube.

Dynamics: The sampling process is modeled as a Stochastic Differential Equation (SDE): $dx = [f_\theta(x, t) + u]dt + g(t)dw$.
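CBF Condition: A time-varying barrier function $\tilde{h}(x, t)$ is defined so that the tube is its zero-superlevel set, $\tilde{C}(t) = \{x : \tilde{h}(x, t) \geq 0\}$. The control $u$ is synthesized to satisfy the reverse-time CBF condition:
$\nabla \tilde{h} \cdot (f_\theta + u + g\xi) + \frac{\partial \tilde{h}}{\partial t} \leq \gamma(\tilde{h})$
Here $\xi$ stands for the noise term arising from $g(t)dw$, and $\gamma$ is the class-$\mathcal{K}$ function that sets how quickly the trajectory may approach the tube boundary.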
Optimization: The control $u$ is found by solving a convex Quadratic Program (QP) at every step:
$\min_u \frac{1}{2}\|u\|^2 \quad \text{s.t. CBF constraint}$
This minimum-norm approach ensures the intervention is as small as possible, preserving the learned distribution.
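When there is a single CBF constraint, the QP above is affine in $u$ and has a closed-form solution, which makes the "smallest nudge" idea concrete. The sketch below assumes exactly that single-constraint case; the function name `min_norm_control` and its arguments are hypothetical, and a problem with several constraints would need a general QP solver instead.

```python
import numpy as np

def min_norm_control(grad_h, drift, dh_dt, gamma_h, noise=None):
    """Solve min 0.5*||u||^2  s.t.  grad_h . (drift + u + noise) + dh_dt <= gamma_h.

    All inputs are assumed to be already evaluated at the current (x, t).
    """
    a = np.asarray(grad_h, dtype=float)
    nominal = np.asarray(drift, dtype=float)
    if noise is not None:
        nominal = nominal + np.asarray(noise, dtype=float)
    # Write the constraint as a . u + c <= 0.
    c = float(a @ nominal) + dh_dt - gamma_h
    if c <= 0.0:
        return np.zeros_like(a)      # nominal step already satisfies the condition
    norm_sq = float(a @ a)
    if norm_sq == 0.0:
        raise ValueError("constraint is violated but the barrier gradient is zero")
    return -(c / norm_sq) * a        # projection onto the half-space boundary

# Active constraint: the nominal drift violates the condition by c = 1.5.
u = min_norm_control(grad_h=[1.0, 0.0], drift=[2.0, 1.0], dh_dt=0.0, gamma_h=0.5)
print(u)
# Inactive constraint: no correction is applied.
print(min_norm_control(grad_h=[1.0, 0.0], drift=[-1.0, 1.0], dh_dt=0.0, gamma_h=0.5))
```

The correction acts only along the barrier gradient and vanishes whenever the model's own drift already satisfies the condition, which is the sense in which the filter cooperates with the sampler rather than overriding it.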

C. Theoretical Guarantees

Theorem 4.1 (Reverse Invariance): Proves that if the CBF condition is satisfied at every step, the final sample $x(0)$ is guaranteed to be in the safe set $C$, regardless of the initial noise location or the convexity of $C$.
Theorem 4.2 (Distributional Shift): Proves that the minimum-norm control minimizes the instantaneous contribution to the Kullback-Leibler (KL) divergence between the safe and original distributions. The framework exploits the noise schedule $g(t)$ : when noise is high, control is cheap; when noise is low, the tube is already tight, requiring minimal correction.
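The link between control magnitude and KL divergence can be read off a standard Girsanov-type identity. What follows is a sketch, assuming the filtered and unfiltered samplers share the same initial distribution and the same diffusion coefficient $g(t) > 0$, and that the usual integrability conditions hold. The symbols $\mathbb{P}^{u}$ and $\mathbb{P}^{0}$ are notation introduced here for the path distributions with and without control; the paper's own statement may take a different form.

$D_{\mathrm{KL}}\left(\mathbb{P}^{u} \,\|\, \mathbb{P}^{0}\right) = \mathbb{E}_{\mathbb{P}^{u}}\left[\int_0^T \frac{\|u(x_t, t)\|^2}{2\, g(t)^2}\, dt\right]$

The integrand is the instantaneous cost of intervening. A control of fixed magnitude is weighted by $1/g(t)^2$, so it is cheap when noise is high and expensive when noise is low, and choosing the minimum-norm $u$ that satisfies the CBF constraint minimizes this integrand at each time. By the data-processing inequality, the KL divergence between the two distributions of final samples is at most this path-space value.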

3. Key Contributions

Provably Safe Sampling: The first framework to provide formal, deterministic guarantees that the final output of a pre-trained flow-based model lies within a specified safe set, without retraining.
Cooperative Guidance: A novel "constricting safety tube" that aligns with the coarse-to-fine generation process, concentrating constraint enforcement in the high-noise regime to minimize semantic distortion.
Modularity: The method is a plug-and-play module applicable to any pre-trained flow-based model (Diffusion, Flow Matching) requiring no architectural changes.
Optimal Control Formulation: Demonstrates that greedy minimum-norm control at each step provides the tightest bound on distributional shift (KL divergence) for the given noise schedule.
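The plug-and-play structure can be illustrated with a one-dimensional toy sampler in which the filter wraps an unchanged drift. This is a discrete-time analogue written for illustration, not the paper's algorithm: the drift `-x` stands in for a learned $f_\theta$, the noise scale $\sqrt{t}$, the linear tube schedule, and the constants `X_MAX`, `DELTA_MAX`, and `ALPHA` are all assumptions. The loop steps in the sampling direction, with the sign conventions of the reverse-time SDE folded into the drift.

```python
import numpy as np

X_MAX = 0.5        # toy safe set C = {x : x <= X_MAX}
DELTA_MAX = 10.0   # hypothetical initial relaxation of the tube
ALPHA = 0.5        # hypothetical barrier decay rate, in (0, 1]

def h_tilde(x, t, T=1.0):
    """Relaxed barrier: equals X_MAX - x at t = 0 and is looser for t > 0."""
    return X_MAX + DELTA_MAX * (t / T) - x

def sample(x_T, use_filter=True, n_steps=100, T=1.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x_T, dtype=float)
    for k in range(n_steps):
        t = T * (n_steps - k) / n_steps
        t_next = T * (n_steps - k - 1) / n_steps
        drift = -x                                   # stand-in for the learned drift
        noise = np.sqrt(t * dt) * rng.standard_normal(x.shape)
        x_nominal = x + drift * dt + noise
        if use_filter:
            # Discrete barrier condition:
            # h_tilde(x_next, t_next) >= (1 - ALPHA) * h_tilde(x, t)
            required = (1.0 - ALPHA) * h_tilde(x, t, T)
            shortfall = required - h_tilde(x_nominal, t_next, T)
            u = -np.maximum(shortfall, 0.0) / dt     # smallest correction that restores it
        else:
            u = np.zeros_like(x)
        x = x_nominal + u * dt
    return x

x_T = np.random.default_rng(1).standard_normal(1000)   # initial noise, inside the loose tube
x_raw = sample(x_T, use_filter=False)
x_safe = sample(x_T, use_filter=True)
print("unfiltered violations:", int(np.sum(x_raw > X_MAX)))
print("filtered violations:  ", int(np.sum(x_safe > X_MAX + 1e-9)))
```

Because every sample starts inside the relaxed tube and the barrier condition is restored at each step, the barrier stays non-negative by induction, so the filtered samples end inside the safe set up to floating-point rounding. The drift function itself is never modified, only corrected from outside.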

4. Experimental Results

The framework was validated across three distinct domains using off-the-shelf pre-trained models:

Physics-Consistent Trajectory Generation (Lorenz System):
- Task: Generate trajectories that strictly obey the Lorenz differential equations.
- Result: Unconstrained diffusion models produced statistically plausible but physically incorrect trajectories. The CBF-guided model achieved 100% adherence to the physical laws while maintaining the correct chaotic attractor structure. Control effort was front-loaded (high at the start) and decayed to near zero as the trajectory converged.
Constrained Image Generation (DDPM):
- Task: Generate bedroom images with specific pixel-level constraints (e.g., a specific window patch or color intensity in a region).
- Result: The method achieved 100% constraint satisfaction (exact pixel matching in constrained regions).
- Comparison: Unlike projection-based methods (Zampini et al., 2025), which caused "black-tape" artifacts and destroyed semantic coherence, the CBF approach preserved realistic textures and lighting, proving that constraints can be enforced without sacrificing image quality.
Safe Robotic Manipulation (Diffusion Policy):
- Task: Generate smooth action sequences for a "Push-T" robot task, bounding the "jerk" (rate of change of acceleration).
- Result: Unconstrained policies violated smoothness constraints frequently (12–16 times per episode). The CBF-guided policy achieved zero violations while maintaining the same task reward (0.92).
- Efficiency: The computational overhead was modest (~34% increase in inference time), remaining within real-time control loop requirements.
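A jerk bound of the kind used in the robotics experiment can be expressed as a smooth barrier on a discretized action sequence. The sketch below assumes the actions are positions sampled at a fixed time step `dt` and estimates jerk with third-order finite differences; the function names, the squared-norm form of the barrier, and the limit `j_max` are illustrative choices rather than the paper's definitions.

```python
import numpy as np

def jerk(actions, dt):
    """Finite-difference jerk of a (steps, dims) position sequence."""
    return np.diff(actions, n=3, axis=0) / dt**3

def jerk_barrier(actions, dt, j_max):
    """Smooth barrier per time window: non-negative where ||jerk|| <= j_max."""
    j = jerk(actions, dt)
    return j_max**2 - np.sum(j**2, axis=-1)

def count_violations(actions, dt, j_max):
    return int(np.sum(jerk_barrier(actions, dt, j_max) < 0.0))

dt = 0.1
t = np.arange(8) * dt
smooth = np.stack([t**2, t], axis=1)   # quadratic path, so the jerk is (numerically) zero
jerky = smooth.copy()
jerky[4, 0] += 0.05                    # a single abrupt bump in one coordinate

print(count_violations(smooth, dt, j_max=1.0))   # 0
print(count_violations(jerky, dt, j_max=1.0))    # 4
```

The squared form is used instead of an absolute value so the barrier stays continuously differentiable, which the method requires. The single bump is counted four times because it falls inside four consecutive third-difference windows.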

5. Significance and Impact

This paper bridges a critical gap between the expressive power of generative AI and the rigorous safety requirements of real-world systems.

Safety-Critical Deployment: It enables the use of powerful generative models in robotics and autonomous systems where failure is not an option, providing deterministic safety certificates rather than probabilistic hopes.
Preservation of Fidelity: By minimizing distributional shift, it ensures that safety enforcement does not degrade the quality or semantic meaning of the generated content.
Generalizability: The framework is model-agnostic and constraint-agnostic (as long as a differentiable barrier function can be defined), making it a versatile tool for future safety-critical AI applications.

Limitations & Future Work: The current method requires a continuously differentiable barrier function, which can be difficult to construct for ambiguous semantic constraints (e.g., "no offensive content"). Future work aims to extend this to latent diffusion models and incorporate learned dynamics models for state-space safety (e.g., collision avoidance).