Uncertainty-Guided Label Rebalancing for CPS Safety Monitoring

Imagine you are the safety inspector for a fleet of delivery drones. Your job is to watch the live video feeds and sensor data to spot a drone that is about to crash.

Here is the problem: Crashes are incredibly rare. For every 46 flights that go perfectly smoothly, maybe only one flight has a near-miss or a crash.

If you train a computer to learn from this data, the computer gets lazy. It learns that "if you just guess 'Safe' every single time, you'll be right 98% of the time." So, it stops trying to find the crashes. It becomes a terrible safety inspector because it misses the one time it actually matters.

This paper introduces a clever new method called U-Balance to fix this. Here is how it works, using simple analogies:

1. The Problem: The "Boring" Dataset

Imagine you have a giant stack of 100,000 flight logs. 98,000 of them are boring, smooth flights. Only 2,000 are "spicy" flights where the drone swerved, shook, or got confused.

Standard AI: Looks at the stack, sees mostly boring flights, and decides, "I'll just ignore the spicy ones. I'll bet everything is boring."
Old Fix (SMOTE): Some people tried to fix this by photocopying the spicy flights and making fake copies to fill the stack. But this is like photocopying a blurry photo of a crash; the computer just learns to recognize the blur, not the actual danger.

2. The Insight: "Confused" Drones are Dangerous

The authors noticed something interesting about the drones. Even when a drone doesn't crash, sometimes it acts uncertain.

Certain: The drone flies straight and smooth.
Uncertain: The drone wobbles, changes direction quickly, or hesitates. It's like a driver who is unsure of the road; they might be safe, but they are more likely to make a mistake.

The authors realized: Uncertainty is a warning sign. Even if the drone is currently safe, if it's acting "confused," it's a good candidate to be treated as a potential danger for training purposes.

3. The Solution: U-Balance (The "Smart Re-labeler")

Instead of making fake crash data, U-Balance changes the labels of the existing data. It uses a three-step process:

Step A: The "Confusion Detector" (Uncertainty Predictor)

First, they train a special AI (called a GatedMLP) to act like a "vibe checker."

It looks at a short clip of a drone's flight.
It doesn't ask, "Did it crash?"
It asks, "Is this drone acting weird? Is it wobbling? Is it hesitating?"
It gives the flight a "Confusion Score." High score = "This drone is acting nervous."

Step B: The "Label Flipper" (uLNR)

This is the magic trick. They take the flights that are currently labeled "Safe" but have a High Confusion Score.

They say: "Hey, this flight was technically safe, but the drone was acting so nervous that it could have crashed. Let's pretend it was a 'Near-Miss' for training purposes."
They flip the label from "Safe" to "Unsafe" for these specific confusing flights.
Why? This doesn't create fake data. It just tells the computer: "Look closer at these tricky moments. They are the edge cases where safety is most at risk." It enriches the "danger" pile with real, high-quality examples of near-misses.

Step C: The "Safety Teacher" (Safety Predictor)

Finally, they train the main safety AI on this new, "rebalanced" dataset.

Because the dataset now has more examples of "nervous" flights (which are often precursors to crashes), the AI learns to be much more alert.
It stops ignoring the rare crashes because it has been trained to recognize the signs of trouble (the uncertainty) before the crash happens.

The Results: Why It Matters

When they tested this on a massive dataset of real drone flights:

Old methods were like a security guard who sleeps on the job, missing 50% of the threats.
U-Balance woke up the guard. It improved the ability to catch dangerous situations by 14.3% compared to the best existing methods.
It did this without slowing down the system or needing to generate fake data.

The Takeaway

Think of U-Balance as a smart filter. Instead of trying to find a needle in a haystack by making more needles (fake data), it teaches the computer to recognize the shape of the haystack that usually hides the needle. By focusing on the moments of "uncertainty" or "confusion," it turns a boring dataset into a highly effective safety training tool.

It's a novel way to say: "Don't just look for the crash; look for the hesitation."

1. Problem Statement

Context: Cyber-Physical Systems (CPS), such as Unmanned Aerial Vehicles (UAVs), require robust safety monitoring to detect and prevent unsafe behaviors. Data-driven approaches using supervised learning are common but face a critical challenge: extreme class imbalance. In real-world operations, safe events vastly outnumber unsafe events (e.g., a 46:1 ratio in UAV datasets).

Limitations of Current Solutions:

Standard Rebalancing: Techniques like SMOTE (synthetic oversampling) often generate unrealistic synthetic time-series data, while class weighting struggles under extreme imbalance.
Label Noise Rebalancing (LNR): A promising technique that stochastically flips labels of majority-class samples near the decision boundary to the minority class. However, standard LNR was designed for images and does not account for the specific dynamics of CPS time-series data.
Underutilized Information: CPS operations exhibit behavioral uncertainty (e.g., erratic control signals, rapid heading changes) which is correlated with safety outcomes. Current safety predictors often ignore this uncertainty signal or fail to leverage it effectively to address data imbalance.

2. Methodology: U-Balance

The authors propose U-Balance, a supervised learning framework that leverages behavioral uncertainty to rebalance datasets before training a safety predictor. The approach consists of three collaborative components:

A. Uncertainty Predictor

This module quantifies the "behavioral uncertainty" of a specific telemetry window.

Input: Multivariate time-series windows (e.g., heading angle, spatial coordinates $x, y, z$ ).
Preprocessing: Raw time-series are transformed into distributional kinematic features (mean, std, min, max) for each channel, capturing temporal variability and extremes.
Architecture: A novel GatedMLP (Gated Multi-Layer Perceptron). Unlike standard MLPs, it uses a gating mechanism (inspired by GRUs) to dynamically suppress uninformative features and emphasize relevant ones based on flight conditions.
Output: An uncertainty score ( $\hat{u}_t$ ) for each window.

B. Uncertainty-Guided Label Rebalancing (uLNR)

This is the core innovation, adapting LNR for time-series CPS data using the uncertainty scores.

Uncertainty Scoring: Uncertainty scores for safe windows are converted to Z-scores to identify samples that deviate significantly from typical safe behavior.
Flip-Rate Calculation: A shifted hyperbolic tangent function maps Z-scores to a flip probability. Only safe-labeled windows with unusually high uncertainty (high Z-scores) are eligible for relabeling.
Stochastic Relabeling: Safe windows are probabilistically relabeled as unsafe based on their flip probability.
- Result: The minority class (unsafe) is enriched with "informative boundary samples" (safe windows that look unsafe due to high uncertainty) without synthesizing new data.

C. Safety Predictor

A standard safety classifier (implemented as a Multi-Layer Bi-LSTM) is trained on the rebalanced dataset ( $D_{bal}$ ) generated by the uLNR step to predict runtime safety.

3. Key Contributions

U-Balance Framework: The first approach to exploit behavioral uncertainty specifically for dataset rebalancing in CPS safety monitoring. It moves beyond standard fusion strategies (early/late fusion) to use uncertainty as a structural signal for data augmentation via relabeling.
Novel Uncertainty Predictor: A specialized GatedMLP architecture that summarizes time-series telemetry into distributional features, effectively capturing the nuances of CPS behavioral uncertainty.
Adaptation of LNR: Successfully adapts the Label Noise Rebalancing (LNR) technique from image data to time-series CPS data, demonstrating that stochastic relabeling based on uncertainty is superior to synthetic oversampling.

4. Experimental Results

The approach was evaluated on a large-scale UAV benchmark (Khatiri et al.) with a 46:1 safe-to-unsafe ratio.

Correlation: A moderate but significant correlation ( $r_{pb} \approx 0.44$ ) was confirmed between behavioral uncertainty and safety outcomes.
Integration Strategy: uLNR significantly outperformed direct Early Fusion and Late Fusion strategies.
- uLNR F1 Score: 0.806
- Best Baseline (TimeMoE) F1 Score: 0.663
- Improvement: U-Balance outperformed the strongest baseline by 14.3 percentage points.
Comparison with Rebalancing Techniques: uLNR outperformed 14 other state-of-the-art rebalancing methods (including SMOTE, T-SMOTE, ADASYN, and Class Weighting) by margins of 13.1 to 26.6 percentage points in F1 score.
Efficiency: U-Balance maintains competitive inference latency (0.0045s per sample), comparable to other deep learning baselines, and is significantly faster than unsupervised autoencoder-based methods.
Ablation Studies:
- Removing the GatedMLP (replacing with plain MLP) reduced F1 by ~7.4 points.
- Removing distributional preprocessing (using raw sequences) reduced F1 by ~10 points.
- Replacing uLNR with standard rebalancing methods caused a drop of 13.1–26.6 points.

5. Significance

Novel Paradigm: The paper demonstrates that uncertainty is not just a feature to be fused but a powerful signal for reshaping the training distribution. By relabeling high-uncertainty safe samples as unsafe, the model learns to recognize the "edge cases" that often precede failures.
Practical Impact: For safety-critical CPS, high recall (detecting unsafe states) is crucial. U-Balance achieves a recall of 0.822, whereas most baselines struggle below 0.53, meaning standard models miss nearly half of the unsafe events.
Generalizability: The method is model-agnostic; applying uLNR to other baseline models consistently improved their performance, suggesting the technique is applicable across various CPS domains (e.g., autonomous vehicles, robotics) where uncertainty correlates with safety risks.

In summary, U-Balance provides a highly effective, computationally efficient solution to the extreme class imbalance problem in CPS safety monitoring by intelligently leveraging behavioral uncertainty to enrich the training data with critical boundary samples.