Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

This paper identifies "suboptimal transfer" as a failure mode when robustly fine-tuning non-robust pretrained models and proposes "Epsilon-Scheduling," a novel perturbation strength schedule that mitigates this issue to consistently improve expected robustness across diverse configurations.

Jonas Ngnawé, Maxime Heuillet, Sabyasachi Sahoo, Yann Pequignot, Ola Ahmad, Audrey Durand, Frédéric Precioso, Christian Gagné

Published 2026-03-16

Imagine you have a master chef who has spent years learning to cook a wide variety of dishes (this is your pre-trained model). They are incredibly talented at recognizing ingredients and flavors, but they've never been trained to cook in a chaotic kitchen where someone is constantly throwing flour, water, or spices at them to ruin the dish (this is the adversarial attack).

Now, you want to hire this chef to cook a very specific new dish for a VIP client (this is fine-tuning). You also want them to be able to cook this dish even while the kitchen is being sabotaged (this is Robust Fine-Tuning).

The problem is, if you immediately start training the chef while the saboteur is throwing flour everywhere, the chef gets confused. They forget how to cook the dish at all, and the result is a disaster. This paper calls this "Suboptimal Transfer."

Here is the breakdown of the paper's discovery and solution, explained simply:

1. The Problem: "The Flooding Kitchen"

The researchers found that when you try to teach a non-robust model to be robust immediately, it backfires.

  • The Analogy: Imagine trying to teach a student to ride a bike while someone is constantly pushing them over. If you start pushing them over on Day 1, the student never learns to balance. They just fall down, get frustrated, and give up.
  • The Reality: In machine learning, if you start training a model with strong "attacks" (perturbations) right away, the model gets so confused that it forgets the original task. It ends up performing worse than if you had just trained it normally without any attacks. The model fails to adapt to the new job because it's too busy fighting the "noise."

2. The Discovery: "The Delayed Start"

The researchers noticed something interesting about why this happens.

  • Normal Training: The model starts learning the new task immediately.
  • Robust Training (The Old Way): The model spends the first few weeks (or "epochs") just trying to survive the attacks. It doesn't actually learn the new task until much later. By the time it finally starts learning, the training time is almost over, so it never gets good at the job.

3. The Solution: "Epsilon-Scheduling" (The Gradual Ramp-Up)

The authors propose a clever new strategy called Epsilon-Scheduling. Instead of throwing the flour immediately, you introduce it slowly.

  • The Analogy: Think of it like training for a marathon.
    • Weeks 1-2: You just walk and jog. No weights, no hills. You build your base fitness (this is Standard Fine-Tuning).
    • Weeks 3-6: You start adding small hills and light weights. You get used to the difficulty gradually (this is the Linear Increase).
    • Weeks 7-10: You are now running with full gear and on steep hills (this is the Target Robustness).
  • The Result: Because the model learned the basics before the chaos started, it doesn't get confused. It builds a strong foundation, and then learns to handle the attacks. The model ends up being both good at the task and resistant to attacks.

4. The New Scorecard: "Expected Robustness"

The paper also suggests we need a better way to grade these models.

  • The Old Way: We usually only check two things: "How good is it on a clean day?" and "How good is it on the worst possible attack day?"
  • The New Way (Expected Robustness): The authors say, "What about the days in between?" Maybe the model is great on a clean day, terrible on a hurricane, but okay on a light rain.
  • The Analogy: Imagine grading a car.
    • Old Grade: "Can it drive on a highway? Yes. Can it drive in a tornado? No."
    • New Grade: "How well does it drive on average conditions, from a sunny day to a storm?"
    • This new metric gives a more complete picture of how reliable the model actually is in the real world.
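The idea of "grading on average conditions" can be sketched as averaging the model's accuracy over a range of attack strengths rather than checking only the two endpoints. This is a simplified uniform-grid version for illustration; the paper's exact formulation of expected robustness may weight perturbation strengths differently.

```python
from typing import Callable


def expected_robustness(accuracy_at: Callable[[float], float],
                        eps_max: float, num_points: int = 9) -> float:
    """Average accuracy over attack strengths from 0 (clean) to eps_max (worst).

    accuracy_at: a caller-supplied function mapping a perturbation strength
    eps to the model's accuracy (0.0 to 1.0) under an attack of that
    strength. A uniform grid is an assumption for this sketch.
    """
    eps_grid = [eps_max * i / (num_points - 1) for i in range(num_points)]
    return sum(accuracy_at(eps) for eps in eps_grid) / num_points
```

For example, a model whose accuracy drops linearly from 1.0 on clean inputs to 0.0 at the strongest attack would score 0.5, whereas the "old scorecard" would only report the two endpoints (1.0 and 0.0) and say nothing about the light-rain days in between.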

Summary of Results

The researchers tested this "Gradual Ramp-Up" method on six different types of AI models and five different tasks (like identifying dog breeds or car models).

  • Without the schedule: The models often failed completely, performing worse than random guessing on difficult tasks.
  • With the schedule (Epsilon-Scheduling): The models learned the task quickly, stayed robust against attacks, and achieved a much better balance between being smart and being tough.

In a nutshell: Don't throw the model into the deep end of the pool (full-strength adversarial attacks) immediately. Let it learn to swim in the shallow end first, then slowly move it deeper. This simple change saves the model from drowning and helps it become a champion swimmer.
