Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

This paper identifies "suboptimal transfer" as a failure mode when robustly fine-tuning non-robust pretrained models and proposes "Epsilon-Scheduling," a novel perturbation strength schedule that mitigates this issue to consistently improve expected robustness across diverse configurations.

Jonas Ngnawé, Maxime Heuillet, Sabyasachi Sahoo, Yann Pequignot, Ola Ahmad, Audrey Durand, Frédéric Precioso, Christian Gagné

Published 2026-03-16

Imagine you have a master chef who has spent years learning to cook a wide variety of dishes (this is your pre-trained model). They are incredibly talented at recognizing ingredients and flavors, but they've never been trained to cook in a chaotic kitchen where someone is constantly throwing flour, water, or spices at them to ruin the dish (this is the adversarial attack).

Now, you want to hire this chef to cook a very specific new dish for a VIP client (this is fine-tuning). You also want them to be able to cook this dish even while the kitchen is being sabotaged (this is Robust Fine-Tuning).

The problem is, if you immediately start training the chef while the saboteur is throwing flour everywhere, the chef gets confused. They forget how to cook the dish at all, and the result is a disaster. This paper calls this "Suboptimal Transfer."

Here is the breakdown of the paper's discovery and solution, explained simply:

1. The Problem: "The Flooding Kitchen"

The researchers found that when you try to teach a non-robust model to be robust immediately, it backfires.

  • The Analogy: Imagine trying to teach a student to ride a bike while someone is constantly pushing them over. If you start pushing them over on Day 1, the student never learns to balance. They just fall down, get frustrated, and give up.
  • The Reality: In machine learning, if you start training a model with strong "attacks" (perturbations) right away, the model gets so confused that it forgets the original task. It ends up performing worse than if you had just trained it normally without any attacks. The model fails to adapt to the new job because it's too busy fighting the "noise."

2. The Discovery: "The Delayed Start"

The researchers noticed something interesting about why this happens.

  • Normal Training: The model starts learning the new task immediately.
  • Robust Training (The Old Way): The model spends the first few weeks (or "epochs") just trying to survive the attacks. It doesn't actually learn the new task until much later. By the time it finally starts learning, the training time is almost over, so it never gets good at the job.

3. The Solution: "Epsilon-Scheduling" (The Gradual Ramp-Up)

The authors propose a clever new strategy called Epsilon-Scheduling. Instead of throwing the flour immediately, you introduce it slowly.

  • The Analogy: Think of it like training for a marathon.
    • Weeks 1-2: You just walk and jog. No weights, no hills. You build your base fitness (this is Standard Fine-Tuning).
    • Weeks 3-6: You start adding small hills and light weights. You get used to the difficulty gradually (this is the Linear Increase).
    • Weeks 7-10: You are now running with full gear and on steep hills (this is the Target Robustness).
  • The Result: Because the model learned the basics before the chaos started, it doesn't get confused. It builds a strong foundation, and then learns to handle the attacks. The model ends up being both good at the task and resistant to attacks.

4. The New Scorecard: "Expected Robustness"

The paper also suggests we need a better way to grade these models.

  • The Old Way: We usually only check two things: "How good is it on a clean day?" and "How good is it on the worst possible attack day?"
  • The New Way (Expected Robustness): The authors say, "What about the days in between?" Maybe the model is great on a clean day, terrible on a hurricane, but okay on a light rain.
  • The Analogy: Imagine grading a car.
    • Old Grade: "Can it drive on a highway? Yes. Can it drive in a tornado? No."
    • New Grade: "How well does it drive on average conditions, from a sunny day to a storm?"
    • This new metric gives a more complete picture of how reliable the model actually is in the real world.
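The idea of "grading on average conditions" can be sketched as averaging the model's accuracy over a range of attack strengths rather than checking only the two endpoints. This is a simplified uniform-grid version for illustration; the paper's exact formulation of expected robustness may weight perturbation strengths differently.

```python
from typing import Callable


def expected_robustness(accuracy_at: Callable[[float], float],
                        eps_max: float, num_points: int = 9) -> float:
    """Average accuracy over attack strengths from 0 (clean) to eps_max (worst).

    accuracy_at: a caller-supplied function mapping a perturbation strength
    eps to the model's accuracy (0.0 to 1.0) under an attack of that
    strength. A uniform grid is an assumption for this sketch.
    """
    eps_grid = [eps_max * i / (num_points - 1) for i in range(num_points)]
    return sum(accuracy_at(eps) for eps in eps_grid) / num_points
```

For example, a model whose accuracy drops linearly from 1.0 on clean inputs to 0.0 at the strongest attack would score 0.5, whereas the "old scorecard" would only report the two endpoints (1.0 and 0.0) and say nothing about the light-rain days in between.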

Summary of Results

The researchers tested this "Gradual Ramp-Up" method on six different types of AI models and five different tasks (like identifying dog breeds or car models).

  • Without the schedule: The models often failed completely, performing worse than random guessing on difficult tasks.
  • With the schedule (Epsilon-Scheduling): The models learned the task quickly, stayed robust against attacks, and achieved a much better balance between being smart and being tough.

In a nutshell: Don't throw the model into the deep end of the pool (full-strength adversarial attacks) immediately. Let it learn to swim in the shallow end first, then slowly move it deeper. This simple change saves the model from drowning and helps it become a champion swimmer.
