The Stability of Online Algorithms in Performative Prediction

This paper establishes an unconditional reduction showing that any no-regret online learning algorithm deployed in a performative prediction setting converges, in its average play, to a performatively stable mixed equilibrium. In other words, such algorithms naturally prevent runaway feedback loops without restrictive assumptions on how models influence the data distribution.

Gabriele Farina, Juan Carlos Perdomo

Published 2026-03-02

The Big Picture: The "Self-Fulfilling Prophecy" Problem

Imagine you are a weather forecaster.

  • Scenario A (Normal): You predict rain. People bring umbrellas. It rains. You were right. The world didn't change because of your prediction.
  • Scenario B (Performative): You predict a massive traffic jam. Because of your prediction, everyone decides to stay home or take a different route. Suddenly, the roads are empty! Your prediction caused the traffic jam not to happen.

This is the core problem the paper addresses: When algorithms make predictions, those predictions often change human behavior, which changes the data the algorithm sees next.

If a bank's AI predicts you are a "high risk" for a loan, you might be denied credit. Without credit, you can't build a good financial history, so you become high risk. The AI's prediction created the very reality it feared. This creates a feedback loop that can spiral out of control, making the system unstable and unpredictable.

The Old Way: Trying to Control the World

For years, researchers tried to fix this by assuming the world is "gentle." They assumed that if an algorithm changes its mind slightly, the world only changes slightly in response. They called this the "Lipschitz condition" (a fancy math way of saying "no sudden jumps").

The Analogy: Imagine trying to balance a broom on your hand.

  • The Old Assumption: The broom is made of soft rubber. If you tilt it a little, it wobbles a little, and you can easily correct it.
  • The Reality: In the real world (like medicine, education, or finance), the "broom" is made of glass. If you tilt it even a tiny bit, it might shatter or snap back violently.
  • The Problem: Recent research showed that if the world reacts violently (discontinuously), it is mathematically impossible to find a single, perfect model that stays stable. It's like trying to find a spot to balance a broom that keeps changing its shape.

The New Discovery: The "Chameleon Strategy"

The authors of this paper (Gabriele Farina and Juan Carlos Perdomo) found a brilliant workaround. They realized you don't need to find one perfect model that never changes. Instead, you should use a mixture (a random blend) of many different models.

The Analogy: The "Blind Taste Test"
Imagine a restaurant trying to find the perfect soup recipe.

  • The Old Way: The chef picks one recipe, serves it, sees how people react, and tries to tweak that one recipe forever. If the customers' tastes change wildly, the chef gets confused and keeps changing the recipe, never settling.
  • The New Way: The chef decides to serve a random mix of 100 different recipes every day. Some days it's spicy, some days it's mild.
    • The customers' reactions change based on the soup they get.
    • But because the chef is averaging over all the recipes, the overall feedback loop stabilizes. No single recipe is "wrong" because the system is designed to handle the chaos by spreading the risk.
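To make the chef's dilemma concrete, here is a minimal toy sketch (my own illustration, not code from the paper): a two-model world that flips the "truth" in response to whatever model is deployed. Greedy retraining, which always switches to the best model for the latest data, oscillates forever, while the uniform mix over the whole deployment history settles at a stable 50/50 blend.

```python
# Toy performative loop: the world reacts discontinuously to the deployed model.
# Two models: 0 = "predict low traffic", 1 = "predict high traffic".
# If we deploy 1, everyone stays home and the truth becomes 0, and vice versa,
# so each model is maximally wrong exactly when it is deployed.

def loss(model, deployed):
    """Loss of `model` on the data distribution induced by deploying `deployed`.
    The induced truth is the opposite of the deployed prediction."""
    truth = 1 - deployed
    return 0.0 if model == truth else 1.0

# Greedy retraining: always switch to the best model for the latest data.
deployed = 0
history = []
for t in range(10):
    history.append(deployed)
    # Best response to the distribution we just induced:
    deployed = min((0, 1), key=lambda m: loss(m, history[-1]))

print(history)  # oscillates forever: 0, 1, 0, 1, ...

# The uniform mixture over the history, however, settles at 50/50 -
# a blend under which neither single model beats the blend itself.
share_of_ones = sum(history) / len(history)
print(share_of_ones)
```

The point of the toy: no single deployed model is ever stable here, but the random blend over past models is, which is exactly the kind of equilibrium the paper targets.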

The Magic Ingredient: "No-Regret" Algorithms

How do we create this mix? The paper uses a concept from online learning called "No-Regret."

Think of a gambler at a casino.

  • A "No-Regret" gambler doesn't try to predict the future perfectly. Instead, they just make sure that over time, they didn't miss out on a better strategy they could have used.
  • If they played the game 1,000 times, their total winnings are almost as good as if they had known the best move for every single hand in hindsight.

The paper proves a magical connection: If you use a "No-Regret" algorithm to update your models over time, and then you take a random sample from all the models you've ever created, that mix is guaranteed to be (approximately) performatively stable. That is, even accounting for how the world reacts to the mix, no round of retraining would meaningfully improve on it.

It doesn't matter if the world is chaotic, if the data jumps around, or if the rules change suddenly. As long as the algorithm is "smart enough" to minimize its regret over time, the resulting mix of models will stop the runaway feedback loop.
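As a hedged illustration of that connection (again my own toy, not the paper's construction), the sketch below runs Hedge, the classic multiplicative-weights no-regret algorithm, in a small "flip-the-truth" world where the induced truth is always the opposite of whatever model is deployed. The empirical mix of deployments drifts to the stable 50/50 blend on its own, with no smoothness assumption anywhere.

```python
import math
import random

# Hedge / multiplicative weights over two models in a discontinuous world.
# Each round we sample a model from the current weights, the world reacts
# to the sampled deployment, and every model is charged its loss on the
# induced distribution.

def loss(model, deployed):
    truth = 1 - deployed  # the world flips the deployed prediction
    return 0.0 if model == truth else 1.0

random.seed(0)
eta = 0.1                 # learning rate
weights = [1.0, 1.0]      # one weight per model
play_counts = [0, 0]
T = 5000

for t in range(T):
    total = sum(weights)
    probs = [w / total for w in weights]
    deployed = random.choices([0, 1], weights=probs)[0]
    play_counts[deployed] += 1
    # Multiplicative update: each model pays its loss against the
    # distribution induced by what we actually deployed.
    weights = [w * math.exp(-eta * loss(m, deployed))
               for m, w in enumerate(weights)]

# The empirical mix over the whole run approaches the stable 50/50 blend.
mix = [c / T for c in play_counts]
print(mix)  # close to [0.5, 0.5]
```

Note the self-correcting dynamic: whenever one model is deployed too often, the world punishes it, its weight shrinks, and the mix swings back, which is the feedback-loop taming the paper formalizes.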

Why This Matters (The "Aha!" Moment)

  1. It Works Everywhere: The old methods failed if the data was "jumpy" (like a student passing or failing a class based on a strict cutoff score). This new method works even with those jumpy, discontinuous rules.
  2. It Explains Why We Don't Crash: You might wonder, "Why don't our current AI systems (like recommendation engines) crash the economy or society?" The paper suggests it's because these systems naturally act like "No-Regret" algorithms. They constantly tweak and retrain, and even though they are chasing a moving target, the average of their behavior naturally settles into a stable state.
  3. No Magic Assumptions Needed: We don't need to assume the world is nice and smooth. We just need the algorithm to keep learning and adapting.

Summary in One Sentence

Instead of trying to find one perfect, unchanging crystal ball that predicts the future (which is impossible when the future changes because of the prediction), we should use a "No-Regret" learning process that constantly adapts; the average of all the models it creates along the way will naturally stabilize the system, preventing runaway feedback loops even in chaotic environments.
