Co-optimization for Adaptive Conformal Prediction

This paper proposes CoCP, a framework that jointly optimizes prediction-interval centers and radii through an alternating algorithm combining quantile regression with a differentiable soft-coverage objective. CoCP retains finite-sample marginal validity while significantly improving interval efficiency and conditional coverage under heteroscedasticity and skewness, compared to existing methods.

Xiaoyi Su, Zhixin Zhou, Rui Luo

Published 2026-03-03

Imagine you are trying to predict the weather for tomorrow. You want to give your friends a forecast that is reliable (it actually rains when you say it will) but also useful (you don't just say "it might rain or snow or be sunny," you want to give a specific range).

In the world of data science, this is called Conformal Prediction. It's a method to draw a "safety net" (a prediction interval) around a guess. If you say, "The temperature will be between 60°F and 80°F," you want to be 90% sure the real temperature falls in that range.
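The "safety net" idea can be made concrete with a minimal split conformal prediction sketch. Everything below is a toy illustration (made-up linear data, a least-squares fit as the model), not code from the paper; the 90% level corresponds to a miscoverage rate of alpha = 0.1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends linearly on x, plus noise.
x = rng.uniform(0, 10, 2000)
y = 2.0 * x + rng.normal(0, 1.0, 2000)

# Split: fit the model on one half, calibrate the interval on the other.
x_tr, y_tr, x_cal, y_cal = x[:1000], y[:1000], x[1000:], y[1000:]
slope, intercept = np.polyfit(x_tr, y_tr, 1)

def predict(x):
    return slope * x + intercept

# Conformity scores: absolute residuals on the calibration half.
scores = np.abs(y_cal - predict(x_cal))

# Radius: the (1 - alpha) quantile of the scores, with a small
# finite-sample correction so "90% sure" holds for any distribution.
alpha = 0.10
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

# 90% prediction interval for a new point.
x0 = 5.0
lo, hi = predict(x0) - q, predict(x0) + q
```

The key property: no matter how bad the model is, the interval `[lo, hi]` covers the truth about 90% of the time on fresh data, because the radius is calibrated on held-out residuals.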

The Problem: The "Equal-Tailed" Mistake

Most current methods use a simple, rigid rule to draw this safety net. They assume the weather is symmetrical, like a bell curve. They say, "Okay, I'll cut off the bottom 5% of possibilities and the top 5%."

The Analogy: Imagine you are trying to fit a suitcase into a car trunk.

  • The Old Way (CQR): You assume the trunk is a perfect rectangle. You measure 5 inches from the left wall and 5 inches from the right wall, then close the lid.
  • The Reality: The trunk is actually shaped weirdly. Maybe the left side is deep and full of space, but the right side is squished by the spare tire.
  • The Result: By measuring equally from both sides, your suitcase (the prediction interval) ends up being way too big because you're including a lot of empty space on the left just to balance the tight squeeze on the right. You are safe, but you are wasting space.

This happens when data is "skewed" (lopsided). The old methods pin the interval's endpoints to fixed equal-tailed quantiles, even if the "crowded" part of the data (where the truth is most likely to be) is off to one side.
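The cost of the equal-tailed rule is easy to see numerically. The sketch below is a toy experiment (a lognormal sample standing in for skewed residuals; not data from the paper): it compares the equal-tailed 90% interval with the shortest interval that holds the same 90% of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)

# Heavily right-skewed sample: most mass near zero, a long thin tail.
r = rng.lognormal(mean=0.0, sigma=1.0, size=20000)

alpha = 0.10

# Equal-tailed 90% interval: cut 5% from each side.
lo_et, hi_et = np.quantile(r, [alpha / 2, 1 - alpha / 2])
width_et = hi_et - lo_et

# Shortest 90% interval: slide a window over the sorted sample that
# always contains 90% of the points, and keep the narrowest one.
r_sorted = np.sort(r)
k = int(np.ceil((1 - alpha) * len(r)))
widths = r_sorted[k - 1:] - r_sorted[:len(r) - k + 1]
i = np.argmin(widths)
lo_sh, hi_sh = r_sorted[i], r_sorted[i + k - 1]
width_sh = hi_sh - lo_sh
```

Both intervals trap 90% of the points, but on skewed data the shortest one is markedly narrower: the equal-tailed version pays for "empty space" in the thin tail, exactly the oversized-suitcase effect described above.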

The Solution: CoCP (The "Smart Suitcase")

The authors of this paper propose a new method called CoCP (Co-optimization for Adaptive Conformal Prediction). Instead of using a rigid ruler, CoCP acts like a smart, shape-shifting suitcase that learns the exact shape of the trunk.

Here is how it works, using a simple metaphor:

1. The "Folded Flag" Trick

Imagine the data distribution is a flag hanging on a pole.

  • The Old Way: You try to grab the flag from the left and right edges equally.
  • CoCP's Way: It takes the flag, folds it in half over the pole (the center), and looks at the combined thickness. It realizes, "Hey, the left side of the flag is much thicker (denser) than the right side."
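In score terms, "folding" means measuring each point by its absolute distance from a chosen center, then taking a high quantile of those folded distances as the radius. The sketch below is illustrative (toy lognormal data, a simple finite-sample correction, a brute-force grid over centers rather than the paper's learned center); it shows that the radius you need depends heavily on where you fold:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.lognormal(0.0, 1.0, 5000)   # a skewed calibration sample

def folded_radius(y, center, alpha=0.10):
    """Fold the sample around `center` and return the (1 - alpha)
    quantile of the folded (absolute) deviations."""
    scores = np.abs(y - center)
    n = len(scores)
    return np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

# Sweep candidate centers and keep the one needing the smallest radius.
centers = np.linspace(np.quantile(y, 0.05), np.quantile(y, 0.95), 50)
radii = np.array([folded_radius(y, c) for c in centers])
best_c = centers[np.argmin(radii)]
```

Folding around the best center yields a noticeably smaller 90% radius than folding around the median or the mean, which is the whole point of learning the center rather than fixing it.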

2. The "Push and Pull" Dance

CoCP doesn't just guess the center; it learns it through a two-step dance:

  • Step A (The Radius): It asks, "How wide do I need to be to catch 90% of the flag?" It measures the folded flag.
  • Step B (The Center): It looks at the edges of its current guess. If the left edge is in a "thick" part of the flag and the right edge is in a "thin" part, CoCP says, "I'm off-center! I need to push my center toward the thick part."

Why? Because if you move the center toward the thick part, you can shrink the width of the suitcase while still catching the same 90% of the flag. You are squeezing out the empty space.
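The push-and-pull dance can be caricatured as an alternating loop. This is a deliberately crude toy, not the paper's algorithm: the density-at-the-edge estimate, the window width `tau`, the step size, and the iteration count are all made-up choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.lognormal(0.0, 1.0, 5000)   # skewed calibration sample
alpha, tau, step = 0.10, 0.25, 0.5  # miscoverage, edge window, step size

c0 = np.median(y)
r0 = np.quantile(np.abs(y - c0), 1 - alpha)  # radius if the center never moves

c = c0
for _ in range(500):
    # Step A (the radius): fold around the current center and take the
    # 90% quantile (finite-sample correction omitted for clarity).
    r = np.quantile(np.abs(y - c), 1 - alpha)
    # Step B (the center): crude estimate of how many calibration
    # points sit near each edge of [c - r, c + r] ...
    left = np.mean(np.abs(y - (c - r)) < tau)
    right = np.mean(np.abs(y - (c + r)) < tau)
    # ... then push the center toward the denser ("thicker") edge, so
    # the next radius update can shrink the interval.
    c += step * (right - left)
```

On this sample the starting left edge sits in empty space below zero, so the loop pushes the center toward the dense side until the two edges balance, and the final radius `r` comes out smaller than the never-moved radius `r0` while still trapping 90% of the points.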

3. The "Soft Touch"

How does it know where the "thick" part is without seeing the whole flag? It uses a "soft window." Imagine a flashlight that only shines brightly on the very edges of your suitcase. If the light hits a dense crowd of people on the left edge, the flashlight pushes the suitcase to the right. If it hits a sparse crowd on the right, it pulls the suitcase to the left. It's a gentle, continuous nudge until the suitcase is perfectly centered on the crowd.
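One way to realize the "flashlight" is to replace the hard in/out indicator with a sigmoid ramp: the sigmoid's derivative is a bump that is largest where a point sits right at an edge of the interval. The sketch below is an assumption-laden illustration (the toy data, the temperature `tau`, and the function names are mine, not the paper's specification):

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.lognormal(0.0, 1.0, 5000)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_coverage(y, c, r, tau=0.25):
    """Smooth stand-in for mean(|y - c| <= r): the hard 0/1 membership
    is replaced by a sigmoid ramp of width ~tau at the edges."""
    return np.mean(sigmoid((r - np.abs(y - c)) / tau))

def soft_coverage_grad(y, c, r, tau=0.25):
    """d soft_coverage / dc. The sigmoid's derivative peaks where
    |y - c| is close to r, so only points near the two edges get
    weight -- the flashlight shining on the suitcase's edges."""
    z = (r - np.abs(y - c)) / tau
    w = sigmoid(z) * (1.0 - sigmoid(z))   # edge-localized weights
    return np.mean(w * np.sign(y - c)) / tau

c, r = np.median(y), 2.0
# A positive gradient means the right edge is denser than the left:
# the continuous nudge pushes the center toward that crowd.
g = soft_coverage_grad(y, c, r)
```

Because the weights `w` vanish far from the edges, the gradient is exactly a tug-of-war between the two edge crowds, which is what makes the nudge gentle and differentiable instead of a hard, discontinuous count.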

The Result: Shorter, Smarter Intervals

By doing this "co-optimization" (learning the center and the width at the same time), CoCP achieves two things:

  1. It stays safe: It still guarantees that 90% of the time, the truth is inside the box (just like the old methods).
  2. It gets tighter: Because it moves the box to the "high-density" area, the box doesn't need to be as wide.

In everyday terms:
If you are predicting house prices in a city where most houses are cheap, but a few are mansions:

  • Old Method: "The price will be between $100k and $1M." (Safe, but the $1M part is mostly empty space).
  • CoCP: "The price will be between $150k and $400k." (Still 90% safe, but much more useful because it focuses on where the houses actually are).

Why This Matters

This paper shows that by treating the prediction interval as a flexible object that can slide (translate) and stretch (scale) simultaneously, we can get much better predictions, especially when the data is messy, lopsided, or unpredictable. It's the difference between using a generic, one-size-fits-all box and a custom-molded container that fits the data perfectly.
