CREPE: Controlling Diffusion with Replica Exchange

This paper introduces CREPE, a flexible inference-time control method for diffusion models based on replica exchange that sequentially generates diverse samples and supports online refinement, offering a competitive alternative to Sequential Monte Carlo approaches without requiring retraining.

Jiajun He, Paul Jeha, Peter Potaptchik, Leo Zhang, José Miguel Hernández-Lobato, Yuanqi Du, Saifuddin Syed, Francisco Vargas

Published 2026-03-04

Imagine you have a very talented artist (a Diffusion Model) who can paint beautiful pictures. You give them a prompt like "a yellow taxi," and they start with a canvas full of static noise and slowly refine it into a clear image.

Usually, you just tell them what you want, and they do their best. But sometimes, the artist gets a little confused, or you want to tweak the result after they've started painting without hiring a new artist or retraining them. This is called Inference-Time Control.

The paper introduces a new method called CREPE to help control this process. To understand why CREPE is special, let's look at how the old way worked and why it was clunky.

The Old Way: The "Group Hike" (SMC)

Imagine you want to find the best view of a mountain. The old method, called Sequential Monte Carlo (SMC), is like sending a huge group of 1,000 hikers up the mountain all at once.

  • How it works: They all start at the bottom (noise) and hike up together. Every few steps, the group leader looks around and says, "Okay, hikers 1 through 500, you're going the wrong way; go back and copy hikers 501 through 1000 who are on the right path."
  • The Problem: This is called "resampling." Eventually, almost everyone in the group ends up copying the same few hikers. The group loses its diversity. If you want 1,000 different views, you might end up with 1,000 copies of the exact same view. Also, if you realize halfway up that you wanted to see a specific flower patch, you can't just tell the group to change course; you have to send the whole group back down and start over.
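To see the "everyone copies the same few hikers" effect in miniature, here is a small sketch of repeated resampling. It is not the paper's SMC setup: the weights are random stand-ins (an assumption), and the function name `count_surviving_ancestors` is made up for this illustration. Each particle simply remembers which original hiker it descends from.

```python
import random

def count_surviving_ancestors(num_particles=1000, num_rounds=20, seed=0):
    """Track how many distinct starting particles survive repeated resampling."""
    rng = random.Random(seed)
    # Each particle remembers which original hiker it descends from.
    ancestors = list(range(num_particles))
    history = [len(set(ancestors))]
    for _ in range(num_rounds):
        # Hypothetical weights: stand-ins for how well each particle fits the target.
        weights = [rng.expovariate(1.0) for _ in ancestors]
        # Multinomial resampling: draw with replacement, proportional to weight.
        ancestors = rng.choices(ancestors, weights=weights, k=num_particles)
        history.append(len(set(ancestors)))
    return history

history = count_surviving_ancestors()
print("distinct ancestors at start:", history[0])
print("distinct ancestors at end:  ", history[-1])
```

The count of distinct ancestors can only stay the same or shrink from one round to the next, because every round draws from whoever survived the previous one. That shrinking family tree is the diversity loss described above.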

The New Way: The "Parallel Tempering" (CREPE)

The authors propose CREPE (Controlling with REPlica Exchange). Think of this not as a group hike, but as a team of explorers on a ladder.

Imagine a ladder with 50 rungs.

  • The Setup: Instead of sending 1,000 hikers up the mountain at once, you station one person on each of the 50 rungs.
    • Person A is at the bottom (very noisy, very blurry).
    • Person B is a bit higher (less noisy).
    • ...
    • The last person is at the top (almost a clear image).
  • The Magic Move (Replica Exchange): Every few minutes, the people on adjacent rungs (say, Person A and Person B) have a conversation. They ask, "Hey, if I swapped places with you, would I have a better view?"
    • If the swap makes sense (mathematically speaking), they swap places.
    • The person who was at the bottom moves up, and the person who was higher moves down.
  • The Result:
    1. Diversity: Because everyone is constantly swapping and moving up and down the ladder, you get a huge variety of different paths. You don't end up with 1,000 clones; you get 50 unique explorers who have all seen different parts of the mountain.
    2. Flexibility: If you suddenly decide, "Actually, I want to see the flower patch," you can just whisper a new instruction to the people on the ladder. They can adjust their path right now without restarting the whole hike.
    3. Efficiency: You don't need a massive army of 1,000 people. You just need one person per rung. All the rungs are updated in parallel (at the same time), while finished images come off the top of the ladder sequentially (one after another).
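The swap conversation can be sketched with classic parallel tempering, the textbook form of replica exchange. This is an illustration under loud assumptions, not CREPE's actual swap rule: here the rungs are inverse temperatures (`betas`) on a toy double-well `energy`, whereas in the ladder picture above the rungs are noise levels of a diffusion model. All names and parameter values are hypothetical.

```python
import math
import random

def energy(x):
    # Toy double-well landscape with minima at x = -2 and x = +2 (an assumption).
    return (x * x - 4.0) ** 2 / 4.0

def swap_accept_prob(beta_i, beta_j, x_i, x_j):
    """Textbook parallel-tempering probability of swapping two replicas."""
    log_ratio = (beta_i - beta_j) * (energy(x_i) - energy(x_j))
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

def run_ladder(betas, num_sweeps=2000, step=1.0, seed=0):
    rng = random.Random(seed)
    states = [2.0 for _ in betas]  # every replica starts in the right-hand well
    samples = []                   # one sample per sweep, read off the last rung
    for _ in range(num_sweeps):
        # Local move: one Metropolis step per rung at its own inverse temperature.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.uniform(-step, step)
            log_accept = -beta * (energy(proposal) - energy(states[k]))
            if log_accept >= 0 or rng.random() < math.exp(log_accept):
                states[k] = proposal
        # Swap move: neighbouring rungs exchange states with the probability above.
        for k in range(len(betas) - 1):
            p = swap_accept_prob(betas[k], betas[k + 1], states[k], states[k + 1])
            if rng.random() < p:
                states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[-1])
    return samples

# Rungs ordered from "hot" (explores freely) to "cold" (the distribution we want).
samples = run_ladder([0.05, 0.2, 0.5, 1.0])
print("fraction of samples in the left well:", sum(s < 0 for s in samples) / len(samples))
```

The design point is that the hot rungs cross between the two wells easily, and swaps carry those discoveries to the cold rung, which would otherwise tend to stay where it started. One sample is read off the last rung per sweep, which mirrors the "one after another" generation described above.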

What Can CREPE Do?

The paper shows CREPE working on several cool tasks:

  1. Temperature Control (Tempering): "Temperature" here is a statistical dial, not the weather. Turn it down and the model concentrates on its most typical, high-probability images; turn it up and it spreads out over more unusual ones. CREPE lets you sample at the setting you choose without getting stuck.
  2. Reward Tilting: Imagine you tell the artist, "Make it a yellow taxi, but make it look really cool and shiny." CREPE guides the painting process to prioritize that "coolness" reward, ensuring the final image matches your specific desire.
  3. Mixing Models (Composition): Imagine you have one artist who is great at drawing cars, and another who is great at drawing backgrounds. CREPE can stitch their work together to create a car in a background, even if they were never trained to work together.
  4. Fixing Bias: Standard guidance techniques, which push the model harder toward the prompt, can make images look too similar or "stale." CREPE acts like a diversity coach, ensuring the final batch of images is varied and interesting.
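The first three tasks can each be read as reweighting what the model already believes. The sketch below shows the standard textbook versions on a tiny three-outcome distribution. This summary does not give the paper's exact formulations, so treat the formulas as assumptions: tempering as raising probabilities to the power 1/T, reward tilting as multiplying by exp(reward), and composition as a product of two models. The function names are made up for this illustration.

```python
import math

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def temper(probs, temperature):
    """Raise each probability to the power 1/T, then renormalize."""
    return normalize([p ** (1.0 / temperature) for p in probs])

def reward_tilt(probs, rewards, strength=1.0):
    """Reweight each outcome by exp(strength * reward), then renormalize."""
    return normalize([p * math.exp(strength * r) for p, r in zip(probs, rewards)])

def compose(probs_a, probs_b):
    """Product of two models: favours outcomes that both consider likely."""
    return normalize([a * b for a, b in zip(probs_a, probs_b)])

base = [0.5, 0.3, 0.2]                               # toy "model" over three images
cold = temper(base, temperature=0.5)                 # sharper: favours the top image
hot = temper(base, temperature=2.0)                  # flatter: more variety
tilted = reward_tilt(base, rewards=[0.0, 0.0, 2.0])  # reward the third image
both = compose(base, [0.1, 0.1, 0.8])                # second model likes the third image

for name, dist in [("cold", cold), ("hot", hot), ("tilted", tilted), ("both", both)]:
    print(name, [round(p, 3) for p in dist])
```

For a real diffusion model these reweighted distributions cannot be written down as a short list of numbers, which is why a sampler such as CREPE is needed to draw from them.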

The Catch (The "Burn-in")

Just like a new employee needs time to get used to the job, CREPE needs a "burn-in" period. The first few images it generates might be a bit messy as the "explorers" on the ladder find their footing. But once they settle in, the quality and diversity are top-notch.
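Burn-in is a general feature of chain-based samplers, and a toy example makes it concrete. The sketch below is a plain random-walk Metropolis chain aimed at a standard bell curve, deliberately started far away at 100. It is not CREPE itself, and the cut-off of 1,000 steps is an arbitrary assumption chosen for the illustration.

```python
import math
import random

def metropolis_chain(start, num_steps, step=1.0, seed=0):
    """Random-walk Metropolis chain targeting a standard normal distribution."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(num_steps):
        proposal = x + rng.uniform(-step, step)
        # Log acceptance ratio for the target density exp(-x^2 / 2).
        log_accept = (x * x - proposal * proposal) / 2.0
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = proposal
        chain.append(x)
    return chain

chain = metropolis_chain(start=100.0, num_steps=5000)
burn_in = 1000                 # hypothetical cut-off; early samples are discarded
kept = chain[burn_in:]
mean_all = sum(chain) / len(chain)
mean_kept = sum(kept) / len(kept)
print("mean with the messy start:", round(mean_all, 2))
print("mean after burn-in:       ", round(mean_kept, 2))
```

The true mean is 0. The early samples are still walking in from the bad starting point, so including them drags the average away from 0; dropping them gives an estimate much closer to the target.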

The Bottom Line

CREPE is a smarter, more flexible way to steer AI image generators. Instead of forcing a massive group to march in lockstep (which leads to boring, repetitive results), it uses a clever "ladder-swapping" technique to keep the generation process diverse, adaptable, and high-quality. It's like upgrading from a rigid marching band to a jazz band that can improvise and change the song on the fly.
