Dual-Solver: A Generalized ODE Solver for Diffusion Models with Dual Prediction

Dual-Solver is a generalized ODE solver for diffusion models that employs learnable parameters to dynamically interpolate prediction types, select integration domains, and adjust residuals, thereby significantly improving image quality and CLIP scores in low-function-evaluation regimes across various backbones.

Soochul Park, Yeon Ju Lee

Published 2026-03-05

Imagine you are trying to recreate a beautiful, complex painting, but you only have a few brushstrokes to do it.

In the world of AI art (specifically Diffusion Models), the computer starts with a canvas full of random static (like TV snow) and slowly "denoises" it step-by-step until a clear image appears. The problem is that to get a perfect picture, the computer usually needs to take hundreds of tiny steps. This is slow and expensive, like trying to walk across a room by taking one-inch steps.

To fix this, researchers have tried to teach the computer to take bigger, smarter steps, so it needs fewer calls to the neural network (counted as "NFEs", the number of function evaluations). However, existing methods are like rigid rulebooks: they force the computer to take steps in a specific way (e.g., "always look at the noise," or "always look at the data"). If the rulebook doesn't match the specific painting style, the result looks blurry or weird.
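To make the step-count trade-off concrete, here is a minimal sketch on a toy equation, not on a diffusion model. It integrates dx/dt = -x with plain Euler steps and compares the result to the exact answer. The function name `euler_solve` and the toy equation are illustrative assumptions and do not come from the paper.

```python
import math

def euler_solve(num_steps, x0=1.0, t_end=1.0):
    """Integrate the toy ODE dx/dt = -x from t=0 to t_end with equal Euler steps."""
    h = t_end / num_steps
    x = x0
    for _ in range(num_steps):
        x = x + h * (-x)  # one "function evaluation" per step
    return x

# Exact solution at t = 1 is exp(-1).
exact = math.exp(-1.0)

# Error of the final value for a few step budgets.
errors = {n: abs(euler_solve(n) - exact) for n in (5, 20, 100)}

for n, err in errors.items():
    print(f"{n:>3} steps -> error {err:.4f}")
```

With a fixed rule like Euler, the error shrinks only as the step count grows. A better solver aims for a small error at a small step count.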

Enter Dual-Solver, the new "smart navigator" introduced in this paper.

The Core Idea: The "Swiss Army Knife" Step

Think of an old method as a hammer: great for nails, but terrible for screws. Dual-Solver is a Swiss Army Knife. It doesn't just have one way to move; it has a set of adjustable tools that change depending on the situation.

The paper introduces three "knobs" (learnable parameters) that the AI learns to turn automatically:

  1. The "Prediction" Knob (γ):

    • The Problem: Sometimes the AI should guess what the "noise" looks like, sometimes what the "final image" looks like, and sometimes how fast the image is changing (velocity). Old solvers had to pick one and stick with it.
    • The Dual-Solver Fix: This knob lets the AI smoothly blend between these three guesses. It's like a chef who doesn't just use salt or sugar, but knows exactly how much of each to mix for the perfect flavor at every moment.
  2. The "Map" Knob (τ):

    • The Problem: Every solver measures its progress on some map. On some maps the grid lines are evenly spaced (linear); on others they are stretched (logarithmic), and the same route can be much easier to follow on one than on the other. Old solvers were stuck on one type of map.
    • The Dual-Solver Fix: This knob changes the "geometry" of the map. It allows the AI to slide between the linear grid and the logarithmic one, choosing whichever makes that specific step most accurate.
  3. The "Correction" Knob (κ):

    • The Problem: Even with a good map, you might still take a wrong turn. You need a way to fix small errors without starting over.
    • The Dual-Solver Fix: This knob adds a tiny "safety net" or a fine-tuning adjustment to the step. It's like a tightrope walker using a balancing pole to make micro-adjustments so they don't fall, ensuring the step stays accurate even when taken quickly.
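The first two knobs can be grounded with a small sketch. It assumes the common noising model x_t = alpha * x0 + sigma * eps and the v-prediction convention v = alpha * eps - sigma * x0 (with alpha² + sigma² = 1), and shows that the three "guesses" can be converted into one another. The `domain_map` function is a hypothetical stand-in (a Box-Cox-style family) for a one-parameter map that is linear at tau = 1 and logarithmic as tau approaches 0. The paper's actual parameterization of γ, τ, and κ is not reproduced here, and all function names are illustrative.

```python
import math

def to_data(x_t, eps_hat, alpha, sigma):
    """Data estimate from a noise estimate, given x_t = alpha*x0 + sigma*eps."""
    return (x_t - sigma * eps_hat) / alpha

def to_velocity(x0_hat, eps_hat, alpha, sigma):
    """Velocity under the convention v = alpha*eps - sigma*x0 (assumed)."""
    return alpha * eps_hat - sigma * x0_hat

def domain_map(u, tau):
    """Hypothetical map: shifted-linear at tau=1, log(u) in the limit tau -> 0."""
    if abs(tau) < 1e-8:
        return math.log(u)
    return (u ** tau - 1.0) / tau

# Toy scalar example with alpha^2 + sigma^2 = 1.
alpha, sigma = 0.8, 0.6
x0, eps = 2.0, -1.0
x_t = alpha * x0 + sigma * eps  # the noisy sample the solver actually sees

# If the network's noise guess were exact, the other two guesses follow from it.
x0_hat = to_data(x_t, eps, alpha, sigma)
v_hat = to_velocity(x0_hat, eps, alpha, sigma)

print("data guess:", x0_hat, "velocity guess:", v_hat)
print("linear map of 3.0:", domain_map(3.0, 1.0))
print("near-log map of 3.0:", domain_map(3.0, 1e-6), "vs log:", math.log(3.0))
```

Because all three guesses are linear combinations of the same two quantities, a solver is free to weight them differently at each step, and a continuous parameter like `tau` lets it slide between domains instead of committing to one.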

How Does It Learn? (The "Teacher" vs. The "Judge")

Usually, to teach a student (the solver) to walk faster, you show them a video of a master walker (a high-quality, slow solver) and say, "Copy my steps exactly." This is called Regression.

  • The Issue: This is hard. The student gets confused trying to mimic the exact path, especially when they are only allowed to take 3 or 5 steps.

Dual-Solver uses a clever trick called Classification.

  • The Analogy: Instead of asking the student to copy the master's steps, we give them a Judge (a pre-trained image classifier, like a robot that knows what a "cat" looks like).
  • The AI takes a few steps, generates an image, and asks the Judge: "Does this look like a cat?"
  • If the Judge says "No," the AI knows it went off-track and adjusts its knobs (the Swiss Army Knife tools) to try again.
  • Why it's better: The AI doesn't need to memorize the exact path of a master. It just needs to learn to stay on the "right side of the line" where the Judge says, "Yes, that's a cat!" This allows it to find its own unique, efficient path to a high-quality image.
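The "Judge" idea can be sketched as a cross-entropy loss computed by a frozen classifier, with only the solver's knob being updated. Everything in this toy is a hypothetical stand-in: `sample` replaces "run the few-step solver", `judge` replaces the pre-trained image classifier, and a finite-difference gradient replaces backpropagation. It illustrates the training signal only, not the paper's training procedure.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability the classifier assigns to the target class."""
    return -math.log(softmax(logits)[target])

def sample(knob):
    """Hypothetical stand-in for generating an image with the current knob."""
    return knob - 0.5

def judge(feature):
    """Hypothetical frozen classifier: class 0 = 'cat', class 1 = 'not cat'."""
    return [2.0 * feature, -2.0 * feature]

def loss(knob, target=0):
    return cross_entropy(judge(sample(knob)), target)

knob = 0.0
lr, h = 0.5, 1e-4
initial_loss = loss(knob)

for _ in range(50):
    # Central finite difference; the judge itself is never updated.
    grad = (loss(knob + h) - loss(knob - h)) / (2 * h)
    knob -= lr * grad

final_loss = loss(knob)
cat_prob = softmax(judge(sample(knob)))[0]
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, P(cat) = {cat_prob:.3f}")
```

Note that the loss never references a "master" trajectory. The knob is judged only by where the sample ends up, which is the contrast with regression drawn above.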

The Results: Fast and Furious

The researchers tested this on various AI art models (like DiT, SANA, and PixArt).

  • The Old Way: To get a good picture, you might need 20–50 steps.
  • Dual-Solver: Can get a picture that is just as good (or better) in only 3 to 9 steps.

Summary

Dual-Solver is like upgrading from a car with a fixed gear ratio to one with a continuously variable transmission (CVT) and a GPS that learns from a traffic judge.

  • It doesn't force the AI to follow a rigid path.
  • It lets the AI adjust its strategy (prediction type), its map (integration domain), and its corrections (residuals) on the fly.
  • It learns by asking "Is this a good image?" rather than "Did you copy my steps?"

The result? High-quality AI art from far fewer steps, which makes generating images faster and cheaper.