Better Understandings and Configurations in MaxSAT Local Search Solvers via Anytime Performance Analysis

This paper proposes using Empirical Cumulative Distribution Functions to evaluate MaxSAT solvers' anytime performance, demonstrating that this approach reveals dynamic performance distinctions and enables automatic configurators like SMAC to discover superior parameter settings compared to traditional final-solution metrics.

Furong Ye, Chuan Luo, Shaowei Cai

Published 2026-04-09

Imagine you are organizing a massive cooking competition to find the best recipe for a complex dish (let's call it the "MaxSAT Problem"). You have four famous chefs (the solvers: NuWLS, BandMax, MaxFPS, and SATLike) and a strict time limit of 5 minutes to cook.

The Old Way: The "Final Plate" Judgment

Traditionally, judges would only look at the final plate presented at the exact moment the 5-minute timer beeped.

  • The Rule: "Whoever has the tastiest dish at 5:00 wins."
  • The Problem: This ignores the journey. Did Chef A start with a burnt mess but slowly improve? Did Chef B have a great dish at 2 minutes but then accidentally drop it? Did Chef C take a long time to prep but cook incredibly fast once started?
  • The Result: By only looking at the final second, the judges might miss who is actually the most efficient or promising cook. They might declare two chefs "equal" because their final dishes look the same, even though one got there much faster.

The New Idea: The "Cooking Video" (Anytime Performance)

This paper suggests a better way to judge: Watch the whole cooking video.

Instead of just checking the dish at 5:00, the authors propose using a tool called ECDF (Empirical Cumulative Distribution Function). Think of this as a speedometer for quality.

  • How it works: Instead of asking, "How good is the dish at 5 minutes?", we ask, "At what time did the chef reach 50% of their potential? 80%? 90%?"
  • The Analogy: Imagine a graph where the X-axis is Time and the Y-axis is Quality.
    • Chef NuWLS might be a slow starter but climbs a steep hill, reaching high quality very quickly.
    • Chef BandMax might start fast but get stuck on a plateau.
    • Chef SATLike might be consistent but slow.
  • The Benefit: This method reveals the "personality" of each solver. It shows that while NuWLS might be the best overall, BandMax might actually be the best if you only have 10 seconds. It turns a single score into a rich story of progress.
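To make the ECDF idea concrete, here is a minimal sketch in plain Python. It assumes each solver run is recorded as a list of (time, quality) improvement events, and that we pick a set of quality targets; the ECDF value at time t is then the fraction of (run, target) pairs where the target was reached by time t. The function name and data layout are illustrative, not the paper's actual implementation.

```python
import math

def ecdf_anytime(trajectories, targets, times):
    """Anytime ECDF: fraction of (run, target) pairs hit by each time point.

    trajectories: one list of (time, quality) pairs per run,
                  with quality non-decreasing over time.
    targets:      quality thresholds a run should reach.
    times:        time points at which to evaluate the ECDF.
    """
    values = []
    for t in times:
        hits = 0
        for traj in trajectories:
            # best quality this run achieved by time t (else -inf)
            best = max((q for tt, q in traj if tt <= t),
                       default=-math.inf)
            hits += sum(1 for target in targets if best >= target)
        values.append(hits / (len(trajectories) * len(targets)))
    return values
```

For example, with two runs and targets `[0.5, 0.9]`, `ecdf_anytime([[(1, 0.5), (3, 0.9)], [(2, 0.7), (4, 1.0)]], [0.5, 0.9], [1, 2, 3, 4])` climbs from 0.25 up to 1.0, tracing out exactly the kind of "quality over time" curve described above.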

The "Magic Tuning Knob" (Hyperparameter Optimization)

Now, imagine these chefs have secret recipe books with adjustable knobs (parameters). To make them better, you need to tweak these knobs.

  • The Old Tuning Method: You tweak the knobs, run the chef for 5 minutes, and see if the final dish is better. If two settings produce the same final dish, you can't tell which one was "smarter" or more efficient.
  • The New Tuning Method (AUC): The authors used a tool called SMAC (a smart robot tuner). Instead of just looking at the final dish, the robot watched the entire cooking process (the ECDF curve).
    • It calculated the Area Under the Curve (AUC). Think of this as the total "deliciousness" accumulated over the whole 5 minutes.
    • The Discovery: When the robot tuned the chefs based on the whole video (AUC) rather than just the final plate, it found better settings!
    • Why? Because the "whole video" method gives the robot more clues. If two settings produce the same final dish, the one that got there faster (higher AUC) is clearly superior. The old method couldn't see the difference.
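The AUC objective can be sketched in a few lines. The idea: integrate the ECDF curve over the time budget (here with the trapezoidal rule; a step-function integral would also work) and normalize so the score lies in [0, 1]. The exact integration and normalization in the paper may differ; this is just to show why AUC separates two configurations that tie on the final value.

```python
def auc(times, ecdf_values):
    """Normalized area under an ECDF curve (trapezoidal rule).

    A configuration that reaches high ECDF values earlier
    accumulates more area, even if the final value is the same.
    """
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += dt * (ecdf_values[i] + ecdf_values[i - 1]) / 2
    return area / (times[-1] - times[0])

# Two hypothetical configurations with the SAME final ECDF value:
times = [0, 1, 2, 3, 4, 5]
fast  = [0.0, 0.8, 0.9, 0.9, 0.9, 0.9]  # improves early, then plateaus
slow  = [0.0, 0.1, 0.2, 0.4, 0.7, 0.9]  # only catches up at the end
```

Judged by the final plate, `fast` and `slow` tie at 0.9; judged by AUC, `fast` scores clearly higher, giving a tuner like SMAC the signal the final-value metric throws away.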

The Takeaway

  1. Don't just look at the finish line. In complex problem-solving, how you get there matters just as much as where you end up.
  2. NuWLS is the current champion, but the other chefs have their own strengths depending on how much time you give them.
  3. Tuning with "Time" in mind works better. If you want to build the best AI solver, don't just ask, "Is the answer good?" Ask, "How fast and smoothly did it get to the answer?"

In a nutshell: This paper teaches us that to truly understand and improve these problem-solving algorithms, we need to stop treating them like a sprinter who only matters at the finish line, and start treating them like a marathon runner where the pace, the strategy, and the consistency throughout the race are what truly define the winner.
