Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima

This paper establishes a tighter information-theoretic lower bound and proposes a modified Track-and-Stop algorithm with a tie-aware stopping rule that achieves asymptotic instance-optimality for best-arm identification in stochastic multi-armed bandits when the number of optimal arms is known.

Lan V. Truong

Published 2026-03-05
📖 4 min read☕ Coffee break read

Imagine you are a detective trying to find the best suspect in a lineup of KK people. However, there's a twist: you don't know if there is just one best suspect, or if there are several suspects who are equally guilty (and equally "best").

Your goal is to identify any one of these "best" suspects with high confidence, but you want to do it by asking as few questions as possible. Every question you ask costs time and money (this is called "sample complexity").

This paper is about solving that detective story when you have a secret piece of information: You know exactly how many "best" suspects there are.

Here is the breakdown of the paper's story, using simple analogies:

1. The Problem: The "Tie" Dilemma

In the past, most detective stories assumed there was only one true winner. Algorithms were built to keep asking questions until they were 100% sure who the single winner was.

But in real life, ties happen.

  • Example: Imagine you are testing 10 different flavors of ice cream. Maybe three of them are all tied for "Best."
  • The Old Way: If you didn't know there were three winners, your algorithm would keep tasting the top three flavors over and over again, trying to figure out which one is slightly better than the others. This is a waste of time! You just need to find any of the three winners.
  • The Gap: Previous research figured out how to handle ties when you didn't know how many winners there were. But nobody had figured out the mathematically perfect way to do it when you do know the number of winners in advance.

2. The New Discovery: The "Tighter" Lower Bound

The author, Lan V. Truong, asks: "If I tell you there are exactly 3 winners, can we do better than if I just said 'there are some winners'?"

The Answer is Yes.

The paper derives a new Information-Theoretic Lower Bound.

  • The Metaphor: Think of this as the "Speed Limit" for your investigation.
  • The Old Speed Limit: "You must ask at least 1,000 questions to be sure."
  • The New Speed Limit: "Because you know there are exactly 3 winners, you only need to ask 800 questions."

The paper proves that knowing the number of winners allows you to stop the investigation earlier. It mathematically calculates the absolute minimum number of questions needed, which is strictly lower (better) than previous methods.

3. The Solution: A Smarter Detective (Track-and-Stop)

The paper proposes a modified version of a famous algorithm called Track-and-Stop.

  • How it works:
    1. Tracking: The detective keeps a running tally of who looks like a winner.
    2. The "Tie-Aware" Twist: Because the detective knows there are MM winners, it stops wasting energy trying to rank the winners against each other. Instead, it focuses its energy on proving that the current top group is definitely better than the "losers."
    3. Stopping: It uses a special "Stop Sign" (a statistical rule). As soon as the evidence is strong enough to say, "These MM people are the best, and everyone else is worse," it stops. It doesn't care which of the MM is the absolute best; it just picks one and says, "Done!"

4. Why This Matters (The "So What?")

This isn't just about ice cream or detectives. This logic applies to:

  • Clinical Trials: If three different drugs work equally well, you don't need to run expensive tests to see which one is slightly better. You just need to confirm they are all better than the placebo and pick one. This saves millions of dollars and time.
  • A/B Testing: If you are testing website designs and find three that perform equally well, you can stop testing immediately and launch any of them.
  • Hyperparameter Tuning: In AI, if you find three settings that give the same best result, you can stop searching and use one.

Summary in One Sentence

This paper proves that if you know how many "winners" exist in a competition, you can design a smarter strategy to find one of them much faster and cheaper than if you were guessing how many there were.

The Takeaway: Knowledge of the "tie count" is a superpower that lets you stop the game earlier and win with fewer moves.