Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization

This paper introduces Adaptive Candidate Thompson Sampling (ACTS), a method that improves high-dimensional Bayesian optimization by adaptively reducing the search space during candidate point generation using surrogate model gradients, thereby overcoming the sparsity issues inherent in fixed discretizations.

Original authors: Donney Fan, Geoff Pleiss

Published 2026-04-13
📖 5 min read🧠 Deep dive

This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper. Read full disclaimer

Imagine you are trying to find the highest peak in a massive, foggy mountain range. You can't see the whole landscape, and every time you want to check the height of a spot, you have to send a very expensive, slow drone to fly there and report back. This is the problem of Bayesian Optimization: finding the best solution (the highest peak) with as few expensive checks as possible.

For a long time, computers have been good at this in small, simple mountains (low dimensions). But when the mountain range gets huge and complex (high dimensions, like tuning a machine learning model with thousands of settings), the old methods break down.

Here is a simple breakdown of the paper's solution, Adaptive Candidate Thompson Sampling (ACTS), using some everyday analogies.

The Problem: The "Grid Search" Trap

Imagine you are looking for the highest point in a 100-dimensional space. To do this, a computer usually picks a bunch of random spots to check (like dropping pins on a map).

  • The Old Way: If you have a 2D map, you can drop 100 pins and cover the whole area nicely. But if you have a 100D map, you need astronomical numbers of pins just to cover the space even a little bit. It's like trying to find a specific grain of sand on a beach by dropping a few grains of sand randomly; you'll likely miss the spot entirely.
  • The Result: The computer picks pins that are far apart, misses the "good" areas, and wastes its budget checking empty spots.

The Solution: ACTS (The "Smart Hiker")

The authors propose a new way to pick those pins. Instead of dropping them randomly everywhere, they use a "Smart Hiker" strategy.

1. The "Ghost Map" (The Posterior Sample)

The computer builds a "Ghost Map" (a mathematical model) of the mountain based on what it has seen so far. It knows where it's been, but it's guessing what the rest looks like.

  • Old Method: The computer picks a random spot on this Ghost Map and says, "Let's go there!" But because the map is so big, the random spot might be in a valley or a flat plain, far from the peak.

2. The "Compass" (The Gradient)

This is the magic of ACTS. Before picking a spot, the computer asks the Ghost Map: "If I were standing right here, which way would the ground go UP?"

  • It calculates a gradient (a compass needle pointing uphill).
  • Crucially, this compass is based on a random guess of what the mountain looks like. Sometimes the compass points North, sometimes South. This randomness is good because it keeps the search from getting stuck in one spot.

3. The "Flashlight Cone" (The Adaptive Search Space)

Instead of looking at the entire mountain, the computer turns on a flashlight.

  • It shines the flashlight only in the direction the compass is pointing.
  • It creates a cone (a narrow, focused search area) starting from where it is now and stretching out in that uphill direction.
  • The Analogy: Imagine you are in a dark forest. Instead of walking randomly in every direction, you look at your compass, see which way is "uphill," and only drop your pins in a narrow cone in front of you.

4. Packing the Pins (Density)

Because the computer is only looking at this tiny "cone" instead of the whole mountain, it can drop way more pins in that small area.

  • The Benefit: It's like zooming in on a map. If you zoom in, you can see the details. By focusing only on the "uphill" direction, the computer finds the local peak of that specific "Ghost Map" much more accurately.

Why is this better?

  • Efficiency: It stops wasting time checking flat areas or valleys.
  • Accuracy: It finds the "best" spot on the computer's current guess much faster.
  • Safety: You might worry, "What if the compass points the wrong way and we miss the real peak?" The paper proves that because the "Ghost Map" changes every time (it's random), the compass will eventually point in the right direction. It won't get stuck in a local loop forever; it will eventually explore the whole mountain.

The Real-World Result

The authors tested this on real problems, like tuning robot controllers and designing new molecules.

  • Before: The computer was like a tourist wandering aimlessly in a huge city, hoping to stumble upon the best restaurant.
  • With ACTS: The computer is like a food critic who asks a local for a direction, then focuses their search entirely on that specific neighborhood, checking every single restaurant there before moving on.

In short: ACTS solves the "too many choices" problem by using a smart, temporary compass to narrow the search down to the most promising direction, allowing the computer to look much more closely at the right place.

Drowning in papers in your field?

Get daily digests of the most novel papers matching your research keywords — with technical summaries, in your language.

Try Digest →