Agent Hunt: Bounty-Based Collaborative Autoformalization With LLM Agents

This paper presents "Agent Hunt," a decentralized autoformalization system for algebraic topology that utilizes a simulated bounty-based marketplace to coordinate multiple LLM agents in dynamically proposing, competing to prove, and iteratively refining formal statements within an Interactive Theorem Proving environment.

Chad E. Brown, Cezary Kaliszyk, Josef Urban

Published Tue, 10 Ma

Imagine a massive, incredibly difficult math textbook on Algebraic Topology (a field that studies shapes and spaces). This textbook is like a giant, locked treasure chest. Inside are hundreds of theorems (mathematical truths) that need to be translated from human language into a strict, computer-readable language called "formal logic."

Usually, one very smart computer program (an AI agent) tries to do this alone. But in this paper, the researchers tried something different: They turned the translation job into a competitive video game with a cash prize.

Here is the story of "Agent Hunt," explained simply.

1. The Setup: The Bounty Board

Instead of giving one robot a long to-do list, the researchers set up a virtual marketplace.

  • The Boss: The researchers took the textbook and broke it down into 393 specific math problems (lemmas and theorems).
  • The Bounties: They attached a "price tag" to every problem. Some were worth a little, some a lot, depending on how hard they thought the proof would be.
  • The Players: They hired four AI agents (named Alice, Bob, Charlie, and Dave). Think of them as four highly skilled, digital detectives.

2. The Rules: How the Game Works

The agents didn't just work through the problem list one at a time; they competed and cooperated in a chaotic but organized way:

  • Locking the Job: If an agent saw a problem they wanted to solve, they could "lock" it by paying a small deposit. This meant, "I'm working on this! Don't touch it!"
  • The Prize: If they successfully proved the theorem and the computer checked it, they got the full bounty (the reward).
  • Stealing the Job: If an agent locked a problem but stalled, the lock would eventually expire, freeing another agent to take over. Whoever ultimately produced a proof that the checker accepted collected the reward.
  • Collaboration: Sometimes, an agent would solve a small piece of a problem (a sub-lemma) and put a new bounty on it, inviting others to help finish the bigger picture.
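The locking and reward rules above can be sketched as a tiny simulation. This is a minimal, illustrative sketch, not the paper's actual system: the class and method names, the deposit/refund bookkeeping, and the fixed lock timeout are all assumptions invented for this example.

```python
import time

# Assumed lock lifetime; the paper's actual expiry rule is not specified here.
LOCK_TTL = 60.0  # seconds

class BountyBoard:
    """Illustrative bounty marketplace: post problems, lock them with a
    deposit, and pay out when a proof passes the checker."""

    def __init__(self):
        self.bounties = {}   # problem_id -> reward (simulated tokens)
        self.locks = {}      # problem_id -> (agent, deposit, expiry_time)
        self.balances = {}   # agent -> token balance

    def post(self, problem_id, reward):
        """Put a new bounty on the board (also how sub-lemmas get posted)."""
        self.bounties[problem_id] = reward

    def lock(self, agent, problem_id, deposit, now=None):
        """Pay a deposit to reserve a problem. Fails while another agent
        holds a live lock; an expired lock can be taken over (the old
        holder forfeits their deposit)."""
        now = time.monotonic() if now is None else now
        current = self.locks.get(problem_id)
        if current is not None and current[2] > now:
            return False  # someone else is still working on it
        self.balances[agent] = self.balances.get(agent, 0) - deposit
        self.locks[problem_id] = (agent, deposit, now + LOCK_TTL)
        return True

    def claim(self, agent, problem_id, proof_checks):
        """If the proof checker accepts, pay the bounty; the claimant's own
        deposit (if they hold the lock) is refunded."""
        if not proof_checks or problem_id not in self.bounties:
            return False
        reward = self.bounties.pop(problem_id)
        holder, deposit, _ = self.locks.pop(problem_id, (None, 0, 0.0))
        refund = deposit if holder == agent else 0
        self.balances[agent] = self.balances.get(agent, 0) + reward + refund
        return True
```

For example, if Alice locks a lemma and stalls past the timeout, Bob can take the lock, finish the proof, and collect the full bounty plus his deposit back, while Alice is out only her small deposit.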

3. The Analogy: The "Construction Site"

Imagine a massive skyscraper needs to be built.

  • The Old Way: You hire one master architect who tries to design the whole building, floor by floor, alone. It takes forever, and if they get tired or stuck on the 50th floor, the whole project stalls.
  • The Agent Hunt Way: You put a "Wanted" poster on every single brick and beam.
    • Alice sees a blueprint for a window and says, "I'll build that!"
    • Bob sees a foundation beam and says, "I'll do that!"
    • If Alice gets stuck on the window, Bob might step in, finish it, and claim the reward.
    • If Alice builds a great window, she might say, "Hey, I need a special hinge for this," and put a small bounty on the hinge. Bob might then build the hinge to help Alice finish her window.

This creates a decentralized swarm where everyone is motivated by the reward to keep moving forward, rather than waiting for a central boss to give instructions.

4. The Results: Speed and Chaos

The experiment was a huge success in terms of speed:

  • The Speed: The four agents working together produced 39,000 lines of code in two days, roughly 19,500 lines a day.
  • The Comparison: A single agent working on a similar project previously managed only about 7,000 lines a day, so the "bounty market" nearly tripled the daily output.
  • The Economy: The agents earned "simulated money" (tokens). Some agents got rich by solving hard problems; others got stuck or had to reset their accounts when they made mistakes (like using the wrong version of the computer checker).

5. The Hiccups: When the Game Gets Weird

It wasn't all smooth sailing. The researchers noticed some funny and tricky situations:

  • The "Exercise" Trap: The textbook had practice problems at the end of chapters that didn't have answers. The AI agents tried to solve them anyway, spending hours on them only to get a tiny reward because the "price tag" was wrong. The researchers had to tell them, "Ignore those!"
  • The "Cosine" Bug: One of the hardest problems was proving a theorem about circles. The AI was trying to use a definition of "sine" and "cosine" that was mathematically broken (it allowed for multiple wrong answers). The agents were spinning their wheels, trying to prove things that were impossible with the broken definitions. The researchers realized the AI needed to rewrite the definitions, not just solve the proofs.
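The "Cosine" bug can be made concrete with a toy example in Lean. This is purely illustrative and not the paper's actual formalization: the names `sin'`, `cos'`, and `pyth` are invented, and the real broken definitions are not shown in this summary. The point is that an underspecified definition admits degenerate models, so true facts about the intended functions become unprovable.

```lean
-- Illustrative only (Lean 4 with Mathlib); not the paper's definitions.
import Mathlib.Data.Real.Basic

-- Suppose sine and cosine were "defined" only by the Pythagorean identity:
axiom sin' : ℝ → ℝ
axiom cos' : ℝ → ℝ
axiom pyth : ∀ x : ℝ, sin' x ^ 2 + cos' x ^ 2 = 1

-- These axioms are also satisfied by the constant functions
-- sin' = 0 and cos' = 1, so a true fact about real sine such as
--   ∃ x, sin' x = 1
-- cannot be proved from them. No amount of proof effort helps:
-- the definition, not the proof attempt, is what has to be fixed.
```

This is exactly the trap the agents fell into: they kept attacking proofs that were impossible given the definitions they had been handed.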

6. The Big Picture: Why This Matters

This experiment shows that AI doesn't have to work alone to solve big problems.

By creating a system where AI agents compete for rewards, they can:

  1. Scale up: Tackle projects too big for one brain.
  2. Self-Correct: If one agent gets stuck, another can jump in.
  3. Build Faster: The "market" forces them to prioritize the most important and valuable work.

In short: The researchers turned math proof-writing into a high-speed, competitive video game. By letting the AI agents "hunt" for bounties, they proved that a team of AI agents working together is much faster and more powerful than a single AI working alone.