A Lock-Free Work-Stealing Algorithm for Bulk Operations

This paper presents a specialized lock-free work-stealing queue designed for a master-worker framework in mixed-integer programming solvers that leverages restricted concurrency assumptions to support native bulk operations and achieve constant-latency push performance, significantly outperforming general-purpose implementations like C++ Taskflow in batch processing scenarios.

Raja Sai Nandhan Yadav Kataru, Danial Davarnia, Ali Jannesari

Published Mon, 09 Ma
📖 5 min read🧠 Deep dive

Imagine you are running a massive, high-stakes construction project. You have a team of workers (the computer cores) and a foreman (the master thread). Their job is to build a complex structure by breaking it down into thousands of tiny tasks (like laying bricks or pouring concrete).

In a perfect world, every worker would have exactly the same amount of work. But in reality, some tasks are quick, and some take forever. This creates a problem: some workers finish early and sit idle, while others are drowning in work.

The Problem: The Old Way of Sharing Work

To fix this, computer scientists use a technique called "Work Stealing."
Think of it like a buffet. Each worker has their own private plate (a queue).

  • The Owner: A worker puts new tasks they create onto the top of their plate.
  • The Stealer: If a worker runs out of food, they look at their neighbor's plate and "steal" a few tasks from the bottom to keep working.

The Issue: Most existing systems are built like a chaotic, crowded cafeteria.

  1. One at a time: If you need 100 tasks, you have to walk over, grab one, walk back, grab another, and repeat 100 times. This is slow and tiring.
  2. Too many thieves: In general systems, anyone can steal from anyone at any time. This causes a traffic jam at the buffet counter (contention), where everyone is bumping into each other.
  3. Fragile plates: Some plates are fixed size. If you get too much food, the plate breaks and you have to find a bigger one and move everything over (resizing), which wastes time.

The Solution: A Specialized "Bulk" Conveyor Belt

The authors of this paper built a brand new, specialized conveyor belt just for their specific type of construction project (solving complex math problems called Mixed-Integer Programming).

Here is how their new system works, using simple analogies:

1. The "Bulk" Delivery (Native Bulk Operations)

Instead of grabbing one brick at a time, imagine a worker creates a whole pallet of 100 bricks.

  • Old Way: The worker has to carry one brick, drop it, go back, carry another.
  • New Way: The worker slides the entire pallet onto the conveyor belt in one smooth motion.
  • Result: The time it takes to deliver 100 bricks is almost the same as delivering 1. This is called constant latency. It doesn't matter how big the batch is; the speed stays the same.

2. The "Single Thief" Rule (One Owner, One Stealer)

In their specific project, there is only one foreman responsible for balancing the load.

  • Old Way: Imagine 50 people trying to grab food from the same buffet line at once. It's a mess.
  • New Way: Only the foreman is allowed to steal. He walks down the line, sees a worker with too much, and takes a big chunk.
  • Result: Because there's only one thief, there's no fighting or traffic jams. The system doesn't need heavy locks or security guards (synchronization) to keep order. It's much faster and lighter.

3. The "Infinite" Conveyor (Unbounded Growth)

Sometimes, a single task explodes into hundreds of new tasks instantly.

  • Old Way: The plate is too small. You have to stop, find a bigger plate, and carefully move every single item to the new one.
  • New Way: The conveyor belt is made of magic links. It can grow infinitely long without ever needing to stop and reorganize. You just keep adding links.

The Results: Why It Matters

The authors tested their new conveyor belt against the standard ones used in big software libraries (like C++ Taskflow).

  • Pushing Tasks: When they added huge batches of tasks, the old systems got slower and slower (like a traffic jam). The new system stayed fast and steady, no matter how big the batch was.
  • Stealing Tasks: When the foreman needed to steal a large portion of work (say, 50% of a worker's plate), the old systems got very slow. The new system was stable and fast.
  • The "Optimized" Trick: They found that if the worker wasn't busy, the foreman could steal even faster by skipping a step. This made the process 3 times faster in common situations.

The Bottom Line

This paper isn't about building a better tool for every job in the world. It's about building the perfect tool for a very specific, messy job.

By realizing that their specific problem only needed one foreman and bulk deliveries, they were able to throw away all the heavy, complicated rules that general systems need. The result is a system that is incredibly efficient for their specific type of math-solving, proving that sometimes, the best way to be fast is to stop trying to be a "one-size-fits-all" solution and just build exactly what you need.