Tree codes and sort-and-sweep algorithms for neighborhood computation: A cache-conscious comparison

This paper compares cache-conscious sort-and-sweep and tree-code algorithms for neighborhood computation in two-dimensional discrete element method simulations, finding that while tree codes offer slightly better performance and improved parallelization potential, they come at the cost of significantly increased code complexity.

Dominik Krengel, Yuki Watanabe, Ko Kandori, Jian Chen, Hans-Georg Matuttis

Published 2026-03-06
📖 5 min read🧠 Deep dive

Imagine you are the manager of a massive, chaotic dance floor filled with thousands of spinning, bouncing dancers (these are the particles in a computer simulation). Your job is to figure out who is bumping into whom so you can calculate the physics of the crash.

If you tried to check every single dancer against every other dancer to see if they are touching, you would have to do millions of checks. That would take forever, and your computer would get tired and slow down. This is the problem the paper solves.

The researchers compared two different "strategies" for organizing this dance floor to find collisions quickly: The "Sort-and-Sweep" Method and The "Tree Code" Method.

The Two Strategies

1. The "Sort-and-Sweep" Method (The Line-Up)

Imagine you have a long line of dancers. To find out who is touching, you line them all up from left to right based on their position.

  • How it works: You scan the line. If two dancers are next to each other in the line, you check if they are touching. If they are far apart in the line, you ignore them.
  • The Catch: Even though you only check neighbors, you have to re-sort the entire line every single time a dancer moves, even if they only moved a tiny bit. It's like having to re-organize the entire library every time one book is moved a few inches. It's efficient, but it involves a lot of "administrative work" (re-sorting) that doesn't always add value.

2. The "Tree Code" Method (The Family Tree)

Imagine instead of a line, you have a giant, magical tree structure.

  • How it works: You divide the dance floor into big squares. If a square is empty, you ignore it. If a square is crowded, you split it into four smaller squares, and then split those again until you find the tiny groups of dancers.
  • The Magic: When a dancer moves, you don't re-sort everyone. You just tell that one dancer, "Hey, you moved to the next room," and you update their spot on the tree. You only look at the specific branches of the tree where the action is happening.
  • The Benefit: It's much faster at updating because you only touch the parts of the system that actually changed.

The Big Race: Who Wins?

The researchers put these two methods to the test using a computer simulation of a rotating drum filled with up to 12,000 polygon-shaped particles (think of them as slightly stretched-out coins).

The Results:

  • Speed: The Tree Code was the winner. It was about 10% faster overall.
  • The Real Winner: The updating part of the Tree Code was 10 times faster than the Sort-and-Sweep method.
  • Why? In the Sort-and-Sweep method, the computer spends a lot of time shuffling lists around. In the Tree Code, the computer spends its time doing the actual work of checking collisions.

The Hidden Cost: "Cognitive Load"

Here is the twist. While the Tree Code is faster, it is much harder to write and understand.

  • Analogy: Imagine the Sort-and-Sweep method is like a simple recipe for a sandwich: "Put bread, then cheese, then ham." It's easy to follow.
  • The Tree Code is like a recipe for a complex soufflé that requires you to juggle 10 different bowls, check temperatures constantly, and fold ingredients in specific ways. It works better, but if you make a mistake, the whole thing collapses.

In computer science terms, the Tree Code has high "Cyclomatic Complexity." This is a fancy way of saying the code is a tangled web of "if this, then that" decisions. It's so complex that the authors joked it might be "untestable" by normal standards. However, for high-speed simulations, they decided the speed was worth the headache.

The "Cache" Factor (The Kitchen Counter)

The paper also talked about something called Cache Memory.

  • Analogy: Think of the CPU (the brain) as a chef and the Memory (RAM) as the pantry. The Cache is the kitchen counter right in front of the chef.
  • If the chef has to run to the pantry (RAM) every time they need an ingredient, they waste time. If they keep all the ingredients on the counter (Cache), they cook fast.
  • The researchers found that the Tree Code was better at keeping the necessary data on the "counter" (Cache), which is why it ran so smoothly, even on different types of computer chips (Intel vs. Apple Silicon).

The "Inlining" Trick

They also tried a trick called Inlining.

  • Analogy: Usually, when a chef needs a tool, they ask an assistant to fetch it (a function call). Inlining is like the chef keeping the tool in their own hand so they don't have to ask.
  • For small dance floors, this didn't help much. But for huge crowds (over 10,000 particles), keeping the tools in hand (Inlining) made the Tree Code even faster, though it made the "recipe" (the code) even more complicated.

The Bottom Line

If you are simulating a system with thousands of moving parts (like granular sand, rocks, or particles in a factory):

  1. Tree Codes are the speed demons. They are faster and better for parallel processing (using many computer cores at once).
  2. Sort-and-Sweep is the reliable, easier-to-maintain option, but it gets slower as the crowd gets bigger.
  3. The Trade-off: You get speed, but you have to deal with code that is incredibly complex and hard to debug.

The authors conclude that for modern, high-performance simulations, the Tree Code is the better choice, provided you have a team of programmers brave enough to handle the complexity!