bsort: A theoretically efficient non-comparison-based sorting algorithm for integer and floating-point numbers

The paper introduces bsort, a non-comparison-based sorting algorithm that unifies the sorting of signed/unsigned integers and floating-point numbers with an asymptotic time complexity of O(wn)O(wn) and auxiliary space of O(w)O(w), demonstrating competitive performance against optimized hybrid libraries for small word sizes.

Benjamín Guzmán

Published Wed, 11 Ma
📖 6 min read🧠 Deep dive

Imagine you have a massive, chaotic pile of library books. Your goal is to organize them alphabetically.

Most people (and most computer programs) use a method called Comparison Sorting. This is like asking a librarian: "Is Book A before Book B?" They pick up two books, compare them, swap them if needed, and repeat this millions of times. It works, but it's slow because you have to check every single pair against every other pair. The paper calls this the "Ω(n log n)" limit—a fancy way of saying, "You can't get much faster than this using just comparisons."

The author of this paper, Benjamín Guzmán, introduces a new method called bsort. Instead of asking "Is A before B?", bsort looks at the books' spines and sorts them by their letters, one by one, from left to right.

Here is the simple breakdown of how it works, why it's special, and where it struggles.

1. The Core Idea: Sorting by "Bits"

Think of every number (whether it's an integer like 42 or a decimal like 3.14) as a long string of light switches (bits). Some switches are ON (1), and some are OFF (0).

  • The Old Way (Comparison): You pick two numbers and ask, "Which is bigger?"
  • The bsort Way: You look at the first switch (the most significant bit) of all the numbers at once.
    • If the switch is OFF (0), you put that number in the "Left Pile."
    • If the switch is ON (1), you put that number in the "Right Pile."

Once you've sorted everyone into Left and Right piles based on that first switch, you take the Left Pile and look at the second switch. You split them again. Then the third switch, and so on, until you've checked every single switch.

The Analogy: Imagine sorting a crowd of people by height.

  • Comparison Sort: You grab two people, ask "Who is taller?", and swap them. You do this until everyone is in line.
  • bsort: You ask everyone to step to the left if they are under 5 feet, and to the right if they are over. Then, within the "under 5 feet" group, you ask who is under 4 feet, and so on. You are sorting by "height buckets" rather than comparing individuals.

2. The Magic Trick: Handling Negative Numbers and Decimals

The tricky part is that computers store negative numbers and decimals (floating-point numbers) in a weird way that doesn't look like normal math when you just look at the switches.

  • The Problem: If you just look at the switches, a negative number might look "bigger" than a positive one because of how the first switch is set. It's like if the library sorted books by the color of the spine, but the red spines (negative numbers) were supposed to go after the blue ones, even though the sorting machine thinks red comes first.
  • The bsort Solution: The author realized that for the very first step (the first switch), you just need to flip the rules.
    • For the first switch, tell the "ON" group to go to the Left and the "OFF" group to the Right.
    • For all the subsequent switches, go back to the normal rules.
    • For decimals, the algorithm sorts them in three stages: Sign (Positive/Negative) \rightarrow Exponent (How big the number is) \rightarrow Mantissa (The specific digits). It's like sorting books first by "Fiction vs. Non-Fiction," then by "Genre," then by "Author Name."

3. Why It's Theoretically Awesome

In the world of computer science, this algorithm is a dream come true for specific situations.

  • Speed: It runs in O(n × w) time. If n is the number of items and w is the number of bits (switches), this is basically linear. If you double the number of books, the time doubles. It doesn't get exponentially slower like the comparison method.
  • Memory: It is in-place. It doesn't need a second pile of books to sort; it just shuffles the books around in the original pile. It uses almost no extra memory.

4. The Catch: Why It's Not Perfect Yet

If it's so fast, why isn't it in every computer right now? The paper admits that while the math says it should be the fastest, the real world (computer hardware) has some quirks.

  • The "Guessing Game" Problem: Modern computers are like super-fast chefs who guess what ingredient they need next to keep cooking. bsort asks a lot of "If this switch is ON, go left; if OFF, go right" questions. On random data, the computer can't guess right, so it keeps stopping to reset its brain (called branch misprediction). This slows it down.
  • The "Too Many Steps" Problem: Because bsort looks at every single bit for every single number, it has to scan the whole pile of books 64 times (for a 64-bit number). Other smart algorithms (called "Hybrids") stop scanning early if the pile is small enough and switch to a faster, simpler method. bsort keeps going until the very last bit, which creates a lot of unnecessary work.
  • The "Messy Desk" Problem: The way bsort is written uses a lot of "recursion" (calling itself over and over). This fills up the computer's short-term memory (cache) with "to-do lists" instead of the actual data, causing the computer to trip over its own feet.

5. The Verdict

bsort is a brilliant, unified sorting method that can handle integers and decimals with the same logic.

  • When it wins: It is incredibly fast for small numbers (like 8-bit or 16-bit data) where the "switches" are few.
  • When it loses: For huge numbers (64-bit), the overhead of checking every single bit and the computer's inability to guess the next step makes it slightly slower than the current "champions" (like std::sort in C++).

The Future: The author suggests that if we combine bsort with the "smart stopping" of hybrid algorithms and use special computer instructions (SIMD) to check multiple bits at once, it could become the fastest sorting algorithm in the world.

In a nutshell: bsort is like a super-organized librarian who sorts books by scanning their barcodes bit-by-bit. It's theoretically perfect and uses almost no space, but right now, it's a bit too rigid and gets confused by the computer's speed, making it slightly slower than the current top-tier librarians for very large libraries. But with a few upgrades, it could take the crown.