bsort: A theoretically efficient non-comparison-based sorting algorithm for integer and floating-point numbers

Imagine you have a massive, chaotic pile of library books. Your goal is to organize them alphabetically.

Most people (and most computer programs) use a method called Comparison Sorting. This is like asking a librarian: "Is Book A before Book B?" They pick up two books, compare them, swap them if needed, and repeat this millions of times. It works, but it's slow because you have to check every single pair against every other pair. The paper calls this the "Ω(n log n)" limit—a fancy way of saying, "You can't get much faster than this using just comparisons."

The author of this paper, Benjamín Guzmán, introduces a new method called bsort. Instead of asking "Is A before B?", bsort looks at the books' spines and sorts them by their letters, one by one, from left to right.

Here is the simple breakdown of how it works, why it's special, and where it struggles.

1. The Core Idea: Sorting by "Bits"

Think of every number (whether it's an integer like 42 or a decimal like 3.14) as a long string of light switches (bits). Some switches are ON (1), and some are OFF (0).

The Old Way (Comparison): You pick two numbers and ask, "Which is bigger?"
The bsort Way: You look at the first switch (the most significant bit) of all the numbers at once.
- If the switch is OFF (0), you put that number in the "Left Pile."
- If the switch is ON (1), you put that number in the "Right Pile."

Once you've sorted everyone into Left and Right piles based on that first switch, you take the Left Pile and look at the second switch. You split them again. Then the third switch, and so on, until you've checked every single switch.

The Analogy: Imagine sorting a crowd of people by height.

Comparison Sort: You grab two people, ask "Who is taller?", and swap them. You do this until everyone is in line.
bsort: You ask everyone to step to the left if they are under 5 feet, and to the right if they are over. Then, within the "under 5 feet" group, you ask who is under 4 feet, and so on. You are sorting by "height buckets" rather than comparing individuals.

2. The Magic Trick: Handling Negative Numbers and Decimals

The tricky part is that computers store negative numbers and decimals (floating-point numbers) in a weird way that doesn't look like normal math when you just look at the switches.

The Problem: If you just look at the switches, a negative number might look "bigger" than a positive one because of how the first switch is set. It's like if the library sorted books by the color of the spine, but the red spines (negative numbers) were supposed to go after the blue ones, even though the sorting machine thinks red comes first.
The bsort Solution: The author realized that for the very first step (the first switch), you just need to flip the rules.
- For the first switch, tell the "ON" group to go to the Left and the "OFF" group to the Right.
- For all the subsequent switches, go back to the normal rules.
- For decimals, the algorithm sorts them in three stages: Sign (Positive/Negative) $\rightarrow$ Exponent (How big the number is) $\rightarrow$ Mantissa (The specific digits). It's like sorting books first by "Fiction vs. Non-Fiction," then by "Genre," then by "Author Name."

3. Why It's Theoretically Awesome

In the world of computer science, this algorithm is a dream come true for specific situations.

Speed: It runs in O(n × w) time. If n is the number of items and w is the number of bits (switches), this is basically linear. If you double the number of books, the time doubles. It doesn't get exponentially slower like the comparison method.
Memory: It is in-place. It doesn't need a second pile of books to sort; it just shuffles the books around in the original pile. It uses almost no extra memory.

4. The Catch: Why It's Not Perfect Yet

If it's so fast, why isn't it in every computer right now? The paper admits that while the math says it should be the fastest, the real world (computer hardware) has some quirks.

The "Guessing Game" Problem: Modern computers are like super-fast chefs who guess what ingredient they need next to keep cooking. bsort asks a lot of "If this switch is ON, go left; if OFF, go right" questions. On random data, the computer can't guess right, so it keeps stopping to reset its brain (called branch misprediction). This slows it down.
The "Too Many Steps" Problem: Because bsort looks at every single bit for every single number, it has to scan the whole pile of books 64 times (for a 64-bit number). Other smart algorithms (called "Hybrids") stop scanning early if the pile is small enough and switch to a faster, simpler method. bsort keeps going until the very last bit, which creates a lot of unnecessary work.
The "Messy Desk" Problem: The way bsort is written uses a lot of "recursion" (calling itself over and over). This fills up the computer's short-term memory (cache) with "to-do lists" instead of the actual data, causing the computer to trip over its own feet.

5. The Verdict

bsort is a brilliant, unified sorting method that can handle integers and decimals with the same logic.

When it wins: It is incredibly fast for small numbers (like 8-bit or 16-bit data) where the "switches" are few.
When it loses: For huge numbers (64-bit), the overhead of checking every single bit and the computer's inability to guess the next step makes it slightly slower than the current "champions" (like std::sort in C++).

The Future: The author suggests that if we combine bsort with the "smart stopping" of hybrid algorithms and use special computer instructions (SIMD) to check multiple bits at once, it could become the fastest sorting algorithm in the world.

In a nutshell: bsort is like a super-organized librarian who sorts books by scanning their barcodes bit-by-bit. It's theoretically perfect and uses almost no space, but right now, it's a bit too rigid and gets confused by the computer's speed, making it slightly slower than the current top-tier librarians for very large libraries. But with a few upgrades, it could take the crown.

Here is a detailed technical summary of the paper "bsort: A theoretically efficient non-comparison-based sorting algorithm for integer and floating-point numbers."

1. Problem Statement

Sorting is a fundamental computer science problem. While comparison-based algorithms (like Quicksort or Introsort) have a theoretical lower bound of $O(n \log n)$ , non-comparison-based algorithms (like Radix Sort) can achieve linear time $O(n)$ . However, existing non-comparison methods face specific limitations:

Binary Quicksort: Efficient and in-place but traditionally limited to equally-signed integers. It fails on mixed-sign integers and floating-point numbers because standard bitwise lexicographical ordering does not match numerical ordering for these types (e.g., negative numbers have a '1' in the Most Significant Bit (MSB), placing them "after" positive numbers in a naive bitwise sort).
Existing Solutions: Handling signed integers or floating-point values often requires complex, out-of-place methods or separate algorithms for different data types, lacking a unified, in-place approach.

The paper aims to present bsort, a unified, in-place, non-comparison-based algorithm capable of sorting signed integers, unsigned integers, and floating-point values with $O(wn)$ time complexity and $O(w)$ auxiliary space, where $w$ is the word size.

2. Methodology

The core of bsort is a modified Binary Quicksort that operates recursively on bits from the Most Significant Bit (MSB) to the Least Significant Bit (LSB).

A. Core Mechanism: Bitwise Partitioning

The algorithm uses a singlePassBSort function to partition an array based on a specific bit mask. Elements are swapped such that those with a '0' at the current bit position are grouped on one side, and '1's on the other. This is repeated recursively for the next bit.

B. Handling Signed Integers

Standard binary quicksort fails on signed integers because the MSB acts as a sign bit (1 for negative, 0 for positive), inverting the natural order.

Solution: The algorithm inverts the sort direction for the first pass (the MSB). By sorting the sign bit in descending order (placing negatives before positives for an ascending overall sort), it corrects the initial partition. Subsequent passes on lower bits proceed normally.

C. Handling Floating-Point Values

Floating-point numbers (IEEE-754) are more complex due to the sign, exponent, and mantissa fields. The paper proves that sorting must follow a strict hierarchical order: Sign $\to$ Exponent $\to$ Mantissa.

Sign Sort: Similar to signed integers, the first pass inverts the direction to separate negative and non-negative numbers.
Exponent Sort: The algorithm recursively sorts the exponent bits. Crucially, for negative numbers, the exponent must be sorted in descending order (larger magnitude negative numbers have smaller exponents in two's complement logic relative to the sign), while positive numbers are sorted in ascending order.
Mantissa Sort: Once sign and exponent are fixed, the remaining bits (mantissa) are sorted as unsigned integers.

Special Values: The algorithm naturally handles IEEE-754 special cases ( $\pm \infty$ , NaN, $\pm 0$ ) correctly due to the monotonic properties of the bit patterns and the bias applied to exponents.

D. Algorithms

Algorithm 3 (bsortSigned): Handles signed integers by flipping the initial sort direction.
Algorithm 4 & 5 (bsortFloat): Implements the three-stage recursive sort (Sign, Exponent, Mantissa) for floating-point numbers.

3. Key Contributions

Unified Algorithm: A single algorithmic framework that handles signed integers, unsigned integers, and floating-point numbers without requiring type-specific wrappers or out-of-place memory allocation.
In-Place Operation: Unlike many Radix Sort variants that require $O(n)$ auxiliary space, bsort operates in-place with only $O(w)$ stack space (for recursion depth).
Theoretical Proof of Correctness: The paper provides formal proofs (Theorems 1–3) demonstrating that the specific hierarchical sorting order (Sign $\to$ Exponent $\to$ Mantissa) is both necessary and sufficient for correctness.
Complexity Analysis:
- Time Complexity: $O(wn)$ . Since $w$ is constant for a given data type (e.g., 32 or 64), this is effectively linear $O(n)$ .
- Space Complexity: $O(w)$ auxiliary space (stack depth), making it highly memory-efficient.

4. Empirical Results

The author implemented bsort in C++ and compared it against Introsort (C++ std::sort), Spreadsort, and Radix Sort (ska_sort) on a 64-bit Linux system.

Small Word Sizes (e.g., 8-bit char): bsort outperformed all competitors, including hybrid algorithms. This confirms the theoretical advantage when $w$ is small and $n$ is large.
Large Word Sizes (e.g., 64-bit double): bsort's performance degraded relative to hybrid algorithms.
Bottlenecks Identified: Profiling revealed three main causes for performance loss on large data:
1. Branch Misprediction: Random data causes ~50% branch misprediction in the partitioning loop, flushing the CPU pipeline.
2. Stack Pollution/Cache Misses: The rigid recursive structure causes high L1 D-cache miss rates and register spilling, unlike hybrid algorithms that switch to iterative methods for small partitions.
3. Instruction Volume: bsort scans the entire array $w$ times (e.g., 64 times for 64-bit integers), whereas Introsort scans roughly $\log n$ times.

5. Significance and Future Work

Theoretical vs. Practical: The paper successfully bridges the gap between theoretical non-comparison sorting and practical implementation for mixed data types. It proves that a unified, in-place bitwise sort is mathematically sound.
Performance Potential: While currently outperformed by hybrid algorithms on large word sizes due to microarchitectural inefficiencies, bsort demonstrates that the core bitwise mechanism is highly competitive for small data types.
Future Directions: The author suggests transforming bsort into a hybrid algorithm (switching to iterative or cache-friendly methods for small partitions) and utilizing SIMD instructions and branchless operations (conditional moves) to mitigate branch misprediction and improve instruction-level parallelism.

Conclusion

bsort is a significant contribution to sorting theory, offering a unified, memory-efficient solution for diverse numeric types. While its current monolithic recursive implementation faces hardware bottlenecks on large word sizes, it establishes a strong foundation for future high-performance, non-comparison sorting algorithms that could rival or exceed current state-of-the-art hybrid methods once architectural optimizations are applied.