Aquifer: Hierarchical Memory Pooling with CXL and RDMA for MicroVM Snapshots

Aquifer is a novel hierarchical memory pooling system that combines low-latency CXL and cluster-wide RDMA to optimize MicroVM snapshot restoration by employing a hotness-based format, an ownership-based coherence protocol, and a hybrid copy-and-demand paging mechanism, thereby significantly reducing cold-start latency compared to existing solutions.

Original authors: Junliang Hu, Huaicheng Li, Ming-Chang Yang

Published 2026-06-24

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine a massive cloud data center as a giant hotel with thousands of rooms (servers). Each room has a specific amount of "memory" (like a desk space for the computer to work on).

The Problem: The "Empty Desk" Waste
In this hotel, a strange thing happens. Often, a room's CPU (the brain) is fully busy, but much of the desk space (memory) assigned to that room sits empty, and no other room can use it. The hotel owners call this "memory stranding." They are paying for thousands of desks that no one is sitting at. In fact, 25–35% of all the desks in the hotel are wasted.

The Solution: A Shared Workspace
To fix this, the hotel decides to build a giant, shared workspace in the basement where everyone can pull desks from when they need them. This is called "memory pooling."

However, there are two types of shared workspaces, and neither is perfect on its own:

  1. The "Super-Fast" Local Shelf (CXL): This is a shelf right next to your room. You can grab a book from it almost instantly, like reaching for something on your own nightstand. But it's small: it can only serve a few rooms in your immediate hallway (a "pod").
  2. The "Huge" Basement Archive (RDMA): This is a massive warehouse in the basement that holds millions of books. It can serve the entire hotel. But it takes longer to walk down there, get the book, and bring it back. Plus, you have to fill out paperwork (software overhead) every time you go.

The Innovation: Aquifer
The researchers built a system called Aquifer that combines both. It uses the "Super-Fast Shelf" for the books you need right now and the "Huge Basement" for the books you might need later or never need at all.

Here is how Aquifer works, using simple analogies:

1. The "Smart Snapshot" (Cleaning the Book)

When a guest (a MicroVM) leaves a room, the hotel takes a "snapshot" (a photo) of their desk to save it for later.

  • The Old Way: They took a photo of the entire desk, including empty spaces, dust, and blank pages. This wasted huge amounts of storage.
  • Aquifer's Way: The researchers analyzed these snapshots and found that 83% of the desk was just empty space (zeros), and that 60% of the remaining non-empty space was never touched by the guest.
  • The Fix: Aquifer throws away the empty space entirely. It puts the "hot" books (the ones the guest actually used) on the Super-Fast Shelf and the "cold" books (the ones they didn't touch) in the Huge Basement. This saves massive amounts of space.
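
To make the sorting step concrete, here is a minimal Python sketch of the idea. It is only an illustration, not the paper's actual snapshot format: the 4 KiB page size, the function name `split_snapshot`, and the `accessed` set (a record of which pages the guest touched) are all assumptions made for this example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def split_snapshot(pages, accessed):
    """Drop all-zero pages and split the rest into hot and cold sets.

    pages:    dict mapping page number -> bytes of length PAGE_SIZE
    accessed: set of page numbers the guest touched (hypothetical record)
    """
    zero_page = bytes(PAGE_SIZE)
    hot, cold, dropped = {}, {}, 0
    for page_number, data in pages.items():
        if data == zero_page:
            dropped += 1              # empty space: not stored anywhere
        elif page_number in accessed:
            hot[page_number] = data   # would go to the fast CXL tier
        else:
            cold[page_number] = data  # would go to the large RDMA tier
    return hot, cold, dropped


# Toy snapshot: two empty pages, one touched page, one untouched page.
pages = {
    0: bytes(PAGE_SIZE),
    1: b"\x01" * PAGE_SIZE,
    2: b"\x02" * PAGE_SIZE,
    3: bytes(PAGE_SIZE),
}
hot, cold, dropped = split_snapshot(pages, accessed={1})
print(sorted(hot), sorted(cold), dropped)  # [1] [2] 2

# Arithmetic from the figures quoted above: if 83% of a snapshot is zeros
# and 60% of the rest is never touched, the hot share is 17% * 40%.
zero_fraction = 0.83
cold_fraction_of_nonzero = 0.60
hot_fraction = (1 - zero_fraction) * (1 - cold_fraction_of_nonzero)
print(f"hot share of the snapshot: {hot_fraction:.1%}")
```

Under those quoted figures, only about 7% of a snapshot would need to live on the fast shelf, which is why a small CXL tier can still serve many guests.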

2. The "Traffic Cop" (Keeping Order)

Since multiple guests might want to read the same "hot" books from the Super-Fast Shelf at the same time, there's a risk of chaos. If one guest changes a page, another might see the old version.

  • The Problem: The Super-Fast Shelf (CXL 2.0) doesn't have a built-in "traffic cop" to tell everyone when a book has been updated.
  • The Fix: Aquifer invented a new rulebook. It treats the "Pool Master" (the librarian) as the only person allowed to write or update the books. Everyone else is just a "borrower" who can only read.
  • The Mechanism: Before a guest grabs a book, they check a counter. If the librarian is trying to update the book, the guest waits. If the book is safe, the guest grabs it, reads it, and then updates the counter to say, "I'm done." This ensures everyone sees the correct version without needing expensive hardware help.
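
The counter check can be sketched in a few lines of Python. This is a simplified single-machine model, not Aquifer's actual protocol: the class `SharedPage`, its method names, and the use of `threading.Condition` (standing in for whatever shared-memory primitives the real system uses) are assumptions for illustration.

```python
import threading


class SharedPage:
    """One shared page with a single writer and many read-only borrowers."""

    def __init__(self, data):
        self._data = data
        self._readers = 0             # the "counter" borrowers update
        self._update_pending = False  # set while the pool master is writing
        self._cond = threading.Condition()

    def borrow(self):
        """Borrower path: wait for any pending update, read, then sign out."""
        with self._cond:
            while self._update_pending:
                self._cond.wait()
            self._readers += 1        # "I'm reading"
        try:
            return bytes(self._data)  # copy the page contents
        finally:
            with self._cond:
                self._readers -= 1    # "I'm done"
                self._cond.notify_all()

    def update(self, new_data):
        """Pool-master path: block new readers, drain current ones, write."""
        with self._cond:
            self._update_pending = True
            while self._readers > 0:
                self._cond.wait()
            self._data = new_data
            self._update_pending = False
            self._cond.notify_all()


page = SharedPage(b"version 1")
print(page.borrow())   # b'version 1'
page.update(b"version 2")
print(page.borrow())   # b'version 2'
```

Because only one party ever writes, borrowers never have to coordinate with each other; they only need to agree with the pool master on whether an update is in progress.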

3. The "Pre-Loading" Trick (Speeding Up the Start)

When a guest returns to their room, the hotel needs to set up their desk quickly so they can start working.

  • The Old Way: The guest would sit down, reach for a book, realize it's missing, run to the basement, get it, and come back. This happens for every single book, causing a long delay (the "cold start").
  • Aquifer's Way:
    • Step 1: Before the guest even sits down, the hotel staff runs to the Super-Fast Shelf and copies all the "hot" books directly onto the guest's desk. Because the shelf is so close, this copy is very fast.
    • Step 2: If the guest reaches for a "cold" book (one that was in the basement), the staff sends a request to the basement in the background while the guest starts working on the books that are already there. The guest never has to wait for the basement delivery to start working.
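
The two steps can be modeled with a short Python sketch. It is a toy under stated assumptions rather than the real mechanism: the class `RestoredMemory` is hypothetical, plain dictionaries stand in for the CXL and RDMA tiers, and the cold fetch is done synchronously here, whereas the text above describes it happening in the background. Pages found in neither tier are treated as zero-filled, on the assumption that they were the empty space discarded at snapshot time.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class RestoredMemory:
    """Guest memory restored by eager copy (hot) plus demand paging (cold)."""

    def __init__(self, hot_pages, cold_store):
        # Step 1: copy every hot page into local memory before the guest runs.
        self.local = dict(hot_pages)
        self.cold_store = cold_store
        self.faults = 0  # number of pages fetched on demand

    def read(self, page_number):
        if page_number not in self.local:
            # Step 2: first touch of a non-local page triggers a fetch.
            self.faults += 1
            zero_page = bytes(PAGE_SIZE)
            self.local[page_number] = self.cold_store.get(page_number, zero_page)
        return self.local[page_number]


hot = {1: b"h" * PAGE_SIZE}
cold = {2: b"c" * PAGE_SIZE}
memory = RestoredMemory(hot, cold)

memory.read(1)        # already local, no fault
print(memory.faults)  # 0
memory.read(2)        # fetched from the cold tier on first touch
memory.read(2)        # now local, no second fault
print(memory.faults)  # 1
```

The design choice to copy hot pages up front means the guest's most likely accesses never fault at all, and each cold page costs at most one trip to the slower tier.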

The Results

The researchers tested this system on a simulated version of their hotel.

  • Speed: Aquifer was 2.2 times faster at getting guests started compared to the standard method (Firecracker).
  • Versus prior work: It was also 1.3 times faster than the next-best existing approach to this problem.

In Summary
Aquifer is like a smart hotel manager who realizes that most guests only use a few specific items on their desk. By throwing away the empty space, keeping the most-used items on a super-fast shelf, and pre-loading those items before the guest arrives, the system cuts the long wait times and wasted space that plague cloud data centers.
