Streami: An MPI Data-Parallel Library to Compute Field Lines on GPUs

This paper introduces Streami, an open-source, extensible GPU-accelerated library that interfaces with MPI applications to efficiently compute field lines in fluid flows for both post-hoc and in-situ analysis.

Original authors: Stefan Zellmann, Milan Jaros, Andrea Paris, Ingo Wald, Tatiana von Landesberger

Published 2026-06-03


Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you are trying to visualize the invisible currents of a massive, swirling storm inside a supercomputer. In the world of fluid dynamics, scientists use "field lines" (like streamlines) to draw the paths that tiny particles would take as they ride these currents. It's like dropping a million leaves into a river to see where the water is flowing.

The problem is that these simulations are huge. They run on supercomputers that have dozens of powerful graphics cards (GPUs) working together, split across many different machines. Usually, to draw these lines, you'd have to stop the simulation, copy all that massive data to a separate computer, and then try to draw it. But moving that much data is like trying to pour the entire ocean into a teacup; it's slow, expensive, and creates a bottleneck that stops everything.

Enter "Streami."

Think of Streami as a specialized, high-speed courier service that lives inside the supercomputer itself. Instead of moving the data out, Streami moves the "leaves" (the particles) directly between the different graphics cards that are already holding the data.

Here is how it works, broken down into simple concepts:

1. The "In-Situ" Delivery Service

Most visualization tools are like a delivery service that picks up a package, drives it to a warehouse, sorts it, and then ships it out. Streami is different. It's like a teleportation network built right into the factory floor.

  • The Setup: The supercomputer is divided into neighborhoods (data partitions), with each neighborhood managed by a specific GPU.
  • The Job: Streami lets a particle start in Neighborhood A and move through the flow. If it crosses the border into Neighborhood B, it is handed over (via an MPI message) to the GPU managing Neighborhood B, which continues tracing it from there.
  • The Benefit: No data ever leaves the supercomputer. The simulation and the visualization happen at the same time, on the same machines, without the slow "truck ride" of copying data.
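
To make the border-crossing idea concrete, here is a minimal C++ sketch of the bookkeeping after one movement step. It is not Streami's actual code: the names `Particle`, `ownerRank`, and `sortIntoOutboxes` are made up for illustration, and it assumes the simplest possible layout, where the domain is cut into equal slabs along one axis with one slab per GPU. The actual sending of each outbox (MPI in Streami's case) is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical particle record; not Streami's actual data layout.
struct Particle {
    int   id;
    float x;  // position along the axis the domain is split on
};

// Assumed layout: [0, domainSize) is cut into numRanks equal slabs along x,
// one slab ("neighborhood") per GPU/rank.
// Returns the owning rank, or -1 if the particle has left the domain.
int ownerRank(float x, float domainSize, int numRanks) {
    assert(numRanks > 0 && domainSize > 0.0f);
    if (x < 0.0f || x >= domainSize) return -1;
    int r = static_cast<int>(x / domainSize * numRanks);
    return r < numRanks ? r : numRanks - 1;  // guard against rounding up
}

// After a movement step on rank `self`, split the particles into those that
// stay local and one outbox per destination rank. Particles that left the
// domain are dropped, since their line has ended.
std::vector<std::vector<Particle>> sortIntoOutboxes(
        const std::vector<Particle>& particles,
        std::vector<Particle>& staying,
        int self, float domainSize, int numRanks) {
    std::vector<std::vector<Particle>> outbox(numRanks);
    for (const Particle& p : particles) {
        int dst = ownerRank(p.x, domainSize, numRanks);
        if (dst < 0) continue;                 // left the domain
        if (dst == self) staying.push_back(p); // still in our neighborhood
        else outbox[dst].push_back(p);         // must be handed to rank dst
    }
    return outbox;
}
```

In a real multi-GPU run, each non-empty outbox would then be sent to its destination rank, and every rank would add the particles it receives to its own list before the next step.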

2. The Two Layers of the Library

The paper describes Streami as having two layers:

  • The Low-Level Layer (The Engine): This is the heavy machinery, written in CUDA/C++. It's the part that actually calculates the math for every single particle, checks which neighborhood it's in, and handles the handoff of particles between GPUs. It's built for speed and uses C++ "templates" so it can adapt to different types of data grids without slowing down.
  • The High-Level Layer (The Dashboard): This is the user-friendly interface (written in C++). It's like the steering wheel and dashboard of a car. Scientists don't need to know how the engine works; they just tell the dashboard, "Draw me a stream of particles starting here," and the dashboard handles the complex math and communication behind the scenes.
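
The engine-and-dashboard split can be sketched in a few lines of plain C++. This is an illustration under stated assumptions, not Streami's API: `Vec2`, `rk4Step`, `traceLine`, and `RotationField` are hypothetical names, the example is 2D and CPU-only, and the fourth-order Runge-Kutta integrator is simply a common choice (this summary does not say which integrator Streami uses).

```cpp
#include <cassert>
#include <vector>

struct Vec2 { float x, y; };
inline Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
inline Vec2 operator*(float s, Vec2 a) { return {s * a.x, s * a.y}; }

// "Engine" (low level): one classic Runge-Kutta (RK4) step. It is a template,
// so it works with any Field type that offers `Vec2 sample(Vec2) const`,
// and the compiler generates specialized code for each one.
template <typename Field>
Vec2 rk4Step(const Field& f, Vec2 p, float dt) {
    Vec2 k1 = f.sample(p);
    Vec2 k2 = f.sample(p + (0.5f * dt) * k1);
    Vec2 k3 = f.sample(p + (0.5f * dt) * k2);
    Vec2 k4 = f.sample(p + dt * k3);
    return p + (dt / 6.0f) * (k1 + 2.0f * k2 + 2.0f * k3 + k4);
}

// "Dashboard" (high level): the caller only says where to start and how far
// to go, and gets back the points of the field line.
template <typename Field>
std::vector<Vec2> traceLine(const Field& f, Vec2 seed, float dt, int steps) {
    assert(steps >= 0 && dt > 0.0f);
    std::vector<Vec2> line{seed};
    for (int i = 0; i < steps; ++i)
        line.push_back(rk4Step(f, line.back(), dt));
    return line;
}

// Example field: rigid rotation about the origin, v(x, y) = (-y, x).
// Particles in this field travel on circles.
struct RotationField {
    Vec2 sample(Vec2 p) const { return {-p.y, p.x}; }
};
```

A caller would write something like `traceLine(RotationField{}, Vec2{1.0f, 0.0f}, 0.01f, 100)` and never touch `rk4Step` directly, which is the point of having a dashboard on top of the engine.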

3. Handling Different Terrains

Fluid simulations can be messy. Sometimes the data is a neat, uniform grid (like a checkerboard). Other times, it's a chaotic, jumbled mesh of shapes (like a pile of rocks).

  • Streami is extensible. It has a "universal translator" that can understand both the neat checkerboard grids and the messy rock piles.
  • If a scientist has a new, weird type of data, they can plug it into Streami's low-level engine without having to rebuild the whole system. The library figures out how to navigate the specific terrain of that data.
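
One common way to get this kind of plug-in behavior in C++ is to have the movement code ask only two questions of the data: "is this point inside you?" and "what is the flow velocity here?". The sketch below assumes exactly that two-function contract; `contains`, `sample`, `UniformGrid2D`, `ShearField`, and `eulerStep` are hypothetical names chosen for this example and are not taken from Streami. It uses a simple Euler step and 2D data to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// The "checkerboard": velocities stored at integer grid nodes (i, j),
// blended between nodes with bilinear interpolation.
class UniformGrid2D {
public:
    UniformGrid2D(int nx, int ny)
        : nx_(nx), ny_(ny),
          data_(static_cast<std::size_t>(nx) * ny, Vec2{0.0f, 0.0f}) {
        assert(nx >= 2 && ny >= 2);
    }
    Vec2& at(int i, int j) { return data_[j * nx_ + i]; }
    const Vec2& at(int i, int j) const { return data_[j * nx_ + i]; }

    bool contains(Vec2 p) const {
        return p.x >= 0 && p.y >= 0 && p.x <= nx_ - 1 && p.y <= ny_ - 1;
    }
    Vec2 sample(Vec2 p) const {
        assert(contains(p));
        int i = static_cast<int>(p.x), j = static_cast<int>(p.y);
        if (i > nx_ - 2) i = nx_ - 2;  // keep i+1 valid on the upper edge
        if (j > ny_ - 2) j = ny_ - 2;  // keep j+1 valid on the upper edge
        float u = p.x - i, v = p.y - j;
        const Vec2 &a = at(i, j), &b = at(i + 1, j);
        const Vec2 &c = at(i, j + 1), &d = at(i + 1, j + 1);
        float w00 = (1 - u) * (1 - v), w10 = u * (1 - v);
        float w01 = (1 - u) * v,       w11 = u * v;
        return {w00 * a.x + w10 * b.x + w01 * c.x + w11 * d.x,
                w00 * a.y + w10 * b.y + w01 * c.y + w11 * d.y};
    }
private:
    int nx_, ny_;
    std::vector<Vec2> data_;
};

// A "new, weird" data type: a formula instead of stored values.
struct ShearField {
    bool contains(Vec2) const { return true; }
    Vec2 sample(Vec2 p) const { return {p.y, 0.0f}; }
};

// Generic movement code: works unchanged for any type with contains/sample.
// Returns false (and leaves p alone) if p is outside the data.
template <typename Field>
bool eulerStep(const Field& f, Vec2& p, float dt) {
    if (!f.contains(p)) return false;
    Vec2 v = f.sample(p);
    p = {p.x + dt * v.x, p.y + dt * v.y};
    return true;
}
```

Supporting an unstructured mesh in this scheme would mean writing one more type with the same two functions (where `contains` and `sample` first have to find the mesh cell that holds the point); `eulerStep` itself would not change.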

4. Real-World Testing

The authors tested Streami on a cluster with 16 powerful GPUs. They tracked 100,000 particles moving through a simulated galaxy.

  • The Result: Moving all 100,000 particles one step forward took only about 1 to 2 milliseconds.
  • The Bottleneck: The only thing that slowed it down slightly was the "phone call" between the different computers (MPI communication) to say, "Hey, this particle is now in your neighborhood." Even then, it was very efficient.

Summary

In short, Streami is a tool that allows scientists to draw flow lines (like wind or water currents) directly inside a massive supercomputer while the simulation is running. It avoids the slow, painful process of copying huge amounts of data. Instead, it passes particles between the graphics cards that already hold the data, making it possible to visualize complex, massive fluid flows in real-time or near real-time.

The authors have made this tool open-source, meaning anyone can use it to build their own "interactive seed point placement" apps (where you can click and drop virtual leaves into a simulation to see where they go) or integrate it into their own scientific workflows.
