pathsig: A GPU-Accelerated Library for Truncated and Projected Path Signatures

The paper introduces **pathsig**, a PyTorch-native, GPU-accelerated library that significantly outperforms existing tools in computing and training with truncated and projected path signatures by leveraging parallel CUDA kernels for high throughput and minimal memory usage.

Tobias Nygaard

Published 2026-03-02

Imagine you are trying to describe a complex journey to a friend. You could just say, "We went from A to B," but that misses the story. Did you take a scenic route? Did you stop for coffee? Did you zigzag through traffic?

In the world of machine learning, Path Signatures are a mathematical tool designed to tell that full story. They turn a messy, winding line of data (like a stock price, a heartbeat, or a robot's movement) into a rich, detailed "fingerprint" that a computer can understand.
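To make the "A to B misses the story" point concrete, here is a minimal sketch in plain Python. It is not pathsig's API; `signature_depth2` is a hypothetical helper written for this illustration. It computes the first two levels of the signature of a path made of straight segments. Two routes with the same start and end share the same first level (the net displacement), but their second level differs, which is exactly the "which way did you go" information.

```python
def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: list of points, each a tuple of d floats.
    Returns (level1, level2): level1[i] is the total increment of channel i,
    level2[i][j] is the iterated integral of channel i against channel j.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        delta = [b - a for a, b in zip(p, q)]
        # Chen's identity for appending one straight segment:
        # level2 += (level1 so far) x delta + (delta x delta) / 2
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + delta[i] * delta[j] / 2.0
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# Two routes from A = (0, 0) to B = (1, 1).
east_then_north = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
north_then_east = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

l1_a, l2_a = signature_depth2(east_then_north)
l1_b, l2_b = signature_depth2(north_then_east)

print(l1_a, l1_b)              # same net displacement for both routes
print(l2_a[0][1], l2_b[0][1])  # the cross term tells the routes apart
```

Higher levels add finer and finer detail in the same way, at the cost of many more terms.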

However, calculating these fingerprints is expensive. The number of terms grows exponentially with the level of detail you ask for, so the work is slow and eats up a lot of memory. Using existing tools was like trying to move a beach with a spoon.

Enter pathsig, a new library introduced by Tobias Nygaard. Think of pathsig as a high-speed, GPU-powered vacuum cleaner that sucks up all that data instantly, leaving you with a clean, compact summary.

Here is a breakdown of how it works, using everyday analogies:

1. The Problem: The "Library of Babel"

Imagine the signature of a path as a massive library containing every possible story you could tell about that journey.

  • The Old Way: To get the story, you had to walk through every single aisle of the library, read every book, and write down a summary. If you wanted to learn from this (like training a neural network), you had to walk back through the library in reverse to see what you missed. This was slow and exhausting.
  • The New Way (pathsig): Instead of walking, pathsig uses CUDA (NVIDIA's platform for running general-purpose code on graphics cards) to send out thousands of tiny robots (threads) simultaneously. Each robot grabs a specific set of books, summarizes them, and hands the result back, all at the same time as the others.

2. The Secret Sauce: "Prefix-Closed" Groups

How does pathsig organize this chaos? It uses a clever trick called prefix-closed sets.

Imagine you are organizing a family tree.

  • The Old Way: You might try to organize by "Great-Grandparents," then "Grandparents," then "Parents." But to understand a parent, you need to know their parents. It gets messy.
  • The pathsig Way: It groups people by family branches. If you are looking at a specific branch (a "word"), pathsig automatically gathers everyone in that branch's history (the "prefixes") and processes them together.
  • The Analogy: It's like a construction crew. Instead of one person building a whole house from scratch, they assign a team to build the foundation, then the walls, then the roof, all in perfect sync. Because the GPU can do thousands of these teams at once, the house gets built in seconds.
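The family-branch idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not pathsig's implementation: the helper names (`prefix_closure`, `is_prefix_closed`, `extend_by_segment`) are hypothetical, and a "word" is represented as a tuple of channel indices. The key fact, which follows from Chen's identity, is that updating the coefficient of a word after one more straight segment needs only the coefficients of that word's prefixes. That is why a prefix-closed group can be processed as an independent unit.

```python
from math import factorial


def prefix_closure(words):
    """Smallest prefix-closed set containing `words` (always includes the empty word)."""
    closed = set()
    for w in words:
        for k in range(len(w) + 1):
            closed.add(w[:k])
    return closed


def is_prefix_closed(words):
    """True if every prefix of every word is also in the set."""
    words = set(words)
    return all(w[:k] in words for w in words for k in range(len(w) + 1))


def extend_by_segment(sig, delta):
    """Update signature coefficients after appending one straight segment.

    sig:   dict word -> coefficient; keys must be prefix-closed and sig[()] == 1.0
    delta: per-channel increment of the new segment
    """
    new = {}
    for w in sig:
        total = 0.0
        for k in range(len(w) + 1):
            # Prefix w[:k] comes from the path so far, suffix w[k:] from the new segment.
            tail = 1.0
            for letter in w[k:]:
                tail *= delta[letter]
            total += sig[w[:k]] * tail / factorial(len(w) - k)
        new[w] = total
    return new


# Track only the word (0, 1) and its ancestors: (), (0,), (0, 1).
words = prefix_closure([(0, 1)])
sig = {w: (1.0 if w == () else 0.0) for w in words}
for delta in [(1.0, 0.0), (0.0, 1.0)]:  # go east, then north
    sig = extend_by_segment(sig, delta)
print(sig[(0, 1)])
```

Each word's update loop is independent of every other word outside its own branch, so in a GPU setting different branches could be handed to different threads. How pathsig actually maps words to threads is not described in this summary.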

3. The "Memory" Trick: Not Storing Everything

One of the biggest headaches in AI is running out of memory, especially the limited memory on a graphics card.

  • The Old Way: To calculate the journey, the computer would write down every single step of the path on a giant whiteboard. If the path was long, the whiteboard would overflow, and the computer would crash.
  • The pathsig Way: It uses a "magic eraser." It only keeps the final result of the journey. When it needs to figure out the past (for learning), it mathematically "rewinds" the tape using the final result and the rules of the journey, rather than looking at a stored list of every step.
  • The Result: You can analyze massive datasets on a single graphics card without the computer screaming for more memory.
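One way to picture the "rewind" is the following sketch, which works at depth 2 in plain Python. It assumes the rewinding relies on a standard property of signatures: appending a segment and then appending its reverse cancels out. Whether pathsig's backward pass is implemented exactly this way is an assumption; `step` and `unstep` are hypothetical names.

```python
def step(level1, level2, delta):
    """Append one straight segment with increment `delta` (Chen's identity, depth 2)."""
    d = len(delta)
    new2 = [[level2[i][j] + level1[i] * delta[j] + delta[i] * delta[j] / 2.0
             for j in range(d)] for i in range(d)]
    new1 = [level1[i] + delta[i] for i in range(d)]
    return new1, new2


def unstep(level1, level2, delta):
    """Remove the last segment by appending the same segment reversed."""
    return step(level1, level2, [-x for x in delta])


increments = [[1.0, 0.5], [-0.25, 2.0], [0.5, -1.0]]

# Forward pass: keep only the running state, never a list of intermediate results.
l1, l2 = [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
for delta in increments:
    l1, l2 = step(l1, l2, delta)
final = (l1, l2)

# Backward pass: peel segments off the end, recovering each earlier state on demand.
for delta in reversed(increments):
    l1, l2 = unstep(l1, l2, delta)
recovered_start = (l1, l2)  # back to the empty path, up to rounding
```

The path increments themselves are still needed, but they are the input data. What is avoided is a stored copy of the signature at every step.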

4. Customizing the Lens: Projections

Sometimes, you don't need the whole library. You only need the chapters about "weather" or "traffic."

  • Truncation (The Old Standard): This is like saying, "I only want the first 5 chapters of every book." It's simple, but you might miss a crucial plot point in chapter 6.
  • Projections (The pathsig Superpower): This is like saying, "I only want the chapters about rain and cars, regardless of which chapter they are in."
    • Anisotropic Truncation: Imagine some parts of your journey are smooth (like a highway) and some are bumpy (like a dirt road). pathsig lets you treat the smooth parts with a coarse summary and the bumpy parts with a detailed one, saving time without losing important details.
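The three ways of choosing "chapters" can be compared by listing which words each one keeps. The sketch below is plain Python with hypothetical helper names. It also assumes one common definition of anisotropic truncation (give each channel a weight and keep words whose total weight stays within a budget), which may differ in detail from the paper's.

```python
from itertools import product


def truncated_words(d, depth):
    """All words over channels 0..d-1 with length 1..depth."""
    return [w for n in range(1, depth + 1) for w in product(range(d), repeat=n)]


def anisotropic_words(weights, budget):
    """Words whose summed channel weights stay within `budget`.

    A heavier weight on a channel means that channel is cut off sooner.
    Weights must be positive.
    """
    d = len(weights)
    max_len = int(budget // min(weights))
    return [w for n in range(1, max_len + 1)
            for w in product(range(d), repeat=n)
            if sum(weights[i] for i in w) <= budget]


# Plain truncation: 3 channels, depth 4.
full = truncated_words(3, 4)

# Anisotropic: channel 2 is "smooth", so each use of it costs double.
aniso = anisotropic_words([1, 1, 2], 4)

# Arbitrary projection: a hand-picked list, including one deep word
# without all the shallower words around it.
picked = [(0,), (2,), (0, 2), (2, 0), (0, 2, 2, 0, 2)]

print(len(full), len(aniso), len(picked))
```

Every anisotropic word here is also in the plain truncation, so the saving comes purely from dropping terms that lean heavily on the down-weighted channel.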

5. Real-World Impact: The "Lead-Lag" Example

The paper shows a practical example using financial data (predicting how "rough" or smooth a market is).

  • They had a "Lead-Lag" transformation (a standard preprocessing step that pairs each series with a slightly delayed copy of itself, so the signature can pick up how jumpy the series is).
  • The standard method was like taking a photo of the whole city and trying to find one specific car.
  • pathsig's "Sparse Projection" was like using a drone to zoom in only on the specific car and its immediate surroundings.
  • The Outcome: They got better accuracy (lower error) while using one-sixth of the data and finishing the training in half the time.
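For readers who have not met the lead-lag transformation, here is a minimal sketch of one common convention from the signature literature, in plain Python. The function name `lead_lag` is hypothetical, and the paper's exact variant may differ. Each value is visited twice: first the "lead" copy moves to the new value, then the "lag" copy catches up.

```python
def lead_lag(series):
    """Lead-lag transform of a 1-D series (one common convention).

    For n + 1 input values, returns 2n + 1 points of the form (lead, lag).
    """
    path = [(series[0], series[0])]
    for prev, cur in zip(series, series[1:]):
        path.append((cur, prev))  # lead moves to the new value first
        path.append((cur, cur))   # lag catches up
    return path


prices = [1.0, 3.0, 2.0]
path = lead_lag(prices)
print(path)

# Doubling the channels inflates a plain truncated signature quickly:
d, depth = 2, 4
terms_before = sum(d ** n for n in range(1, depth + 1))
terms_after = sum((2 * d) ** n for n in range(1, depth + 1))
print(terms_before, terms_after)
```

Because the transform doubles the number of channels, the full truncated signature grows much larger, which is the situation where keeping only a sparse, hand-picked set of terms pays off.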

Summary

pathsig is a tool that makes the complex math of "Path Signatures" fast enough to use in modern AI.

  • It's Fast: It uses the power of graphics cards to do calculations in parallel, making it 10 to 30 times faster than previous tools.
  • It's Lean: It uses very little memory, allowing you to process huge datasets without crashing your computer.
  • It's Flexible: It lets you pick and choose exactly which parts of the data story you want to tell, rather than forcing you to read the whole book.

In short, pathsig turns a slow, heavy, manual process into a lightning-fast, automated assembly line, making it possible to teach AI to understand complex, moving data like never before.
