Infinitesimal Causality

The Big Idea: Causality as a "Tiny Push"

Imagine you are trying to understand how a machine works. Usually, we look at the machine and ask, "What happens if I pull this lever?" (This is an intervention.)

Traditional methods for studying cause and effect (like Pearl's do-calculus) treat this like a game of Lego. You take a block out, swap it for another, and see how the picture changes. It's a "before and after" snapshot.

This paper proposes a different way to look at it. Instead of swapping whole blocks, imagine giving the machine a tiny, microscopic nudge. The authors call this Infinitesimal Causality. They ask: If I push this lever just a tiny bit, how does the machine's internal structure wiggle?

They use advanced math (categories and geometry) to describe these "wiggles" as tangent vectors (tiny arrows pointing in a direction). The paper argues that to truly understand cause and effect, especially when there are hidden parts of the machine we can't see, we need to study these tiny nudges, not just the big swaps.

The Three Layers of the Machine

The authors build their theory on three layers, like a sandwich:

The Probability Layer (Markov): This is the basic rulebook. It says, "If I see X, what is the chance of Y?" It handles the randomness and uncertainty.
The Copy/Discard Layer (Frobenius): This is the "classical" part. In the real world, if you have a piece of data (like a temperature reading), you can copy it to write it down twice, or discard it if you don't need it. The paper treats this ability to copy and throw away data as a fundamental mathematical structure.
The Tiny Nudge Layer (Tangent): This is the new layer. It asks: "If I nudge the system, does the ability to copy or discard data stay the same?"

The Main Question: When we nudge the system, does the "copy/discard" structure break? If it does, the paper reads this as a sign that something hidden (a latent confounder) is influencing the parts we can see.

The "Hidden Glue" Analogy

Imagine two people, Alice and Bob, are both holding a balloon.

Alice blows air into her balloon.
Bob blows air into his balloon.
They look independent.

But, imagine there is a hidden third person (a "confounder") who is secretly blowing air into both balloons at the same time.

In traditional math, if you look at the balloons, you might think they are just coincidentally moving together. But in this paper's "Tiny Nudge" view:

If you nudge Alice's balloon slightly, and then nudge Bob's slightly, the order matters.
If you nudge Alice then Bob, the balloons end up in a slightly different spot than if you nudge Bob then Alice.
This "order matters" effect is measured by a Lie bracket: a mathematical object that records how much two nudges fail to commute (the "twist" or "residual").

If the balloons are truly independent, the order of nudges doesn't matter (the twist is zero). If there is a hidden person (the confounder) connecting them, the order does matter, and a "twist" appears. The paper says this "twist" is the geometric signature of hidden causes.
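
To make the "order matters" test concrete, here is a minimal Python sketch. It is a toy illustration, not the paper's construction: the two-coordinate state `(alice, bob)`, the three push fields, and the helper names `nudge` and `order_gap` are all assumptions introduced here. Each nudge is a single small step along a vector field, and the gap between the two orderings, divided by the step size squared, estimates the Lie bracket (under the sign convention "A then B minus B then A").

```python
def nudge(field, point, eps):
    """Move `point` a small step `eps` along `field` (one Euler step)."""
    dx, dy = field(point)
    return (point[0] + eps * dx, point[1] + eps * dy)


def order_gap(field_a, field_b, point, eps=1e-3):
    """(A-then-B minus B-then-A) / eps^2: a finite-step Lie bracket estimate."""
    ab = nudge(field_b, nudge(field_a, point, eps), eps)
    ba = nudge(field_a, nudge(field_b, point, eps), eps)
    return ((ab[0] - ba[0]) / eps**2, (ab[1] - ba[1]) / eps**2)


# Hypothetical push fields on the state (alice, bob).
def push_alice(p):
    return (1.0, 0.0)          # only Alice's balloon moves


def push_bob_independent(p):
    return (0.0, 1.0)          # only Bob's balloon moves, at a fixed rate


def push_bob_coupled(p):
    return (0.0, p[0])         # Bob's push strength depends on Alice's balloon


start = (0.5, 0.2)
gap_independent = order_gap(push_alice, push_bob_independent, start)
gap_coupled = order_gap(push_alice, push_bob_coupled, start)

print("independent:", gap_independent)  # both components vanish (no twist)
print("coupled:    ", gap_coupled)      # second component is close to 1 (a twist)
```

In the coupled case the coupling is written directly into Bob's push field, which stands in for the hidden connection of the analogy; detecting a genuinely unobserved confounder from data is the harder problem the paper is concerned with.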

The Three New Rules

The paper translates the three rules of Pearl's do-calculus into this "Tiny Nudge" language. Think of these as three tests to see if your understanding of the machine is correct:

The "Ignore It" Rule (Discarding):
- Old way: If a variable doesn't matter, you can delete it from your equation.
- New way: If you nudge the system, does the act of "throwing away" (discarding) a specific piece of data change the result? If the answer is no, then that data is truly irrelevant. If the answer is yes, the data was secretly important.
The "Swap" Rule (Action vs. Observation):
- Old way: Sometimes, watching a variable is the same as controlling it (if you adjust for other things).
- New way: If you nudge the system, does the "copying" of data stay consistent when you switch from watching to controlling? If the "copy" structure breaks during the swap, the rule doesn't hold.
The "Order Doesn't Matter" Rule (Independence):
- Old way: If two things are independent, doing one doesn't affect the other.
- New way: If you nudge Thing A and then Thing B, is the result the same as nudging B then A? If the results are different (a "residual" or "twist" remains), it means there is a hidden connection between them that you haven't accounted for.

Graphs are Just "Maps," Not the "Territory"

A major point of the paper is that causal graphs (the diagrams with arrows) are just one way to draw the machine, not the machine itself.

The Paper's View: The real object is the "Tiny Nudge" structure (the geometry of how data copies and twists).
The Graph's Role: A graph is just a specific "presentation" or "map" of that structure.
The Problem: Different maps (graphs) can describe the same machine. Sometimes, a graph looks perfect, but the "Tiny Nudge" test reveals a hidden twist that the graph missed.

The authors suggest that instead of trying to find the "perfect graph," we should first measure the "Tiny Nudges" and the "Twists." If the twists are zero, a simple graph might work. If the twists are non-zero, we know there is hidden complexity, and we need to look deeper.

Summary

This paper introduces a new way to do causal math. Instead of just swapping blocks in a diagram, it treats interventions as tiny pushes.

It checks if these pushes preserve the ability to copy and discard information.
It measures twists (Lie brackets) to detect hidden connections.
It treats graphs as just one possible drawing of a deeper, geometric reality.

The goal isn't to replace graphs immediately, but to provide a more robust mathematical foundation that can detect hidden causes even when the standard diagrams fail.

Technical Summary: Infinitesimal Causality

Problem Statement
Current categorical treatments of causal inference, such as categorical do-calculus and string-diagram surgery (Fritz and Klingler, 2023; Jacobs et al., 2018), formalize interventions at the level of discrete combinatorial structure. These frameworks explain how to rewrite causal syntax (e.g., cutting wires, conditioning on outputs) but do not address the infinitesimal geometry of how interventions deform statistical manifolds. Specifically, they lack a mechanism to describe how latent confounding manifests as geometric obstructions (e.g., non-vanishing Lie brackets) independent of any specific graph presentation. The paper addresses the need for a framework where interventions are treated as tangent deformations of the copy/discard structure carried by observable variables, allowing for the detection of causal insufficiency through derivative defects rather than graphical rewrites.

Methodology
The paper introduces Infinitesimal Do-Calculus (IDC), a categorical account of infinitesimal causality within Frobenius Markov categories equipped with tangent-bundle semantics. The methodology rests on three conceptual layers:

Markov Structure: Provides the syntax of probabilistic reasoning (observation, conditioning, Bayesian inversion).
Frobenius Structure: Provides the syntax of classical information, where observable variables (specifically sufficient statistics) can be copied, compared, and discarded via a special commutative Frobenius comonoid $(\delta, \varepsilon)$.
Tangent Structure: Interventions are modeled as vector fields (tangent vectors) that deform the Frobenius copy/discard operations.
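
As a finite-probability illustration of the first two layers, the sketch below models copy ($\delta$) and discard ($\varepsilon$) on dictionaries of probabilities and checks whether a channel commutes with copying. It relies on a standard fact about Markov categories (deterministic maps commute with copy, genuinely stochastic ones do not) as a discrete stand-in for "the copy structure is preserved"; it does not compute the paper's derivative defect. All names (`push`, `copy_dist`, `discard_dist`, the `prior` and the two kernels) are hypothetical.

```python
def push(kernel, dist):
    """Push a finite distribution through a stochastic kernel x -> {y: p(y|x)}."""
    out = {}
    for x, px in dist.items():
        for y, pyx in kernel[x].items():
            out[y] = out.get(y, 0.0) + px * pyx
    return out


def copy_dist(dist):
    """Copy (delta): put all mass on the diagonal of the product space."""
    return {(x, x): px for x, px in dist.items()}


def discard_dist(dist):
    """Discard (epsilon): forget the value, keep only the total mass."""
    return sum(dist.values())


def apply_then_copy(kernel, dist):
    return copy_dist(push(kernel, dist))


def copy_then_apply(kernel, dist):
    """Copy the input, then run an independent copy of the kernel on each leg."""
    out = {}
    for x, px in dist.items():
        for y1, p1 in kernel[x].items():
            for y2, p2 in kernel[x].items():
                out[(y1, y2)] = out.get((y1, y2), 0.0) + px * p1 * p2
    return out


def same(d1, d2, tol=1e-12):
    keys = set(d1) | set(d2)
    return all(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) <= tol for k in keys)


# Hypothetical example data: a temperature state and two sensor models.
prior = {"cold": 0.3, "hot": 0.7}
deterministic = {"cold": {"low": 1.0}, "hot": {"high": 1.0}}
noisy = {"cold": {"low": 0.9, "high": 0.1}, "hot": {"low": 0.2, "high": 0.8}}

det_preserves_copy = same(apply_then_copy(deterministic, prior),
                          copy_then_apply(deterministic, prior))
noisy_preserves_copy = same(apply_then_copy(noisy, prior),
                            copy_then_apply(noisy, prior))
print(det_preserves_copy, noisy_preserves_copy)
```

Discarding always commutes with a normalized kernel (`discard_dist(push(noisy, prior))` stays at total mass 1), which is why the interesting obstruction in this toy setting shows up on the copy side.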

The core technical innovation is the definition of Frobenius-derivative defects. Instead of rewriting diagrams, IDC analyzes whether infinitesimal intervention fields preserve the algebraic structure of classical observations. The framework distinguishes between:

Algebraic Frobenius Structure: The categorical copy/discard operations on sufficient statistics.
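Geometric Frobenius Integrability: The condition that the distribution spanned by the intervention vector fields ("distribution" in the differential-geometric sense of a family of tangent subspaces, not a probability distribution) is involutive, that is, closed under Lie brackets.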
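
For reference, the standard differential-geometric definitions behind "involutive" can be written as follows. The local coordinates $x^1, \dots, x^n$, the component notation $v^k$, and the symbol $\mathcal{D}$ are assumptions introduced here for illustration and are not taken from the paper.

$$
[v, w]^k \;=\; \sum_{i=1}^{n} \left( v^i \frac{\partial w^k}{\partial x^i} \;-\; w^i \frac{\partial v^k}{\partial x^i} \right),
\qquad
\mathcal{D} = \mathrm{span}\{v_1, \dots, v_m\} \ \text{is involutive} \iff [v_i, v_j] \in \mathcal{D} \ \text{for all } i, j.
$$

By the classical Frobenius theorem, a constant-rank distribution is integrable (tangent to a foliation by submanifolds) exactly when it is involutive, which is the sense of "integrability" used in the second item above.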

The paper defines a category $\mathrm{Stat}^\infty$ of regular finite-dimensional statistical models presented by sufficient statistics, and a structural subcategory $\mathrm{Stat}^\infty_{SCM}$ where stochasticity is carried by exogenous variables and visible laws are pushforwards of deterministic mechanisms. In this setting, interventions are defined as tangent vectors on the exogenous space, pushed forward to the visible space, avoiding singularities associated with global Radon–Nikodym density ratios.
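
The idea of pushing a tangent vector from the exogenous space to the visible space can be sketched pointwise. The example below is a simplified, sample-level illustration under stated assumptions: the two-variable `mechanism`, the base point `u0`, and the helper `pushforward` are hypothetical, and the derivative is approximated by a central difference rather than computed in the paper's categorical setting.

```python
def mechanism(u):
    """Hypothetical deterministic mechanism: exogenous (u1, u2) -> visible (x, y)."""
    u1, u2 = u
    x = u1                     # x is driven by u1 alone
    y = 2.0 * x + u2 * u2      # y depends on x and on its own noise u2
    return (x, y)


def pushforward(f, u, tangent, h=1e-6):
    """Directional derivative of f at u along `tangent` (central difference)."""
    plus = f(tuple(ui + h * ti for ui, ti in zip(u, tangent)))
    minus = f(tuple(ui - h * ti for ui, ti in zip(u, tangent)))
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))


u0 = (1.0, 3.0)
nudge_u1 = (1.0, 0.0)          # infinitesimal push on the first exogenous variable
nudge_u2 = (0.0, 1.0)          # infinitesimal push on the second exogenous variable

visible_from_u1 = pushforward(mechanism, u0, nudge_u1)  # analytically (1, 2)
visible_from_u2 = pushforward(mechanism, u0, nudge_u2)  # analytically (0, 6)
print(visible_from_u1, visible_from_u2)
```

Because the nudge lives on the exogenous side and only its image under the mechanism is observed, no ratio of visible densities is ever formed, which is the intuition behind the singularity-avoidance claim above.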

Key Contributions

Definition of $\mathrm{Stat}^\infty$ and $\mathrm{Stat}^\infty_{SCM}$: The paper defines a categorical substrate for statistical models where sufficient statistics carry Frobenius structure and parameter spaces carry tangent structure. It isolates a subcategory where mechanisms are deterministic and stochasticity is exogenous, allowing infinitesimal interventions to be defined on the exogenous tangent bundle before projection.
Categorical Causal Sufficiency: Causal sufficiency is redefined as the coincidence of two conditions: (a) the algebraic Frobenius copying/discarding structure is preserved by interventions, and (b) the distribution of intervention fields is involutive (closed under Lie brackets).
Frobenius-Derivative Defects: The paper introduces $\partial_i \delta_j$ as the basic obstruction to copy-preserving infinitesimal intervention. A non-zero defect indicates that an intervention deforms the copy structure of a variable, signaling a failure of causal sufficiency or the presence of latent confounding.
Three Infinitesimal Intervention Rules: The classical rules of do-calculus are reformulated as equations involving Frobenius structures and tangent fields:
- Rule 1 (Irrelevance): Discarding an irrelevant variable commutes with intervention if the Lie derivative of the counit vanishes ($L_{v_i}\varepsilon_j = 0$).
- Rule 2 (Action/Observation Exchange): Exchanging action for observation preserves the Frobenius coproduct if the Frobenius derivative defect vanishes ($\partial_i \delta_j = 0$) after Kan transport.
- Rule 3 (Independence): Independent interventions close under the visible tangent distribution if the Lie bracket residual vanishes ($[v_i, v_j]^{\perp \mathcal{A}_{vis}} = 0$).
Presentations vs. Foundations: The paper posits that graphical causal models are merely presentations of the underlying Frobenius-Markov object, not the primitive objects of the theory. It argues that finite graphical syntax (like graphs or imsets) is non-conservative, as different presentations can induce isomorphic visible Frobenius algebras.

Results

Theorem 4.7 (Frobenius-acyclicity): On a separated visible stratum, categorical causal sufficiency is equivalent to "tangent Frobenius commutativity," meaning every visible intervention field preserves the Frobenius copy/discard structure, and the visible intervention distribution is involutive.
Theorem 6.7: The three infinitesimal rules (counit invariance, coproduct intertwining, and bracket closure) characterize visible causal sufficiency.
Corollary 6.8: In the flat case (vanishing curvature residuals), the infinitesimal rules reduce to the classical discrete do-calculus transformations.
Theorem 7.3: The assignment from finite graphical presentations to visible Frobenius/tangent data is not conservative; non-isomorphic presentations can induce the same visible algebraic structure.
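
A familiar analogue of non-conservativity from classical graphical models is Markov equivalence: the graphs X → Y and Y → X are different presentations, yet they can describe exactly the same observed joint distribution. The sketch below illustrates only that analogue, not the construction behind Theorem 7.3; the probability tables and function names are hypothetical.

```python
def joint_from_x_to_y(p_x, p_y_given_x):
    """Joint P(x, y) built from the presentation X -> Y."""
    return {(x, y): p_x[x] * p_y_given_x[x][y]
            for x in p_x for y in p_y_given_x[x]}


def refactor_as_y_to_x(joint):
    """Re-express the same joint as P(y) and P(x | y), i.e. the presentation Y -> X."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    p_x_given_y = {y: {} for y in p_y}
    for (x, y), p in joint.items():
        p_x_given_y[y][x] = p / p_y[y]
    return p_y, p_x_given_y


def joint_from_y_to_x(p_y, p_x_given_y):
    """Joint P(x, y) built from the presentation Y -> X."""
    return {(x, y): p_y[y] * p_x_given_y[y][x]
            for y in p_y for x in p_x_given_y[y]}


# Hypothetical binary example.
p_x = {0: 0.4, 1: 0.6}
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}

joint = joint_from_x_to_y(p_x, p_y_given_x)
p_y, p_x_given_y = refactor_as_y_to_x(joint)
rebuilt = joint_from_y_to_x(p_y, p_x_given_y)

largest_gap = max(abs(joint[k] - rebuilt[k]) for k in joint)
print(largest_gap)  # zero up to floating-point rounding
```

Two non-isomorphic graphs thus fit the same visible data, which is the kind of ambiguity that motivates treating graphs as presentations rather than as primitive objects.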

Significance and Claims
The paper claims to provide the first treatment of do-calculus in a tangent category setting, shifting the focus from graphical surgery to the differential geometry of statistical manifolds. Its primary significance lies in:

Geometric Invariants: Identifying "Frobenius-derivative defects" and non-vanishing Lie brackets as intrinsic geometric signatures of latent confounding, independent of graph selection.
Handling Singularities: By formulating interventions on the exogenous tangent bundle ($\mathrm{Stat}^\infty_{SCM}$), the framework avoids the support problems and singularities often encountered when defining hard interventions via global density ratios on observed variables.
Foundational Shift: It proposes that causal inference should begin with Frobenius-compatible infinitesimal invariants of the observed statistical object, treating any finite graphical or algebraic syntax (like graphs or imsets) as a secondary presentation of these invariants.

The paper explicitly limits its scope to the formulation of the substrate, the connection to Kan transport, and the statement of the three rules. It acknowledges that a full causal-sufficiency theory, the construction of a Frobenius-Markov cohomology theory, and the functorial extraction of graphical presentations from tangent data are left as future work.