Operationalizing Longitudinal Causal Discovery Under Real-World Workflow Constraints

This paper proposes a framework for longitudinal causal discovery that enhances structural interpretability and reduces ambiguity in large-scale real-world data by formally encoding institutional workflow constraints into the causal graph search space, demonstrated through a successful application to a nationwide Japanese health screening cohort.

Tadahisa Okuda, Shohei Shimizu, Thong Pham, Tatsuyoshi Ikenoue, Shingo Fukuma

Published 2026-03-02

Imagine you are trying to figure out the recipe for a perfect cake, but you don't have the recipe book. Instead, you only have a logbook of every time someone in a bakery tried to bake a cake over the last four years. You want to know: Did adding extra sugar actually make the cake fluffier, or did the baker just happen to use a better oven that day?

This is the challenge of Causal Discovery. For years, scientists have built powerful computers to solve this puzzle. But when they tried to use these computers on real-world data (like hospital records or national health surveys), the results were often messy, confusing, or wrong.

Why? Because real life isn't a clean, abstract math problem. It's a messy workflow.

The Problem: The "Calendar" vs. The "Kitchen"

In a perfect math world, time is just a number: 1, 2, 3. But in the real world, time is a schedule.

Imagine a bakery where:

  1. You weigh the ingredients (Time A).
  2. You mix the batter (Time B).
  3. You put it in the oven (Time C).

If you just look at the data without knowing the rules, the computer might get confused. It might think the oven heat caused the mixing, or that the mixing happened after the cake was baked, simply because the data was recorded in a weird order.

In the real world, hospitals and governments have strict workflows. A doctor measures your blood pressure, then decides if you need a lifestyle guide, then measures your blood pressure again next year. If a computer doesn't know this "script," it tries to guess every possible order of events, leading to a huge mess of wrong answers.

The Solution: The "Workflow GPS"

The authors of this paper realized that instead of building a smarter computer, they needed to give the existing computers a GPS map of the real-world workflow.

They didn't invent a new algorithm; they just taught the old one the rules of the game. Think of it like this:

  • Old Way: "Here is a pile of data. Guess the cause-and-effect. (Good luck!)"
  • New Way: "Here is the data. But remember: The doctor always checks the blood pressure before giving the advice. The advice never happens before the check-up. Also, the lifestyle habits are recorded as a summary of the whole year, so we can't tell which came first within that year. Now, guess the cause-and-effect."

By adding these "Workflow Rules," the computer stops guessing impossible scenarios and focuses only on the ones that make sense in the real world.
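The workflow rules above can be sketched as edge constraints handed to a causal discovery algorithm before it starts searching. This is a minimal illustration, not the paper's implementation; the variable names and stage assignments are assumptions for the example.

```python
# Hedged sketch: encode a clinical workflow as edge constraints before
# running any causal discovery algorithm. Variable names are illustrative.
import numpy as np

# Each variable gets a "workflow stage": lower stages happen first.
# (BP = blood pressure; stages follow check-up -> guidance -> next check-up.)
stages = {
    "BP_year1": 0,        # measured first
    "guidance_year1": 1,  # decided after the measurement
    "BP_year2": 2,        # next year's measurement
}
names = list(stages)
n = len(names)

# forbidden[i, j] == True means an edge names[i] -> names[j] is ruled out.
# A later workflow stage can never cause an earlier one.
forbidden = np.zeros((n, n), dtype=bool)
for i, a in enumerate(names):
    for j, b in enumerate(names):
        if stages[a] > stages[b] or i == j:
            forbidden[i, j] = True

# Any search algorithm now only considers the allowed edges:
allowed = [(a, b) for i, a in enumerate(names)
           for j, b in enumerate(names) if not forbidden[i, j]]
print(allowed)
```

With three variables there are six possible directed edges; the workflow constraints cut that to three, which is exactly how the search space shrinks from "every possible order of events" to only the plausible ones.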

The Experiment: Japan's National Health Checkups

To test this, the authors used a massive dataset from Japan: about 107,000 people followed through annual health checkups over four years.

They wanted to see if the government's "Health Guidance" program (telling people to exercise and eat better) actually improved their health.

What they found:

  1. It works, but fades: The guidance helped people lose weight (lower their BMI) and lower their blood pressure, but the biggest effect appeared immediately, in the first year. The effects got smaller and fuzzier in later years.
  2. Uncertainty is key: Instead of just giving a single number (e.g., "It lowers blood pressure by 5 points"), they used a technique called Bootstrap (imagine running the experiment 1,000 times with slightly different groups of people) to say, "We are 95% sure the effect is between 3 and 7 points." This is crucial for doctors and policymakers who need to know how reliable the advice is.
  3. The "Motif" (The Pattern): Even though the data was huge and complex, a simple, repeating pattern emerged. The guidance consistently influenced weight and blood pressure in a specific way, regardless of the year.
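The bootstrap idea in point 2 can be sketched in a few lines: resample the cohort with replacement, re-estimate the effect each time, and read the 95% interval off the resulting distribution. The numbers below are synthetic, made up for illustration; they are not the paper's data or results.

```python
# Hedged sketch of the bootstrap described above, on synthetic data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cohort: 'guided' people see an average -5 point BP change.
n = 2000
guided = rng.integers(0, 2, size=n)
bp_change = -5.0 * guided + rng.normal(0.0, 10.0, size=n)

def effect_estimate(g, y):
    # Difference in mean BP change: guided minus non-guided.
    return y[g == 1].mean() - y[g == 0].mean()

boot = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)  # resample people with replacement
    boot.append(effect_estimate(guided[idx], bp_change[idx]))

low, high = np.percentile(boot, [2.5, 97.5])
print(f"estimated effect: {effect_estimate(guided, bp_change):.1f}")
print(f"95% interval: [{low:.1f}, {high:.1f}]")
```

The width of that interval is the "how reliable is this?" answer that a single point estimate cannot give.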

The "Simulator" (The Crystal Ball)

The coolest part of this paper is what they built with the results. They turned the complex math into a simple "What-If" Simulator for doctors.

  • Forward Prediction: A doctor can ask, "If I tell this patient to stop smoking now, what will their blood pressure look like in two years?" The simulator gives an answer with a confidence range.
  • Goal Seeking: A doctor can ask, "I want this patient's blood pressure to be 120 in two years. What do they need to change today to make that happen?"
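Both directions follow from one fitted model. The toy version below assumes a simple linear relationship, bp_future = a * bp_now + b * smoking + c, with made-up coefficients; the paper's simulator is built from the full discovered graph, but the forward/inverse idea is the same.

```python
# Hedged sketch of the two "what-if" queries on a toy linear model.
# Coefficients are invented for illustration, not taken from the paper.
a, b, c = 0.8, 6.0, 20.0  # assumed: persistence, smoking penalty, baseline

def predict_bp(bp_now, smoking):
    """Forward prediction: future blood pressure under a chosen habit."""
    return a * bp_now + b * smoking + c

def required_bp_now(target_bp, smoking):
    """Goal seeking: invert the model to find today's BP needed for a target."""
    return (target_bp - b * smoking - c) / a

# Forward: a smoker at BP 140 today.
print(predict_bp(140, smoking=1))        # -> 138.0

# Goal seeking: to reach 120 in the future after quitting smoking.
print(required_bp_now(120, smoking=0))   # -> 125.0
```

In the real system, the confidence ranges come from running these queries over the bootstrapped set of models rather than a single one.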

Why This Matters

This paper is a bridge. For a long time, "Causal AI" was like a Ferrari that could only drive on a perfect race track (clean, theoretical data). This paper puts tires on the Ferrari so it can drive on the bumpy, muddy roads of real life.

It teaches us that to understand cause and effect in the real world, you don't just need better math; you need to understand the story of how the data was collected. By respecting the workflow, we can finally trust the computer's advice to make real-world decisions that improve our health.
