Time-to-Event Modeling with Pseudo-Observations in Federated Settings

This paper proposes a one-shot, privacy-preserving federated framework for time-to-event analysis. Using pseudo-observations and a covariate-wise debiasing procedure, it models both proportional and non-proportional hazards accurately, without iterative communication or pooling of individual-level data.

Hyojung Jang, Malcolm Risk, Yaojie Wang, Norrina Bai Allen, Xu Shi, Lili Zhao

Published Wed, 11 Ma

Imagine a group of doctors from different hospitals across a city who want to answer a big question: "What factors make children more likely to become obese, and how does that risk change as they grow up?"

To get a clear answer, they need to look at data from thousands of kids. But there's a problem: Privacy laws (like HIPAA) say they cannot send the private medical records of individual children to a central computer. They can't mix their data together like pouring different cups of water into one big bucket.

This paper introduces a clever new way to solve this puzzle without ever sharing the private "cups of water."

The Old Way vs. The New Way

The Old Way (The "Shared Blueprint" Problem):
Previously, if hospitals wanted to work together, they often had to share a list of exactly when specific events happened (e.g., "Patient A got sick on Tuesday, Patient B on Friday"). Even without names attached, those exact dates could sometimes reveal sensitive details about individual patients. Also, older methods often assumed that each risk factor's effect stays the same forever (the "proportional hazards" assumption, like saying "being overweight is always twice as dangerous"), which isn't always true in real life.

The New Way (The "Ghostly Summaries" Approach):
The authors created a method called Federated Survival Analysis with Site-Level Heterogeneity Adjustment. That's a mouthful, so let's break it down with an analogy.

1. The "Ghostly Summaries" (Pseudo-Observations)

Instead of sending patient records, each hospital calculates a "summary score" for every single patient based on a shared, anonymous map of the overall situation.

  • The Analogy: Imagine every hospital has a local map of their neighborhood. They all agree to use a giant, city-wide map (the Federated Kaplan-Meier estimator) that shows the general traffic patterns.
  • Using this city map, each hospital calculates a "ghostly score" for their local patients. This score tells them, "Based on the city's traffic, this specific patient is likely to encounter a traffic jam at 2 PM."
  • They send these scores (not the patient's name or exact time) to the central team. The central team can now see the patterns without ever seeing the actual cars or drivers.
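In statistical terms, the "ghostly score" for patient i is a jackknife pseudo-observation: n times the Kaplan-Meier survival estimate at time t, minus (n − 1) times the same estimate recomputed with patient i left out. Here is a minimal sketch of that idea, not the paper's implementation (function names are our own, and ties are handled naively):

```python
import numpy as np

def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.
    events[i] = 1 if the event was observed, 0 if censored.
    Note: ties between event and censoring times are handled naively."""
    order = np.argsort(times)
    times, events = times[order], events[order]
    at_risk, s = len(times), 1.0
    for time, event in zip(times, events):
        if time > t:
            break
        if event == 1:
            s *= 1.0 - 1.0 / at_risk   # one event among `at_risk` subjects
        at_risk -= 1
    return s

def pseudo_observations(times, events, t):
    """Jackknife pseudo-observation for each subject at time t:
    n * S_hat(t) - (n - 1) * S_hat_without_i(t)."""
    n = len(times)
    s_full = km_survival(times, events, t)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False   # leave subject i out, re-estimate, restore
        pseudo[i] = n * s_full - (n - 1) * km_survival(times[keep], events[keep], t)
        keep[i] = True
    return pseudo
```

A quick sanity check: with no censoring at all, each pseudo-observation collapses to the simple indicator "did this patient survive past t?" (0 or 1); censoring is where the Kaplan-Meier machinery earns its keep.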

2. The "One-Shot" Conversation

Usually, computers in these networks have to talk back and forth many times to get the answer right (like a game of "Hot and Cold"). This new method is a "One-Shot" approach.

  • The Analogy: Instead of a long phone call, every hospital sends their "ghostly scores" and a few summary numbers in one single email. The central computer puts them together instantly to get the final answer. It's fast, efficient, and keeps the data secure.
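The "one email" can be surprisingly small. For a linear model fit to pseudo-observations, each site only needs to transmit a p×p matrix and a length-p vector of summary statistics, and the center solves a single system of equations. A toy sketch of that aggregation idea (illustrative only; the site data below are simulated, and this is ordinary least squares, not the paper's exact estimator):

```python
import numpy as np

rng = np.random.default_rng(0)

def site_summary(X, y):
    """What a site actually transmits: a p x p matrix and a length-p
    vector of sums -- no individual rows ever leave the site."""
    return X.T @ X, X.T @ y

# Simulated covariates/outcomes for four hypothetical sites.
beta_true = np.array([1.0, -2.0])
site_data = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    y = X @ beta_true + rng.normal(scale=0.1, size=50)
    site_data.append((X, y))

# One-shot aggregation at the center: add the summaries, solve once.
summaries = [site_summary(X, y) for X, y in site_data]
A = sum(a for a, _ in summaries)
b = sum(v for _, v in summaries)
beta_fed = np.linalg.solve(A, b)
```

Because the sums of these summaries equal the summaries of the pooled data, the federated answer matches the (forbidden) pooled analysis exactly for this class of model; no back-and-forth rounds are needed.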

3. The "Flexible Lens" (No Rigid Rules)

Old methods forced everyone to agree that risks never change over time. This new method is flexible.

  • The Analogy: Imagine looking at a tree through a rigid, square window. You only see a square slice of the tree. This new method uses a flexible, zoomable lens. It can see that a risk factor (like age) might be very dangerous when a child is 5, but less dangerous when they are 10. It captures the story of how risk changes over time, not just a single static number.
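Concretely, the flexibility comes from evaluating pseudo-observations on a grid of time points and letting each covariate's coefficient differ at every grid time. A simulated sketch of that idea (the data-generating model is our own, and we use the uncensored case, where the pseudo-observation is just the indicator 1{T > t}):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated, uncensored example: the covariate x shortens the first
# phase of the waiting time but not the second, so its effect on the
# survival probability differs across time points (non-proportional).
n = 5000
x = rng.normal(size=n)
T = rng.exponential(scale=np.exp(-1.0 * x)) + rng.exponential(scale=2.0, size=n)

# With no censoring, the pseudo-observation at t is just 1{T > t}, so a
# separate linear fit at each grid time yields a time-varying effect.
grid = [0.5, 1.0, 2.0, 4.0]
X = np.column_stack([np.ones(n), x])
betas = []
for t in grid:
    y = (T > t).astype(float)
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    betas.append(coef[1])   # effect of x on surviving past t
```

A rigid "square window" model would force all four entries of `betas` to tell the same story; here each time point gets its own estimate, tracing out how the risk factor's influence changes as children grow.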

4. The "Smart Noise Filter" (Heterogeneity Adjustment)

Sometimes, one hospital might have a weird result just because of bad luck or a small sample size (noise), while another hospital might have a real unique difference because their patients are different (signal).

  • The Analogy: Imagine a choir. Most singers are singing the same note (the global truth). One singer is slightly off-key because they are nervous (noise). Another singer is intentionally singing a different harmony because it's a jazz song (real local difference).
  • The authors built a "Smart Noise Filter." It listens to the choir. If a singer is just slightly off-key due to nervousness, the filter gently nudges them back to the main note. But if a singer is intentionally singing a jazz harmony, the filter says, "Ah, that's a real difference! Let's keep it."
  • This ensures the final result isn't ruined by random errors, but it also doesn't ignore genuine local differences.
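A generic way to build such a filter is empirical-Bayes-style shrinkage: each site's estimate is pulled toward the precision-weighted global mean, and the pull is strong when the site's standard error is large (likely noise) but weak when the site is precise (likely a real local difference). This is a simplified stand-in for the paper's heterogeneity adjustment, with our own function names and made-up numbers:

```python
import numpy as np

def shrink_to_global(est, se, tau2):
    """Pull site estimates toward the precision-weighted global mean.
    est: per-site estimates; se: their standard errors;
    tau2: assumed between-site variance (a 'real difference' budget)."""
    w = 1.0 / (se**2 + tau2)
    global_mean = np.sum(w * est) / np.sum(w)
    lam = se**2 / (se**2 + tau2)     # how much to trust the global mean
    return lam * global_mean + (1.0 - lam) * est

est = np.array([1.00, 1.05, 0.95, 2.00])   # site 4 is far from the rest

# If site 4 is very precise, its difference is treated as real and kept.
precise = shrink_to_global(est, se=np.array([0.05, 0.05, 0.05, 0.02]), tau2=0.1)

# If site 4 is very noisy, it is pulled most of the way back to the choir.
noisy = shrink_to_global(est, se=np.array([0.05, 0.05, 0.05, 0.50]), tau2=0.1)
```

The same outlying estimate is preserved or smoothed depending only on how trustworthy it is, which is exactly the jazz-harmony-versus-nervous-singer distinction above.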

The Real-World Test

The team tested this on data from 45,000 children across four hospitals in Chicago (the CAPriCORN network).

  • The Result: Their new method produced answers almost identical to what you would get if all 45,000 records were magically combined in one place (which privacy laws forbid).
  • The Discovery: They found that while being overweight is a big risk factor, its impact changes over time. Also, the "Smart Noise Filter" successfully identified that one hospital had a unique pattern for a specific health condition, while smoothing out random errors in the others.

Why This Matters

This paper gives researchers a privacy-preserving superpower. It allows hospitals to collaborate on life-saving research without breaking privacy laws. It's like allowing a group of people to solve a giant jigsaw puzzle together without ever showing each other their individual pieces—instead, they just share the shapes of the edges, and the picture appears perfectly clear.