Nonparametric bounds for vaccine effects in randomized trials

Imagine you are trying to figure out how well a new umbrella keeps you dry. You give half of a group the new umbrella (the vaccine) and the other half a fancy, useless stick (the placebo). You want to know: does the umbrella actually stop the rain (the virus), or does how wet people get also depend on how they behave once they think they are holding an umbrella?

This is the core problem of vaccine trials when "blinding" fails.

The Problem: The "Broken Blind"

In a perfect trial, no one knows if they got the real vaccine or the placebo. This is called being "blinded." But sometimes, the real vaccine causes a sore arm or a fever, while the placebo doesn't. Suddenly, people guess, "Oh, I must have the real one!"

Once they guess, their behavior changes. If they think they are protected, they might stop wearing masks, go to crowded parties, or hug more people.

The Immunological Effect: The vaccine's actual biological power to fight the virus.
The Behavioral Effect: The change in behavior because the person thinks they are protected.

If you just look at the final infection rates, you get a messy mix of both. It's like trying to measure how good a car's engine is, but the driver is also driving faster because they think the car is a Ferrari. You can't tell if the speed comes from the engine or the driver's confidence.

The Old Solution vs. The New Solution

Previously, statisticians tried to separate these two effects by making a very strict guess: "There is no hidden personality trait that makes someone both guess they have the vaccine AND take more risks."

The authors of this paper say: "That guess is too strong. It's unrealistic."

Example: An optimistic person might think, "I feel great, I must have the vaccine!" (Guessing correctly). But that same optimism might make them more social and less careful (Behavior).
Because this "hidden personality" (let's call it U) exists, we can't pinpoint the exact number for how well the vaccine works. The exact number is hidden.

The New Idea: Drawing a "Fence" (Bounds)

Instead of trying to find the single, exact number (which is impossible without that strong guess), the authors propose drawing a fence around the answer. They calculate a range (a lower limit and an upper limit) where the true answer must lie.

Think of it like this: You can't tell exactly how much money is in a piggy bank without breaking it. But if you know the bank weighs between 2 and 3 pounds, and you know what kinds of coins are inside, you can say, "The money is definitely between $50 and $80." You don't know the exact cent, but you have a useful, safe range.

The paper offers two ways to build this fence:

1. The "Math Puzzle" Method (Linear Programming)

Imagine you have a giant jigsaw puzzle where some pieces are missing. You know the shape of the box and a few pieces you do have. You try to fit the missing pieces in every possible way that doesn't break the rules of the puzzle.

You find the arrangement that gives the lowest possible vaccine effectiveness.
You find the arrangement that gives the highest possible vaccine effectiveness.
The truth is somewhere in between. This method is very rigorous but can sometimes give a very wide fence (e.g., "The vaccine is between 0% and 100% effective"), which isn't very helpful.

2. The "Common Sense" Method (Monotonicity)

This method adds a little bit of "common sense" to the math puzzle. It assumes that if a hidden factor (like optimism) makes someone more likely to think they have the vaccine, it probably also makes them more likely to get infected (by taking more risks). It assumes these things move in the same direction.

By adding this reasonable assumption, the fence gets much tighter. Instead of "0% to 100%," you might get "36% to 47%." This is much more useful for doctors and policymakers.

The Real-World Test: The "Sore Arm" Trial

The authors tested their new math on a real COVID-19 vaccine trial (ENSEMBLE2).

The Situation: People who got the vaccine got sore arms (Side Effects). People who got the placebo didn't. This broke the blind.
The Result: The standard calculation said the vaccine was about 39% effective.
The New Calculation:
- Without extra assumptions, the "fence" was wide (not very helpful).
- With the "common sense" assumption (that feeling protected makes people riskier), the fence tightened to 36.5% to 47.0%.
- This suggested that the vaccine was indeed working, but that the "real-world" effectiveness (including behavior changes) was slightly different from the pure biological effect.

Why This Matters

In a world where people often guess their treatment status (because of side effects, news, or rumors), we can't always get a perfect, single number for how well a vaccine works.

This paper gives us a toolbox to say: "We can't know the exact number, but if our assumptions hold, the answer is at least X and at most Y." This helps policymakers make safer decisions without needing to pretend that human behavior is perfectly predictable or that hidden personality traits don't exist.

In short: When the blind is broken, don't panic and guess a single number. Instead, build a sturdy fence around the truth using math and common sense, so we know the range in which the answer must lie.

Technical summary of the paper "Nonparametric bounds for vaccine effects in randomized trials" by Axelrod, Obolski, and Nevo.

1. Problem Statement

Randomized Controlled Trials (RCTs) for vaccines are typically designed to be blinded to isolate the immunological effect of the vaccine. However, "broken blinding" occurs when participants deduce their treatment assignment (e.g., due to side effects like local reactions), leading to behavioral changes (e.g., risk compensation).

The Core Issue: When blinding is broken, the standard Vaccine Efficacy (VE) estimate confounds the immunological effect with behavioral effects.
Limitations of Existing Methods: Recent work (Stensrud et al., 2024) proposed causal estimands to separate these effects by measuring participants' belief about their treatment status. However, these methods rely on a strong identification assumption: that there are no unmeasured common causes (confounders) affecting both the participant's belief and their risk of infection.
The Gap: Plausible confounders, such as personality traits (e.g., optimism leading to "wishful thinking" about vaccination and increased social exposure), likely violate this assumption. Consequently, point identification of VE is often impossible in real-world scenarios.
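
To make the mixing of immunological and behavioral effects concrete, here is a small numeric sketch. It assumes a purely hypothetical multiplicative model in which infection risk scales with both per-exposure susceptibility and amount of exposure; the function names and every number are illustrative and are not taken from the paper.

```python
def vaccine_efficacy(risk_vaccine: float, risk_placebo: float) -> float:
    """Standard VE: one minus the risk ratio (vaccine arm over placebo arm)."""
    return 1.0 - risk_vaccine / risk_placebo


def arm_risk(baseline_risk: float, susceptibility_ratio: float, exposure_ratio: float) -> float:
    """Hypothetical model: risk scales with susceptibility and with exposure."""
    return baseline_risk * susceptibility_ratio * exposure_ratio


BASELINE = 0.10        # assumed placebo-arm risk with unchanged behavior
IMMUNE_RATIO = 0.50    # assumed: the vaccine halves per-exposure susceptibility
RISKY_BEHAVIOR = 1.30  # assumed: believing you are vaccinated raises exposure by 30%

placebo = arm_risk(BASELINE, 1.0, 1.0)
vaccine_blinded = arm_risk(BASELINE, IMMUNE_RATIO, 1.0)
vaccine_unblinded = arm_risk(BASELINE, IMMUNE_RATIO, RISKY_BEHAVIOR)

ve_immunological = vaccine_efficacy(vaccine_blinded, placebo)
ve_observed = vaccine_efficacy(vaccine_unblinded, placebo)
print(f"immunological VE: {ve_immunological:.2f}")
print(f"observed VE with broken blinding: {ve_observed:.2f}")
```

Under these made-up inputs the immunological VE is 0.50, while the VE computed from final infection rates is 0.35, because the behavioral change in the vaccine arm eats into the biological protection.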

2. Methodology

The authors develop nonparametric causal bounds on Vaccine Efficacy that do not require assuming the absence of unmeasured confounding. They use two mathematical approaches: linear programming and monotonicity-based bounds.

A. Causal Framework and Notation

Variables:
- $A$ : Randomized treatment assignment (Vaccine/Placebo).
- $M$ : Message received (Blinded, Told Vaccinated, Told Placebo).
- $B$ : Belief (Does the participant believe they are vaccinated?).
- $S$ : Adverse Events (AEs), observed in the trial.
- $Y$ : Outcome (Infection).
- $U$ : Unmeasured common cause (e.g., personality, physiological frailty).
Estimands: The paper defines various VE measures:
- $VE(m)$ : Immunological effect under message $m$ .
- $VE_M(a)$ : Behavioral effect of the message at treatment level $a$ .
- $VE_T$ : Total effect (real-world scenario where everyone knows their status).
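
For readers who prefer symbols, one plausible way to write these estimands is sketched below. It assumes the usual one-minus-risk-ratio form of VE, writes $Y^{a,m}$ for the potential outcome under treatment $a$ and message $m$, and assumes a coding in which $m=1$ means "told vaccinated" and $m=0$ means "told placebo". That coding and the exact form of each ratio are assumptions made here for illustration; the paper's notation may differ.

```latex
\begin{align*}
VE(m)   &= 1 - \frac{E[Y^{a=1,\,m}]}{E[Y^{a=0,\,m}]}, \\
VE_M(a) &= 1 - \frac{E[Y^{a,\,m=1}]}{E[Y^{a,\,m=0}]}, \\
VE_T    &= 1 - \frac{E[Y^{a=1,\,m=1}]}{E[Y^{a=0,\,m=0}]}.
\end{align*}
```

Written this way, $VE(m)$ varies the treatment while holding the message fixed, $VE_M(a)$ varies the message while holding the treatment fixed, and $VE_T$ varies both together.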

B. Causal Structures (DAGs)

The authors analyze four distinct causal scenarios based on Directed Acyclic Graphs (DAGs):

Point Identification: No unmeasured confounding between Belief ( $B$ ) and Outcome ( $Y$ ).
Figure 2 (Baseline Unidentified): Unmeasured confounder $U$ affects both $B$ and $Y$ . $S$ is unobserved or does not mediate the confounding.
Figure 3a: $U$ affects $B$ and $Y$ , but $S$ is observed and affects $B$ without being affected by $U$ or directly affecting $Y$ .
Figures 3b–3d: Complex scenarios where $U$ affects $B$ , $S$ , and $Y$ , or where $S$ directly affects $Y$ .

C. Derivation Approaches

Linear Programming (LP) Bounds:
- Based on the Balke-Pearl framework.
- Formulates the problem as an optimization task to find the minimum and maximum possible values of the causal estimand, subject to constraints derived from observed data probabilities.
- Uses "potential response types" to map observed data to unobserved counterfactuals.
- Advantage: Sharp (tightest possible) bounds under the stated assumptions.
- Disadvantage: Can be computationally intensive and yield wide intervals if data is sparse.
Monotonicity-Based Bounds:
- Relies on assumptions that the unmeasured confounder $U$ has a monotonic relationship with both the belief ( $B$ ) and the outcome ( $Y$ ).
- Assumes the direction of the effect of the message ( $M$ ) on the outcome is known (e.g., believing one is vaccinated increases risk due to risk compensation).
- Advantage: Often yields significantly narrower bounds than LP methods when assumptions hold.
- Disadvantage: Sensitive to assumption violations; if the monotonicity direction is wrong, bounds can be invalid.
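
The two bounding strategies can be illustrated on a deliberately stripped-down problem: a binary belief B, a binary outcome Y, and an unmeasured U affecting both. The sketch below is not the paper's linear program, which also involves A, M, and S. It enumerates the vertices of a toy Balke-Pearl-style program over response types, then applies a Manski-Pepper-style monotone selection assumption as a stand-in for the paper's monotonicity conditions. All function names and probabilities are hypothetical.

```python
from itertools import product

# Hypothetical observed joint distribution P(B=b, Y=y) within one trial arm.
P_OBS = {(0, 0): 0.50, (0, 1): 0.10, (1, 0): 0.30, (1, 1): 0.10}


def lp_bounds(p_obs, b_target=1):
    """Bounds on P(Y^{b_target} = 1) with no assumption about U.

    Each person has a response type (B, Y^0, Y^1). An observed cell (b, y)
    fixes Y^b = y and leaves the other potential outcome free, so the
    feasible set is a product of simplices. A linear objective is optimized
    at a vertex, i.e. when each cell puts all its mass on one completion.
    """
    cells = sorted(p_obs)
    values = []
    for completion in product((0, 1), repeat=len(cells)):
        total = 0.0
        for (b, y), y_free in zip(cells, completion):
            total += p_obs[(b, y)] * (y if b == b_target else y_free)
        values.append(total)
    return min(values), max(values)


def monotone_bounds(p_obs, b_target=1):
    """Bounds assuming believers carry weakly higher risk at every belief level:
    P(Y^b = 1 | B = 1) >= P(Y^b = 1 | B = 0)."""
    p_b = sum(p for (b, _), p in p_obs.items() if b == b_target)
    joint = p_obs[(b_target, 1)]
    conditional = joint / p_b                 # P(Y = 1 | B = b_target)
    if b_target == 1:
        return joint, conditional             # non-believers cannot exceed believers
    return conditional, joint + (1.0 - p_b)   # believers cannot fall below non-believers


lp_low, lp_high = lp_bounds(P_OBS)
mono_low, mono_high = monotone_bounds(P_OBS)
print(f"no-assumption bounds: [{lp_low:.3f}, {lp_high:.3f}]")
print(f"monotone bounds:      [{mono_low:.3f}, {mono_high:.3f}]")
```

With these made-up inputs the no-assumption interval is [0.10, 0.70] and the monotone interval is [0.10, 0.25]. The narrowing comes entirely from the added assumption, which is why it matters so much whether that assumption is believable.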

3. Key Contributions

Relaxation of Strong Assumptions: The paper provides a rigorous framework for estimating VE when the "no unmeasured confounding" assumption (required for point identification) is violated.
Dual Methodology: It introduces and compares two distinct bounding strategies (LP and Monotonicity) tailored to vaccine trials with broken blinding and measured beliefs.
Causal Structure Sensitivity: The authors demonstrate how the width and validity of bounds depend critically on the underlying causal structure (specifically, the role of Adverse Events $S$ and unmeasured confounders $U$).
- They show that observing $S$ only narrows bounds if $S$ is not a confounder itself (Figure 3a). If $S$ is influenced by $U$ or affects $Y$ directly (Figures 3b–3d), $S$ cannot improve the bounds.
Real-World Application: The methods are applied to the ENSEMBLE2 COVID-19 vaccine trial, a case study where broken blinding was suspected due to differential AE rates.

4. Results

Simulation Study:
- Width: Monotonicity-based bounds were consistently narrower than LP-based bounds when assumptions held.
- Robustness: When monotonicity assumptions were violated (e.g., $U$ had opposing effects on belief and risk), the monotonicity bounds became invalid (lower bound > upper bound). LP bounds remained valid but wider.
- Misspecification: Incorrectly assuming a simpler causal structure (e.g., ignoring $S \to Y$ ) led to invalid bounds, particularly for the Total Effect ( $VE_T$ ).
ENSEMBLE2 Trial Application:
- Point Estimates: Under strong assumptions, $VE(0)$ was estimated at 44.5%.
- Bounds:
  - Without monotonicity assumptions, bounds were extremely wide (e.g., $VE(0) \in (-\infty, 97.7\%]$ ), reflecting high uncertainty.
  - With monotonicity assumptions (assuming vaccination belief increases risk), bounds narrowed significantly. For example, $VE(0)$ was bounded between 36.5% and 47.0%.
- Interpretation: The analysis suggests that while the true immunological effect cannot be pinpointed, it is likely positive and potentially higher than the naive estimate, provided behavioral risk compensation is accounted for.
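
The unbounded lower limit reported above has a simple arithmetic source. Because VE is one minus a risk ratio, an interval for VE can be pieced together from intervals for the two risks, and if the data cannot rule out a denominator risk of zero, the ratio has no finite ceiling. The sketch below uses hypothetical risk intervals and a hypothetical helper `ve_interval`. Combining endpoints this way gives valid but not necessarily sharp limits, and it is not the paper's procedure.

```python
import math


def ve_interval(risk_vax, risk_ref):
    """Interval for VE = 1 - risk_vax / risk_ref from two risk intervals.

    risk_vax and risk_ref are (low, high) tuples with 0 <= low <= high <= 1.
    The worst case pairs the highest vaccine-arm risk with the lowest
    reference risk; the best case does the opposite.
    """
    vax_low, vax_high = risk_vax
    ref_low, ref_high = risk_ref
    # A reference risk of zero makes the risk ratio unbounded (conservative choice).
    lower = -math.inf if ref_low == 0 else 1.0 - vax_high / ref_low
    upper = 1.0 - vax_low / ref_high   # assumes ref_high > 0
    return lower, upper


# Hypothetical risk intervals, not taken from ENSEMBLE2.
wide = ve_interval(risk_vax=(0.01, 0.30), risk_ref=(0.00, 0.40))
tight = ve_interval(risk_vax=(0.020, 0.025), risk_ref=(0.040, 0.045))
print(wide)
print(tight)
```

The first call returns an interval with lower end minus infinity, mirroring the shape of the no-assumption result. The second shows that once both risks are pinned to narrow ranges away from zero, the VE interval becomes finite and informative.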

5. Significance

Methodological Advancement: This work bridges the gap between idealized causal inference (point identification) and messy real-world data (unmeasured confounding) in vaccine trials. It offers a "partial identification" strategy that is robust to personality-driven confounding.
Policy Relevance: As vaccine trials increasingly face challenges with blinding (due to side effects or public awareness), this paper provides a toolkit for policymakers to bound the true protective effect of vaccines, separating it from behavioral changes.
Practical Guidance: The authors recommend using the most conservative causal structure (Figure 3d) when the true mechanism is unknown to avoid invalid bounds, even if it results in wider intervals. They also highlight the utility of monotonicity assumptions for narrowing intervals when domain knowledge supports them.
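
The caution about monotonicity can be checked by direct calculation. The sketch below builds a toy population with a binary unmeasured trait U, computes the exact observed quantities and the true risk P(Y^1 = 1), and compares the truth with a no-assumption interval and with a Manski-Pepper-style monotone interval (a stand-in for the paper's monotonicity bounds). All probabilities and names are hypothetical.

```python
def toy_bounds(p_belief_given_u, p_risk_given_u, p_u1=0.5):
    """Exact truth and bounds for P(Y^1 = 1) in a toy model with binary U.

    p_belief_given_u[u] = P(B = 1 | U = u)
    p_risk_given_u[u]   = P(Y^1 = 1 | U = u), independent of B given U
    """
    p_u = {0: 1.0 - p_u1, 1: p_u1}
    truth = sum(p_u[u] * p_risk_given_u[u] for u in (0, 1))
    p_b1 = sum(p_u[u] * p_belief_given_u[u] for u in (0, 1))
    joint = sum(p_u[u] * p_belief_given_u[u] * p_risk_given_u[u] for u in (0, 1))
    no_assumption = (joint, joint + (1.0 - p_b1))
    monotone = (joint, joint / p_b1)   # upper bound is P(Y = 1 | B = 1)
    return truth, no_assumption, monotone


def covers(interval, value):
    """True when value lies inside the closed interval."""
    low, high = interval
    return low <= value <= high


BELIEF = {0: 0.2, 1: 0.8}                        # U raises belief in being vaccinated
aligned = toy_bounds(BELIEF, {0: 0.1, 1: 0.3})   # U also raises risk
opposed = toy_bounds(BELIEF, {0: 0.3, 1: 0.1})   # U lowers risk instead

for label, (truth, no_assumption, monotone) in (("aligned", aligned), ("opposed", opposed)):
    print(label, "truth", round(truth, 3),
          "no-assumption covers:", covers(no_assumption, truth),
          "monotone covers:", covers(monotone, truth))
```

In the aligned case both intervals contain the truth. In the opposed case the monotone interval misses it while the no-assumption interval still covers it, which is the trade-off between narrowness and robustness in miniature. In this toy setup the failure shows up as an interval that excludes the truth.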

In summary, the paper establishes that while broken blinding and unmeasured confounding prevent exact calculation of vaccine efficacy, nonparametric bounds can still extract meaningful information about the magnitude and direction of vaccine effects, provided the causal structure is carefully considered and appropriate assumptions are tested.
