Bias and Variance of Adjusting for Instruments

This paper's simulation suggests that, within large-scale propensity score (LSPS) adjustment, including instruments whose correlation with treatment is below 0.5, in analyses whose equipoise (measured on the preference score) is above 0.5, introduces only minor bias. This supports the strategy of adjusting for many covariates rather than attempting to identify a limited set of confounders.

Original authors: Hripcsak, G., Anand, T., Chen, H. Y., Zhang, L., Chen, Y., Suchard, M. A., Ryan, P. B., Schuemie, M. J.

Published 2026-03-15

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are a detective trying to solve a mystery: Does Drug A actually help patients recover, or does it just look that way because of other factors?

In the real world, we often can't run perfect experiments where we randomly assign people to take the drug or a placebo (for many conditions that would be unethical or impossible). Instead, we look at "observational data": records of what people actually did.

The problem is Confounding.

  • The Scenario: Maybe people who take Drug A are generally healthier to begin with. If they recover faster, is it the drug, or was it their good health?
  • The Solution (Propensity Score): To fix this, statisticians use a "matchmaker" tool called a Propensity Score. It looks at hundreds of details about a patient (age, weight, other meds, past history) and boils them down to one number: how likely a patient with that profile was to receive the drug. A patient who took the drug is then compared with a patient who didn't but had a similar score. By comparing these matched groups, we hope to isolate the drug's true effect.
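To make the matchmaker idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration and none of it comes from the paper: a single yes/no "healthy" covariate, a true drug effect of 1.0, and the specific probabilities. For brevity it reweights patients by their propensity score (inverse-probability weighting) rather than matching them, which is a different but closely related way of using the same score.

```python
import random

random.seed(0)
TRUE_EFFECT = 1.0  # assumed true benefit of the drug (illustrative only)

# Simulate patients: healthier people are more likely to get the drug
# AND recover better on their own, so "healthy" is a confounder.
patients = []
for _ in range(20000):
    healthy = random.random() < 0.5
    treated = random.random() < (0.8 if healthy else 0.2)
    outcome = 2.0 * healthy + TRUE_EFFECT * treated + random.gauss(0, 1)
    patients.append((healthy, treated, outcome))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def weighted_mean(pairs):
    pairs = list(pairs)
    return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)


# Naive comparison: treated minus untreated, ignoring health.
naive = (mean(y for _, t, y in patients if t)
         - mean(y for _, t, y in patients if not t))

# Propensity score = estimated probability of treatment given covariates.
# With one binary covariate it is just the treated fraction at each level.
ps = {h: mean(t for hh, t, _ in patients if hh == h) for h in (True, False)}

# Weight each patient by 1 / P(the treatment they actually received),
# so both groups resemble the whole population.
adjusted = (
    weighted_mean((1 / ps[h], y) for h, t, y in patients if t)
    - weighted_mean((1 / (1 - ps[h]), y) for h, t, y in patients if not t)
)

print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}  (true effect {TRUE_EFFECT})")
```

In a large-scale setting the score would come from a model fitted on thousands of covariates rather than a two-row lookup table, but the logic of comparing like with like is the same.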

The Big Debate: "More is Better" vs. "Less is More"

For decades, researchers have argued about which details to feed into this matchmaker tool:

  1. The "Pick and Choose" Team: "Only include the obvious suspects (confounders). If we include too many variables, we might mess things up."
  2. The "Throw Everything In" Team (LSPS): "Include every piece of data we have before the treatment started. Let the computer figure out what matters."

The Fear: The "Pick and Choose" team worries about Instruments.
An Instrument is a sneaky variable. It influences whether someone gets the drug, but it has no effect of its own on whether they get better or worse.

  • Analogy: Imagine a doctor who always prescribes Drug A to patients who live in a specific zip code. The zip code is an "instrument." It predicts the drug, but living in that zip code doesn't make you healthier.
  • The Worry: If you accidentally include the "zip code" in your matchmaker tool, you can magnify whatever bias is left over from confounders you failed to measure and make the estimate less precise, so the drug looks worse or better than it really is.

What This Paper Did: The "Tug-of-War" Simulation

The authors (a team of medical data scientists) ran a massive computer simulation to settle this debate. They wanted to see: If we accidentally include a "sneaky" instrument in our matchmaker tool, how much does it actually hurt our results?

They set up a scenario with:

  1. A Real Confounder: A factor that messes up the results (like "good health").
  2. A Sneaky Instrument: A factor that only predicts the drug (like the "zip code").
  3. The Test: They ran the simulation thousands of times, making the "zip code" influence the drug choice more and more strongly, and watched what happened to the final answer.
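The three ingredients above can be mimicked with a toy simulation. This is only a sketch and not the authors' actual design: it assumes a linear model, a continuous "dose" instead of a yes/no drug, a confounder that is deliberately left out of the model (so that some bias remains for the instrument to act on), and made-up effect sizes. The function and parameter names are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    b, mx, my = slope(x, y), mean(x), mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(instrument_strength, n=20000, seed=0,
             true_effect=1.0, confounding=0.5):
    """Return (bias ignoring the instrument, bias adjusting for it)."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]  # confounder, left out of the model
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument (the "zip code")
    # Treatment depends on the instrument and the confounder.
    t = [instrument_strength * zi + confounding * ui + rng.gauss(0, 1)
         for zi, ui in zip(z, u)]
    # Outcome depends on treatment and the confounder, never on z directly.
    y = [true_effect * ti + confounding * ui + rng.gauss(0, 1)
         for ti, ui in zip(t, u)]
    unadjusted = slope(t, y)
    # Adjusting for z: strip z out of both t and y, then regress.
    adjusted = slope(residuals(z, t), residuals(z, y))
    return unadjusted - true_effect, adjusted - true_effect


for strength in (0.0, 0.5, 1.0, 2.0):
    bias_without, bias_with = simulate(strength)
    print(f"strength {strength:.1f}: bias ignoring instrument {bias_without:+.3f}, "
          f"bias adjusting for it {bias_with:+.3f}")
```

The gap between the two columns is the instrument's contribution to the bias. How large that gap is under realistic conditions is the question the paper's own, much more detailed simulation addresses; this toy only shows the mechanics.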

The Surprising Discovery

The results were counter-intuitive but very reassuring:

1. The "Noise" vs. The "Signal"
Even when the "sneaky instrument" was 20 times stronger at predicting who got the drug than the "confounder" was at messing up the results, including it in the model did not ruin the answer.

  • Analogy: Imagine you are trying to hear a whisper (the drug's true effect) in a noisy room.
    • The Confounding is a loud, distracting shout that drowns out the whisper.
    • The Instrument is a static hiss in the background.
    • The old fear was: "If we add a filter to remove the static, we might accidentally amplify the shout!"
    • The Reality: The authors found that even if the static hiss is incredibly loud, turning on the filter (adjusting for the instrument) only adds a tiny bit of extra noise. That is far less disruptive than leaving the shout (the confounding) unadjusted.

2. The Safety Net (Diagnostics)
The paper also looked at the safety rules used by the "Throw Everything In" team (LSPS). They have two "stop signs":

  • The Correlation Check: If a variable is too strongly linked to the drug (a correlation of 0.5 or higher), stop and check it.
  • The Equipoise Check: This measures how much the treated and untreated groups overlap in their likelihood of getting the drug. If equipoise is low (0.5 or below) and the "matchmaker" is struggling to find matches, it's a red flag.

The simulation showed that as long as these safety checks are in place, the "sneaky instruments" that slip through are too weak to cause significant damage.
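The two stop signs can be sketched as simple checks. The 0.5 thresholds come from the summary above; everything else is an assumption for illustration: the function names are hypothetical, the correlation is taken in absolute value, the preference score uses the common convention of shifting the propensity score by the overall treatment rate on the log-odds scale, and equipoise is taken as the share of patients whose preference score falls between 0.3 and 0.7.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def logit(p):
    return math.log(p / (1 - p))


def preference_score(propensity, prevalence):
    """Propensity score re-centred so 0.5 means 'no preference',
    whatever the overall share of treated patients is."""
    return 1 / (1 + math.exp(-(logit(propensity) - logit(prevalence))))


def equipoise(propensities, treated, bounds=(0.3, 0.7)):
    """Share of patients whose preference score lies inside the bounds
    (the 0.3 to 0.7 window is an assumed convention)."""
    prevalence = sum(treated) / len(treated)
    low, high = bounds
    inside = sum(low <= preference_score(p, prevalence) <= high
                 for p in propensities)
    return inside / len(propensities)


def passes_checks(covariate, propensities, treated):
    """Both stop signs clear: correlation below 0.5, equipoise above 0.5."""
    return (abs(pearson(covariate, treated)) < 0.5
            and equipoise(propensities, treated) > 0.5)


# Tiny made-up example: 1 = got the drug, 0 = did not.
treated = [1, 0, 1, 0, 1, 0]
zip_code_flag = [1, 0, 1, 0, 1, 0]           # perfectly predicts treatment
propensities = [0.6, 0.4, 0.5, 0.5, 0.55, 0.45]
print("correlation:", pearson(zip_code_flag, treated))
print("equipoise:  ", equipoise(propensities, treated))
print("passes:     ", passes_checks(zip_code_flag, propensities, treated))
```

A covariate that fails the correlation check is a candidate instrument to review by hand, and a low equipoise value suggests the two groups are too different to compare fairly.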

The Bottom Line

Don't be afraid to cast a wide net.

The study concludes that in the real world, it is much more dangerous to miss a real confounder (by trying to be too picky) than it is to accidentally include a weak instrument.

  • The Old Way: Trying to manually pick the "perfect" list of variables is like trying to find a needle in a haystack by only looking at the top layer. You might miss the needle.
  • The New Way (LSPS): Dumping the whole haystack into a sieve (using all data) and letting the computer filter it is safer. Even if a few pieces of straw (instruments) get through, they don't spoil the result. The "safety checks" (correlation and equipoise) ensure that the really bad straw gets caught.

In short: When trying to figure out if a treatment works, it's better to be inclusive and let the data speak, rather than being overly cautious and accidentally ignoring the factors that actually matter. The "noise" of instruments is manageable; the "silence" of missing confounders is fatal to the study.
