Comparison of methods for assessing effects of risk factors on disease progression in Mendelian randomization under index event bias

This paper evaluates statistical methods for mitigating index event bias in Mendelian randomization studies of disease progression, finding that while no single approach is universally effective, a strategic framework based on data availability and biological context can guide method selection.

Original authors: Zhang, L., Higgins, I. A., Dai, Q., Gkatzionis, A., Quistrebert, J., Bashir, N., Dharmalingam, G., Bhatnagar, P., Gill, D., Liu, Y., Burgess, S.

Published 2026-03-02

Original paper licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

The Big Picture: The "Survivor" Problem

Imagine you are a detective trying to figure out why some people get sick and stay sick, while others get sick and recover quickly. You want to know if a specific habit (like eating too much sugar) causes the sickness to get worse.

To solve this, you decide to look at a group of people who already have the disease. You compare those who got very sick to those who got better.

Here is the trap: By only looking at people who already have the disease, you have accidentally created a biased group. You've filtered out everyone who was so healthy they never got sick in the first place.

In epidemiology and genetics, this is called index event bias, a form of selection (collider) bias. It's like trying to figure out why some cars break down by only looking at cars that are currently in the repair shop. You might conclude that "red cars break down more often," but that's only because the red cars that didn't break down are still driving around on the highway, and you didn't count them.

This paper asks: How do we fix our detective work when we are forced to only look at the "repair shop" (the sick people)?
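
To see the trap in numbers, here is a minimal simulation sketch in Python. Everything in it is an illustrative assumption rather than the authors' setup: the variable names, the cutoff of 1.0 for getting sick, and the hidden factor u that drives both getting sick and getting worse. The risk factor g has no real effect on progression, yet among the sick it ends up looking protective.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n=50_000, seed=1):
    """Return (slope in everyone, slope in cases only). All settings are assumed."""
    rng = random.Random(seed)
    g_all, p_all, g_cases, p_cases = [], [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)  # risk factor (or a genetic score for it)
        u = rng.gauss(0, 1)  # hidden cause of both incidence and progression
        diseased = g + u + rng.gauss(0, 1) > 1.0  # the "index event"
        progression = u + rng.gauss(0, 1)  # g has NO true effect here
        g_all.append(g)
        p_all.append(progression)
        if diseased:  # the "repair shop": only cases are kept
            g_cases.append(g)
            p_cases.append(progression)
    return slope(g_all, p_all), slope(g_cases, p_cases)

slope_everyone, slope_cases_only = simulate()
print(f"everyone:   {slope_everyone:+.3f}")
print(f"cases only: {slope_cases_only:+.3f}")
```

With these made-up settings, the slope in everyone should sit near zero, while the slope among cases only should come out clearly negative. Nothing about g changed; only the filtering did.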


The Tools in the Detective's Kit

The authors tested five different "tools" (statistical methods) to see which one could fix this bias. They ran thousands of computer simulations to see which tool worked best.

1. The "Re-Weighting" Scale (Inverse-Probability Weighting)

  • The Analogy: Imagine you have a bag of marbles, but you only see the red ones because the blue ones fell out of the bag. To fix this, you take every red marble you see and say, "Okay, for every red marble I see, I'll pretend there are actually 10 red marbles in the whole bag." You are artificially inflating the weight of the people you do see to represent the people you don't see.
  • The Result: This works pretty well, BUT it requires you to have the original, full list of everyone (individual-level data). If you only have a summary report (like a news headline), you can't use this tool. Also, if your guess about how many blue marbles fell out is wrong, your whole calculation is wrong.
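
The re-weighting idea can be sketched in a few lines of Python. This is a toy example under loud assumptions, not the authors' implementation: the common cause c is measured, and the chance of getting sick (prob_sick) is taken as known exactly. In practice that probability would have to be estimated from individual-level data, which is exactly where a wrong guess would throw off the result.

```python
import math
import random

def weighted_slope(x, y, w):
    """Weighted least-squares slope of y on x (with intercept)."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

rng = random.Random(2)
g_cases, p_cases, weights = [], [], []
for _ in range(200_000):
    g = rng.gauss(0, 1)  # risk factor
    c = rng.gauss(0, 1)  # measured common cause of incidence and progression
    # Assumed selection model; treated as known for this sketch.
    prob_sick = 1 / (1 + math.exp(-(-1 + g + c)))
    if rng.random() < prob_sick:  # only cases are observed
        g_cases.append(g)
        p_cases.append(c + rng.gauss(0, 1))  # g has no true effect on progression
        weights.append(1 / prob_sick)  # each case stands in for 1/prob_sick people

naive_slope = weighted_slope(g_cases, p_cases, [1.0] * len(g_cases))
ipw_slope = weighted_slope(g_cases, p_cases, weights)
print(f"unweighted (biased): {naive_slope:+.3f}")
print(f"inverse-probability weighted: {ipw_slope:+.3f}")
```

The unweighted slope among cases should be negative even though the true effect is zero, and the weighted slope should move back toward zero. People who were unlikely to get sick but did anyway receive large weights, which is also why this approach can be noisy.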

2. The "Magic Mirror" (Heckman's Method)

  • The Analogy: This method tries to use a "magic mirror" (a special genetic clue) that tells you who would have gotten sick but didn't. It tries to reconstruct the missing people.
  • The Result: In this study, the mirror was a bit foggy. It struggled to give clear answers, especially when the data was complex. It's a bit too rigid for this specific job.

3. The "Slope Hunter" (Slope-Hunter)

  • The Analogy: Imagine you are looking at a hill. You know some people are sliding down just because of gravity (the disease event), and some are sliding because they were pushed (the risk factor). This method tries to find the "slope" of the hill by looking at a huge crowd of people and guessing which ones are just sliding naturally.
  • The Result: This tool failed in the simulations. Instead of removing the bias, it kept making things worse and created a lot of false alarms (false-positive findings). It was like trying to find a needle in a haystack by throwing the whole haystack at the wall.

4. The "Two-Pronged Fork" (Multivariable Methods)

  • The Analogy: Instead of just looking at the "sick" group, this method tries to look at two things at once: "Who got sick?" AND "How sick did they get?" It uses a special genetic fork to separate the people who got sick just by bad luck from those who got sick because of the risk factor.
  • The Result: This was the best tool, but with a catch. It only works if you have a special set of genetic clues that affect getting sick but don't affect how sick you get. If the same genetic clues affect both, the fork gets stuck. It's like trying to separate salt from sugar when they are already mixed in the same shaker.
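
One common way to set up a multivariable analysis on summary statistics is to regress each variant's association with progression on its associations with the risk factor and with getting sick. The Python sketch below uses that setup on synthetic numbers built so that the linear bias model holds by design. The names b_risk, b_inc and b_prog, the values of TRUE_EFFECT and BIAS_SLOPE, and the regression form itself are assumptions for illustration; whether they match the authors' exact estimator is not confirmed by this summary.

```python
import random

def univariable_slope(x, y):
    """No-intercept least-squares slope of y on x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def multivariable_slopes(x1, x2, y):
    """No-intercept least squares of y on (x1, x2) via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # zero if x1 and x2 are exactly proportional
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

TRUE_EFFECT = 0.3   # assumed true effect of the risk factor on progression
BIAS_SLOPE = -0.5   # assumed distortion carried by the incidence association

rng = random.Random(7)
b_risk, b_inc, b_prog = [], [], []
for j in range(200):
    if j < 100:  # variants that act on incidence through the risk factor
        risk = rng.gauss(0, 0.1)
        inc = 0.5 * risk
    else:        # the "special clues": variants that affect incidence only
        risk = 0.0
        inc = rng.gauss(0, 0.1)
    b_risk.append(risk)
    b_inc.append(inc)
    b_prog.append(TRUE_EFFECT * risk + BIAS_SLOPE * inc + rng.gauss(0, 0.005))

uni_effect = univariable_slope(b_risk, b_prog)
mv_effect, mv_bias = multivariable_slopes(b_risk, b_inc, b_prog)
print(f"univariable estimate:   {uni_effect:.3f}")
print(f"multivariable estimate: {mv_effect:.3f}")
```

In this toy data the univariable estimate should land far below the assumed true effect of 0.3, while the multivariable estimate should recover it closely. If the incidence-only variants are removed, every incidence association becomes proportional to the risk-factor association, det drops to zero, and the two effects can no longer be separated: the fork gets stuck.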

5. The "Super-Fork" (CWBLS)

  • The Analogy: This is a fancy version of the Two-Pronged Fork designed to handle weak clues.
  • The Result: It worked well, but sometimes it was a bit too cautious, missing real effects.

The Real-World Test: COVID-19

The authors didn't just play with computer simulations; they tested these tools on real data about COVID-19. They asked:

  1. Does being overweight (BMI) make you more likely to catch COVID?
  2. Does being overweight make the disease worse if you already have it?

  • The Finding: When they looked at people who were hospitalized (the "repair shop"), the bias made it look like being overweight was less dangerous than it actually was.
  • The Fix: The "Two-Pronged Fork" (Multivariable) and the "Re-Weighting Scale" (IPW) were able to correct this. They showed that yes, being overweight does make the disease worse, even though the raw data from the hospital made it look like it didn't matter as much.

However, for a drug target (IL6R), the tools struggled because the genetic clues were "confused" (they affected both catching the virus and the severity), which illustrates that no single tool is perfect for every situation.


The Final Verdict: No Silver Bullet

The main takeaway from this paper is simple: There is no magic wand.

  • If you have the full list of everyone (individual data), use the Re-Weighting Scale.
  • If you only have summary reports (like news headlines) and you have special genetic clues that separate "getting sick" from "getting sicker," use the Two-Pronged Fork.
  • If you don't have those special clues, you might be stuck. In that case, the authors suggest: Don't try to study how the disease gets worse. Just study how people get the disease in the first place. It's simpler, less biased, and often gives you the answer you need anyway.

In short: When studying disease progression, be very careful. The "sick" group you are looking at is a filtered, biased sample. You need the right tool to fix the filter, but sometimes the best advice is to stop looking at the filter and look at the whole picture instead.
