Is Inference Conditional on Not Rejecting a Pre-test Less Reliable than Unconditional Inference?

This paper demonstrates that conducting inference only after failing to reject a pre-test for model conditions remains valid and typically conservative, even when the estimator and pre-test are asymptotically dependent, provided the underlying conditions hold.

Original authors: Clément de Chaisemartin, Xavier D'Haultfœuille

Published 2026-04-21 · Author reviewed

This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

The Big Picture: The "Pre-Check" Dilemma

Imagine you are a chef trying to bake a perfect cake (estimating a treatment effect). You have a secret recipe (your statistical model) that works perfectly only if your ingredients are fresh and measured correctly (your assumptions hold).

In the real world, you can't be 100% sure your ingredients are perfect. So, before you bake, you do a pre-check: you smell the flour and taste the eggs to see if they are fresh.

  • If they smell bad: You throw the recipe away and don't bake (you reject the study).
  • If they smell good: You proceed to bake and present the cake to the judges (you report your results).

The Question: Does this "smell test" (the pre-test) mess up the reliability of your cake? Specifically, if you only report the cake when the ingredients passed the smell test, is your cake actually more likely to be a disaster than if you had just baked it blindly?

Many statisticians have worried that this "pre-check" creates a hidden trap, making your results look better than they really are. This paper says: Not necessarily. In fact, you might be safer than you think.


Part 1: When the Ingredients Are Actually Fresh (The "Null" Hypothesis)

Let's assume your ingredients are actually fresh. The recipe is perfect.

The Old Fear:
People thought that because you only show the cake when the smell test passes, you might be "cherry-picking" the best-looking cakes. They worried that even if the ingredients were fresh, the act of smelling them first might make the final cake less reliable than advertised (or "under-cover": in statistical terms, the confidence interval could contain the true value less often than its stated level, even though the judges are told it's trustworthy).

The Paper's Discovery:
The authors prove that if your ingredients are truly fresh, your cake is actually safer than you think.

  • The Analogy: Imagine the "smell test" and the "baking process" are two friends holding hands. If the ingredients are fresh, and you only bake when the smell test says "Go," you are actually filtering out some of the random bad luck that could have happened during baking.
  • The Result: The paper shows that your inference is conservative. This means the confidence interval (the range of flavors the judges expect) covers the true value more often than its advertised level. You are being extra careful.
    • Real-world translation: If you run a pre-test and it passes, your confidence interval effectively becomes more cautious: a nominal 95% interval covers the truth more than 95% of the time. This is good! It means you are less likely to make a false claim. You aren't under-covering; you are over-covering (being safe).

The Catch: This safety net only works if the "smell test" and the "baking" aren't perfectly synchronized in a weird way. But in most standard cases, the safety net holds.
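The mechanism in Part 1 can be sketched with a small Monte Carlo. This is not the paper's code; the setup (a jointly normal estimator error and pre-test statistic with an assumed correlation of 0.7, both mean-zero because the model condition holds) is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 200_000
rho = 0.7  # assumed correlation between pre-test statistic and estimator

# Under the null, the model condition holds: both the estimator's
# standardized error and the pre-test statistic have mean zero.
cov = [[1.0, rho], [rho, 1.0]]
est_err, pretest = rng.multivariate_normal([0.0, 0.0], cov, size=n_sims).T

crit = 1.96  # two-sided 5% critical value
covered = np.abs(est_err) <= crit   # nominal 95% CI covers the truth
passed = np.abs(pretest) <= crit    # pre-test does not reject

uncond = covered.mean()             # coverage if you always report
cond = covered[passed].mean()       # coverage given the pre-test passed

print(f"unconditional coverage:          {uncond:.3f}")
print(f"coverage given pre-test passed:  {cond:.3f}")
```

With positively dependent jointly normal statistics, the events "CI covers" and "pre-test passes" are positively correlated, so the conditional coverage comes out above the nominal 95% — the over-coverage described above.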


Part 2: When the Ingredients Are Spoiled (The "Alternative" Hypothesis)

Now, let's assume your ingredients are actually spoiled (your assumption is wrong). Maybe the flour is moldy, but you didn't notice.

The Reality:
If the ingredients are bad, the cake will taste terrible no matter what. The statistical estimate is biased.

The Comparison:
The authors ask: If the ingredients are bad, is the cake worse if you did the smell test first, compared to just baking blindly?

  • Scenario A (No Pre-Test): You bake blindly. The cake is terrible: because the estimate is biased, the confidence interval might cover the true effect only, say, 20% of the time instead of the advertised 95% (low coverage).
  • Scenario B (Pre-Test): You do the smell test. It passes (you missed the mold). You bake. The cake is terrible.

The Surprise:
The paper finds that in many common situations (like Randomized Controlled Trials or Instrumental Variables), the cake is actually less terrible in Scenario B than in Scenario A.

  • The Analogy: Imagine the "smell test" is a filter. Even if it lets some bad flour through, it filters out the worst batches of bad flour. By only baking when the test passes, you are inadvertently selecting for the "least spoiled" ingredients.
  • The Result: The "Conditional Coverage" (how often the judges are right given you passed the test) is often higher than the "Unconditional Coverage" (how often they are right if you baked everything).
    • Real-world translation: Pre-testing doesn't just fail to hurt you; in some cases, it actually protects you from the worst errors when your assumptions are slightly wrong.

Part 3: The "Difference-in-Differences" (DID) Warning

The authors do have one specific warning for a popular method called Difference-in-Differences (DID), often used in economics to study policy changes.

  • The Analogy: In DID, the "smell test" checks if two groups were moving in parallel before a policy change. If they weren't, the test fails.
  • The Problem: In these specific studies, the relationship between the "smell" and the "baking" is tricky. The paper shows that in DID studies, if the trends are slightly different (spoiled ingredients), the pre-test might not filter out the worst cases as effectively as in other methods.
  • The Data: The authors looked at 12 famous DID studies. They found that while pre-testing didn't make things much worse, it didn't offer the same "super-protection" it did in other types of studies. The cake was still a bit risky, but not a disaster.
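The DID workflow described above — check pre-trends first, report the estimate only if the check passes — can be sketched on simulated data. Everything here (group sizes, period structure, a true effect of 1.0) is a hypothetical example, not the authors' data or code:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000  # simulated units per group and period
effect = 1.0
periods = [-2, -1, 0, 1]  # two pre-treatment and two post-treatment periods

def mean_outcome(group, period):
    """Average outcome of one simulated group in one period."""
    base = 0.5 if group == "treated" else 0.0  # a level gap is fine for DID
    hit = effect if (group == "treated" and period >= 0) else 0.0
    return (base + hit + rng.normal(0.0, 1.0, n)).mean()

y = {(g, t): mean_outcome(g, t) for g in ("treated", "control") for t in periods}

# Pre-test: placebo DID on the two pre-periods (the parallel-trends check).
placebo = (y["treated", -1] - y["treated", -2]) - (y["control", -1] - y["control", -2])

# Actual DID estimate: change from the last pre-period to the first post-period.
did = (y["treated", 0] - y["treated", -1]) - (y["control", 0] - y["control", -1])

se = np.sqrt(4 / n)  # std. error of a difference of four unit-variance means
if abs(placebo / se) <= 1.96:
    print(f"pre-test passed; DID estimate: {did:.2f}")
else:
    print("pre-trends differ; do not report the DID estimate")
```

The paper's warning applies precisely to this setup: when trends differ only slightly, the placebo test often still passes, and in DID the dependence between the placebo and the estimate does not deliver the same protective selection seen in the RCT and IV examples.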

The Takeaway: Should You Stop Pre-Testing?

No. The paper argues that the "Pre-Test" is a good thing, despite the fears of some statisticians.

  1. If you are right: Pre-testing makes your results more conservative (safer). You are less likely to claim a discovery that isn't there.
  2. If you are slightly wrong: Pre-testing often acts as a shield, filtering out the worst errors and keeping your results more reliable than if you had ignored the test entirely.
  3. The Cost: The only "cost" is that you might occasionally throw away a perfectly good cake because the smell test was too sensitive (a false rejection). But the paper suggests this cost is small compared to the benefit of avoiding bad cakes.

In Simple Terms:
Think of pre-testing like a security checkpoint at an airport.

  • Old View: "Checking everyone's bags slows things down and might make the flight less efficient."
  • This Paper's View: "Checking bags actually makes the flight safer. Even if the scanner isn't perfect, the people who pass the scan are statistically less likely to be carrying a bomb than the general population. And if the scanner is right, the flight is safer than if we didn't scan anyone at all."

The authors conclude that researchers should feel comfortable doing these pre-tests. They don't break the math; they often make the results more robust.
