Testing for Endogeneity: A Moment-Based Bayesian Approach

Imagine you are a detective trying to figure out why a car's speed (the outcome) changes. You have a suspect: the price of the car (the treatment). You also have a list of other factors that might affect speed, like the car's weight or engine size (the controls).

In a perfect world, you could just look at the data and say, "Ah, higher prices cause lower speed." But in the real world, things are messy. Maybe the price isn't just a random number; maybe it's influenced by the same hidden factors that affect speed (like a secret deal between the manufacturer and the dealer). In statistics, we call this endogeneity. It's like the suspect is secretly talking to the witness before the trial. If you don't catch this, your conclusion will be wrong.

This paper is about building a better detective tool to catch these hidden secrets.

The Problem: The "Naive" Detective vs. The "Smart" Detective

Usually, statisticians use a standard method (the Base Model) that assumes the suspect (price) is innocent and acting independently. They say, "Let's assume price has no secret connection to the error."

But if the suspect is actually guilty (endogenous), this standard method gives you a false verdict. It's like a judge who refuses to listen to evidence of a conspiracy.

To fix this, you need a Smart Detective (the Extended Model). This detective doesn't just assume the suspect is innocent; they explicitly allow for the possibility that the suspect is connected to the hidden factors. They add a special "secret parameter" to the equation to measure that hidden connection.

The Dilemma:

If the suspect is actually innocent, the Smart Detective is overcomplicating things (using more tools than necessary).
If the suspect is guilty, the Naive Detective is lying to you, and you need the Smart Detective.

How do you decide which detective to trust? You need a test.

The Solution: The "Bayesian Scale"

The authors propose a new way to weigh the evidence using something called Bayesian Model Comparison. Think of it as a magical scale.

The Setup: You put the "Naive Detective's" theory on one side of the scale and the "Smart Detective's" theory on the other.
The Weights: Instead of just counting votes, the scale uses a special mathematical weight called Exponentially Tilted Empirical Likelihood (ETEL).
- The Analogy: Imagine you are trying to balance a scale using sand. The "Naive" theory tries to balance the sand assuming the ground is flat. The "Smart" theory allows the ground to be tilted.
- If the ground is actually flat (the suspect is innocent), the "Naive" theory wins because it's simpler and fits the flat ground perfectly. The "Smart" theory is penalized for adding unnecessary complexity.
- If the ground is actually tilted (the suspect is guilty), the "Naive" theory fails miserably—the sand spills everywhere. The "Smart" theory, which expected a tilt, balances the sand perfectly. It wins easily.

The Magic Ingredient: The "Penalty"

The genius of this paper is how it handles the "penalty" for complexity.

In many tests, you have to manually decide how much to punish a complex model. Here, the penalty happens automatically.

When the Smart Detective is right (the suspect is guilty), the evidence is so strong that it drowns out the penalty for being complex.
When the Naive Detective is right (the suspect is innocent), the Smart Detective's extra complexity becomes a burden. The scale naturally tips back to the simpler model.

The authors prove mathematically that as you get more data (more sand, more witnesses), this scale becomes consistent: under the paper's assumptions, the probability that it picks the right detective approaches one.

Real-World Examples

The paper tests this on two famous problems:

Car Prices: Does the price of a car affect how many people buy it?
- The Trap: High prices might be set because the car is popular (demand drives price), not just because price drives demand.
- The Result: The test correctly identified that price is "guilty" (endogenous). When they accounted for this, the estimated effect of price on demand was larger in magnitude than the estimate from the naive model.
Airline Tickets: Do ticket prices affect how many people fly?
- The Trap: Airlines might raise prices on popular routes, making it look like high prices don't stop people from flying.
- The Result: In this specific case, the test found that prices were actually "innocent" (exogenous). The simpler model was the right one.

Why This Matters

Before this paper, Bayesian statisticians (who use probability to update their beliefs) often had to assume variables were innocent. If that assumption was wrong, their estimates could be biased and their conclusions invalid.

This paper gives them a self-correcting mechanism. It's like giving the detective a lie detector that automatically adjusts its sensitivity based on the evidence. You don't need to guess if the suspect is guilty; the math figures it out for you, balancing the need for simplicity against the need for accuracy.

In short: This paper builds a smarter, more honest way to test if your variables are "talking" to each other behind your back, so that your conclusions about cause and effect rest on a model the data actually support.

Technical Summary of "Testing for Endogeneity: A Moment-Based Bayesian Approach" by Siddhartha Chib, Minchul Shin, and Anna Simoni

1. Problem Statement

In Bayesian estimation of linear regression models, a standard assumption is that regressors are exogenous (uncorrelated with the error term). However, in empirical practice, this assumption is often violated (endogeneity), leading to biased estimates and invalid inference. While frequentist methods (e.g., Durbin-Wu-Hausman tests) exist for detecting endogeneity, they do not translate naturally into the Bayesian framework, which typically relies on model comparison rather than parameter testing.

The paper addresses the challenge of testing for endogeneity within a Bayesian framework without imposing strong parametric distributional assumptions on the joint distribution of the data. The goal is to develop a procedure that consistently selects the correct model (exogenous vs. endogenous) as the sample size grows.
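A small simulation can make the bias concrete. The sketch below assumes a hypothetical data-generating process (one scalar treatment, one instrument, unit-variance normal errors with correlation `rho`); none of these choices come from the paper, and the function names are illustrative only. It compares the least-squares slope with a simple instrumental-variable slope.

```python
import numpy as np

def simulate(n, rho, beta=1.0, seed=0):
    """Draw (y, x, z) from a toy linear model; rho = corr(eps, u) controls endogeneity."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument, independent of both errors
    cov = np.array([[1.0, rho], [rho, 1.0]])
    eps, u = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    x = z + u                                   # treatment; correlated with eps when rho != 0
    y = beta * x + eps                          # outcome equation (no controls, no intercept)
    return y, x, z

def ols_slope(y, x):
    """Least-squares slope without intercept (all variables have mean zero)."""
    return float(x @ y / (x @ x))

def iv_slope(y, x, z):
    """Simple IV slope using z as the single instrument for x."""
    return float(z @ y / (z @ x))

y, x, z = simulate(n=20_000, rho=0.8)
print("OLS:", ols_slope(y, x), "IV:", iv_slope(y, x, z))
```

In this toy design the population least-squares slope is $\beta + \rho/2$ (since $\mathrm{Cov}(x, \varepsilon) = \rho$ and $\mathrm{Var}(x) = 2$), so with $\rho = 0.8$ it settles near 1.4 rather than the true value of 1, while the IV slope stays close to 1.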

2. Methodology

The authors propose a Bayes Factor test based on the Exponentially Tilted Empirical Likelihood (ETEL) framework. This approach avoids specifying a parametric likelihood function, relying instead on moment conditions.

A. Model Specification

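The authors define two competing models within a semiparametric linear regression setting:
$y = x'\beta + z_1'\gamma + \varepsilon$
where $x$ is the potentially endogenous treatment, $z_1$ are exogenous controls, and $z_2$ are instrumental variables excluded from the outcome equation. Writing $\theta = (\beta, \gamma)$, the residual function is $\varepsilon(\theta) = y - x'\beta - z_1'\gamma$.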

Base Model ($M_b$): Assumes exogeneity. It imposes the moment restrictions:
$E[\varepsilon(\theta)x] = 0, \quad E[\varepsilon(\theta)z_1] = 0, \quad E[\varepsilon(\theta)z_2] = 0$
If $x$ is truly endogenous, this model is misspecified.
Extended Model ($M_e$): Allows for endogeneity by explicitly parameterizing the covariance between the error and the endogenous variable ($v = E[\varepsilon x]$). The moment restrictions become:
$E[\varepsilon(\theta)x] = v, \quad E[\varepsilon(\theta)z_1] = 0, \quad E[\varepsilon(\theta)z_2] = 0$
This model is correctly specified under both exogeneity ($v=0$) and endogeneity ($v \neq 0$).
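The two sets of moment restrictions differ only in the first block, which can be seen by writing them as per-observation moment functions. The sketch below is an illustration, not the authors' code: it assumes a scalar treatment `x`, takes $\theta = (\beta, \gamma)$ as read off the regression equation, and uses hypothetical function names.

```python
import numpy as np

def residual(theta, y, x, z1):
    """eps(theta) = y - x*beta - z1 @ gamma, with theta = (beta, gamma) and scalar x."""
    theta = np.asarray(theta, dtype=float)
    beta, gamma = theta[0], theta[1:]
    return np.asarray(y, dtype=float) - np.asarray(x) * beta - np.asarray(z1) @ gamma

def moments_base(theta, y, x, z1, z2):
    """n-by-m matrix of moment functions under M_b: eps*x, eps*z1, eps*z2."""
    e = residual(theta, y, x, z1)
    return np.column_stack([e * x, e[:, None] * z1, e[:, None] * z2])

def moments_extended(theta, v, y, x, z1, z2):
    """Same moments under M_e, except that the first one is recentred by v."""
    g = moments_base(theta, y, x, z1, z2)
    g[:, 0] -= v            # E[eps * x] = v instead of 0
    return g
```

Setting `v = 0` in `moments_extended` reproduces `moments_base`, which mirrors the statement that $M_e$ remains correctly specified under exogeneity.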

B. Estimation via ETEL

The authors utilize the ETEL framework (Schennach, 2005) to construct the likelihood.

ETEL Weights: The likelihood is constructed by finding weights $q_i$ that minimize the Kullback-Leibler (KL) divergence from the empirical distribution while satisfying the moment conditions.
Posterior: The posterior distribution is proportional to the prior times the ETEL likelihood.
Bayes Factor: The test statistic is the ratio of the marginal likelihoods of the two models:
$BF_{eb} = \frac{m(w_{1:n} | M_e)}{m(w_{1:n} | M_b)}$
If $\log(BF_{eb}) > 0$, the extended model (endogeneity) is preferred; otherwise, the base model (exogeneity) is selected.
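For a fixed parameter value, the ETEL weights can be computed through the usual dual problem: find the tilting vector $\lambda$ that minimizes the sample average of $\exp(\lambda' g_i)$, then set $q_i \propto \exp(\lambda' g_i)$. The sketch below is a minimal NumPy illustration under that standard formulation; the damped Newton solver and the function names are choices made here, not taken from the paper. It takes an n-by-m matrix `g` of moment functions evaluated at the given parameter.

```python
import numpy as np

def etel_weights(g, tol=1e-8, max_iter=100):
    """Return (q, lam): ETEL weights and tilting vector for an n-by-m moment matrix g.

    Assumes zero lies in the interior of the convex hull of the rows of g;
    otherwise the weights do not exist and the iteration will not converge.
    """
    g = np.asarray(g, dtype=float)
    n, m = g.shape
    lam = np.zeros(m)
    for _ in range(max_iter):
        w = np.exp(g @ lam)
        grad = g.T @ w / n                      # gradient of mean(exp(g @ lam))
        if np.linalg.norm(grad) < tol:
            break
        hess = (g * w[:, None]).T @ g / n       # Hessian (positive semidefinite)
        step = np.linalg.solve(hess, grad)
        t, f0 = 1.0, w.mean()
        # Backtrack until the convex objective does not increase.
        while np.exp(g @ (lam - t * step)).mean() > f0 and t > 1e-8:
            t *= 0.5
        lam = lam - t * step
    w = np.exp(g @ lam)
    return w / w.sum(), lam

def log_etel(g):
    """Log ETEL likelihood: sum of log weights at the given moment matrix."""
    q, _ = etel_weights(g)
    return float(np.log(q).sum())
```

At the solution the weights sum to one and satisfy the moment conditions exactly, that is, `q @ g` is numerically zero. When the sample moments already average to zero, the weights reduce to the empirical weights $1/n$.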

C. Marginal Likelihood Decomposition

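Using the Chib (1995) identity, the log-marginal likelihood is decomposed into:
$\log m(w_{1:n}|M) = \log \pi(\theta^*) + \log \hat{q}(w_{1:n}|\theta^*) - \log \pi_n(\theta^*|w_{1:n})$
where $\theta^*$ is a fixed point in the parameter space, $\pi$ is the prior density, $\hat{q}$ is the ETEL likelihood, and $\pi_n$ is the posterior density. The authors establish that, asymptotically, the log-marginal likelihood consists of: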

A term related to the KL divergence between the true data distribution and the closest distribution satisfying the model's moment restrictions.
A penalty term proportional to the number of parameters (similar to BIC), arising from the Jacobian of a local parameter transformation.
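The Chib (1995) identity holds at any point $\theta^*$, which is easy to check in a model where everything is available in closed form. The sketch below deliberately swaps in a conjugate normal model with a parametric likelihood (known unit variance, prior $\theta \sim N(0, \tau^2)$) in place of ETEL, purely to illustrate the identity; in the paper's setting the posterior ordinate has no closed form and would have to be estimated from MCMC output. All names are illustrative.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x (elementwise)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def chib_log_marginal(y, theta_star, tau2=4.0):
    """log m(y) = log prior + log likelihood - log posterior, all evaluated at theta_star."""
    y = np.asarray(y, dtype=float)
    n = y.size
    post_var = 1.0 / (n + 1.0 / tau2)           # conjugate posterior variance
    post_mean = post_var * y.sum()              # conjugate posterior mean
    log_prior = log_normal_pdf(theta_star, 0.0, tau2)
    log_lik = log_normal_pdf(y, theta_star, 1.0).sum()
    log_post = log_normal_pdf(theta_star, post_mean, post_var)
    return float(log_prior + log_lik - log_post)

def exact_log_marginal(y, tau2=4.0):
    """Closed-form log marginal likelihood: y ~ N(0, I + tau2 * 11')."""
    y = np.asarray(y, dtype=float)
    n = y.size
    quad = (y ** 2).sum() - tau2 / (1 + n * tau2) * y.sum() ** 2
    return float(-0.5 * (n * np.log(2 * np.pi) + np.log(1 + n * tau2) + quad))
```

Evaluating `chib_log_marginal` at two different values of `theta_star` returns the same number as `exact_log_marginal`, up to floating-point error. In practice a high-density point is preferred because the posterior ordinate is estimated more accurately there.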

3. Key Contributions

The paper makes several significant theoretical and methodological contributions:

Explicit Model Construction for Endogeneity: Unlike previous work (e.g., Chib et al., 2018) which focused on general moment condition model comparison, this paper explicitly constructs the specific base and extended models required to test the hypothesis of endogeneity.
Existence Assumption for ETEL: The authors introduce a novel assumption guaranteeing that the ETEL function exists in a neighborhood of the true parameter with probability approaching one. This addresses a gap in existing ETEL literature where the feasible set of the optimization problem might be empty for certain parameter values.
Direct Proof of Asymptotic Equivalence: They provide a more direct proof that the log-ETEL function is asymptotically equivalent to a quadratic function. This leverages the linearity of the IV regression structure, avoiding the heavy empirical process theory used in previous literature.
Consistency of the Bayes Factor: The paper proves that the proposed test is consistent from a frequentist perspective:
- If $x$ is exogenous, the Bayes factor selects the Base Model ($M_b$) with probability approaching 1 (due to the parsimony penalty, as $M_b$ has fewer parameters).
- If $x$ is endogenous, the Bayes factor selects the Extended Model ($M_e$) with probability approaching 1 (because the misspecification penalty in $M_b$ dominates the parameter penalty).
New Asymptotic Representation: They derive a new representation of the log-marginal ETEL, showing it behaves like a penalized log-ETEL criterion where the penalty arises endogenously from posterior concentration, rather than being imposed ad hoc.
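The two consistency cases can be pictured with a stylized, BIC-type approximation to the log Bayes factor: a misspecification term that grows linearly in $n$ minus a complexity term that grows like $\log n$. The function below is only a schematic under that assumption. It is not the paper's asymptotic expansion, it ignores terms that stay bounded in $n$, and `kl_base` is a hypothetical stand-in for the per-observation KL cost of imposing the base model's restrictions.

```python
import numpy as np

def stylized_log_bf(n, kl_base, extra_params=1):
    """Schematic log BF_eb: n * (KL cost of the base model) minus a BIC-type penalty."""
    return n * kl_base - 0.5 * extra_params * np.log(n)

for n in (250, 1000, 4000):
    # kl_base = 0 mimics exogeneity; kl_base = 0.01 mimics mild endogeneity.
    print(n, stylized_log_bf(n, 0.0), stylized_log_bf(n, 0.01))
```

With `kl_base = 0` the value is negative and keeps falling as `n` grows, so the base model is favored; with any fixed `kl_base > 0` the linear term eventually dominates and the extended model is favored.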

4. Results

Theoretical Results

Theorems 4.4 and 4.5: Establish the consistency of the testing procedure. The test correctly identifies the data-generating process (exogenous vs. endogenous) as $n \to \infty$.
Bernstein-von Mises Theorems: The authors prove that the posterior distributions of the parameters (both in the base and extended models) converge to a Normal distribution, validating the use of standard asymptotic approximations for the marginal likelihood calculation.
Stochastic LAN: They establish the Stochastic Local Asymptotic Normality (LAN) property for the log-ETEL function, a critical step for proving the consistency of the Bayes factor.

Empirical and Simulation Results

Monte Carlo Simulations: The authors simulate data with varying degrees of endogeneity ($\rho$). The results show that the Bayes factor correctly selects the extended model even for small sample sizes ($n=250$) and small degrees of endogeneity, outperforming frequentist GMM-based criteria (AIC, BIC, HQIC) in finite samples, particularly when endogeneity is weak.
Real Data Application 1 (Automobile Demand): Using the classic BLP (Berry, Levinsohn, Pakes) dataset, the authors test the endogeneity of automobile prices. The Bayes factor strongly favors the extended (endogenous) model. The estimated price elasticity is larger in magnitude when endogeneity is accounted for, and the inclusion of nonlinear controls further refines the estimate.
Real Data Application 2 (Airline Traffic): Using clustered longitudinal data on airfares and passenger volume, the framework is applied to a panel setting. The test suggests that airfares can be treated as exogenous in this specific dataset, demonstrating the method's flexibility with clustered data structures.

5. Significance

This paper bridges a critical gap in econometric methodology by providing a rigorous, distribution-free Bayesian test for endogeneity.

Robustness: By relying on moment conditions (ETEL) rather than parametric likelihoods, the method is robust to misspecification of the joint distribution of errors and regressors.
Bayesian Model Selection: It reframes endogeneity testing not as a hypothesis test on a parameter (which is difficult in Bayesian settings with nuisance parameters) but as a model selection problem, which is more natural for Bayesian inference.
Parsimony vs. Misspecification: The results highlight a sophisticated trade-off: when the null hypothesis (exogeneity) is true, the Bayes factor correctly penalizes the more complex extended model (Occam's razor). When the null is false, the penalty for misspecification in the base model overwhelms the complexity penalty, leading to the correct selection of the extended model.
Practical Utility: The method is computationally feasible using standard MCMC techniques (Metropolis-Hastings) and has been successfully applied to complex real-world datasets with high-dimensional controls and instruments.

In summary, the paper offers a theoretically sound and practically applicable solution for one of the most persistent problems in econometrics, extending the toolkit for causal inference in a Bayesian framework.