GMM and M Estimation under Network Dependence

This paper develops Generalized Method of Moments (GMM) and M estimators for network-dependent data by establishing a novel uniform law of large numbers that ensures their consistency and asymptotic normality, while also providing complete procedures for practical estimation and inference.

Yuya Sasaki

Published Tue, 10 Ma

Imagine you are trying to figure out the "average opinion" of a group of people, but these people aren't just sitting in a room; they are connected in a complex web of friendships, like a giant social network. Some people influence their best friends heavily, others influence their neighbors a little, and some are so far apart they don't really affect each other at all.

This is the world of Network Data.

For a long time, statisticians had great tools to analyze data where everyone is independent (like flipping a coin 1,000 times). But when data is "connected" like a social network, those old tools break down.

Recently, a team of researchers named Kojevnikov, Marmer, and Song (KMS) built a new, powerful engine to handle this kind of connected data. They figured out how to get reliable answers for simple, straight-line questions. However, they left a gap: their engine couldn't handle complex, non-linear questions (like predicting whether someone will buy a house based on a mix of income, mood, and friends' opinions).

Yuya Sasaki's paper is the missing piece that fills that gap. Here is the story of what he did, explained simply.

1. The Problem: The "One-by-One" vs. The "Whole Picture"

Imagine you are trying to find the highest point on a bumpy, foggy mountain (the "optimal answer" to your question).

  • KMS's contribution: They proved that if you stand at one specific spot and look around, you can see the ground clearly. They gave you a map for every single point.
  • The missing piece: To find the highest peak, you need to know that the ground is smooth and predictable everywhere at once, not just at one spot. If the ground is bumpy in a way you can't predict, you might get stuck in a small valley thinking it's the top.

In statistical terms, KMS had a "Pointwise Law" (it works for one spot), but Sasaki needed a "Uniform Law" (it works for the whole map at once). Without this "Uniform Law," you can't trust complex, non-linear models (like GMM or M estimators) to find the right answer.

2. The Solution: Building a "Safety Net" (The ULLN)

Sasaki's main achievement is building a Uniform Law of Large Numbers (ULLN).

Think of the network as a giant, chaotic dance floor.

  • The Old View: You watch one dancer. You know their average movement will eventually settle into a steady rhythm (this is the ordinary "Law of Large Numbers").
  • The New View (Sasaki): You need to watch every dancer on the floor simultaneously and prove that the entire crowd will eventually settle down into a calm, predictable pattern, no matter how they are connected.

Sasaki proved that if the "influence" between friends fades fast enough as you move further apart (like a whisper that gets quieter the further it travels), then the whole network will eventually calm down and behave predictably. He built a mathematical "safety net" that catches the chaos of the network, ensuring that even complex, twisting equations will settle on the right answer.
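The "fading whisper" idea can be illustrated with a toy simulation. The model below is purely hypothetical, a line network where each node's outcome mixes shocks from nearby nodes with geometrically decaying weights; it is not the paper's general setup, but it shows why averages can still settle down despite the dependence:

```python
import numpy as np

rng = np.random.default_rng(0)

def network_sample(n, decay=0.5, reach=10):
    """Toy line network: node i's outcome is a weighted mix of its own
    shock and its neighbors' shocks, with weight decay**distance --
    the 'whisper' that gets quieter the further it travels."""
    shocks = rng.standard_normal(n + 2 * reach)
    weights = decay ** np.abs(np.arange(-reach, reach + 1))
    weights /= weights.sum()
    # outcome_i = weighted sum of the shocks within `reach` of node i
    return np.convolve(shocks, weights, mode="valid")[:n]

# Even though neighbors are correlated, the sample average still
# settles toward the true mean (0) as the network grows.
for n in (100, 10_000):
    print(n, abs(network_sample(n).mean()))
```

Because the influence weights die off fast with distance, far-apart nodes are nearly independent, which is exactly the kind of condition Sasaki's ULLN formalizes.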

3. The Tools: GMM and M Estimators

Now that he has the safety net, he shows how to use two specific tools:

  • M Estimators: Imagine you are trying to guess the "center of gravity" of a wobbly table. You adjust the legs until the table stops shaking. This tool finds the best fit for your data.
  • GMM Estimators (Generalized Method of Moments): Imagine you are trying to solve a puzzle where you have more clues than you need. You use all the clues together to find the single answer that comes closest to satisfying every one of them.
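A minimal sketch of both tools, using hypothetical i.i.d. data for simplicity (the paper's point is that these same recipes stay valid when the data are network-dependent):

```python
import numpy as np

rng = np.random.default_rng(1)
y = 2.0 + rng.standard_normal(500)   # toy data with true mean theta = 2.0

grid = np.linspace(0.0, 4.0, 4001)   # candidate values of theta

# M estimator: pick the theta that minimizes an average loss over the data.
# With squared-error loss, the minimizer is just the sample mean.
m_hat = grid[np.argmin([np.mean((y - t) ** 2) for t in grid])]

# GMM: more moment conditions (clues) than parameters. At the true theta,
# both E[y - theta] = 0 and E[y^2 - theta^2 - 1] = 0 hold; GMM picks the
# theta whose averaged moments are jointly closest to zero (identity
# weighting here, for simplicity).
def objective(t):
    g = np.array([np.mean(y - t), np.mean(y ** 2 - t ** 2 - 1.0)])
    return g @ g

gmm_hat = grid[np.argmin([objective(t) for t in grid])]
print(f"M estimate {m_hat:.3f}, GMM estimate {gmm_hat:.3f}")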

Sasaki shows that with his new safety net, these tools work perfectly even on a messy, connected network. He proves that:

  1. Consistency: As you gather more data, your answer gets closer and closer to the truth.
  2. Normality: You can calculate how confident you should be in your answer (like a margin of error).

4. The Practical Guide: "How to Drive This Car"

The paper isn't just theory; it's a manual. Sasaki gives step-by-step instructions on how to:

  • Calculate the answer: How to run the math on your network data.
  • Measure the uncertainty: How to build a "confidence interval" that accounts for the fact that your friends influence each other. He suggests using a specific "kernel" (a mathematical smoothing tool) to measure how far the influence reaches before it fades away.
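The uncertainty step can be sketched as a network-HAC standard error. The example below is a hypothetical toy: a line network, dependence within a few hops, and a Bartlett kernel (one common smoothing choice; the paper states its own conditions on admissible kernels and bandwidths):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy line network: nodes i and j sit at network distance |i - j|.
n = 400
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Dependent outcomes: each node averages the shocks of neighbors within 3 hops.
shocks = rng.standard_normal(n + 6)
y = np.mean([shocks[k:k + n] for k in range(7)], axis=0)

def hac_se(y, dist, bandwidth):
    """Network-HAC standard error for the sample mean: pairs closer than
    the bandwidth get a weight that fades linearly with distance (Bartlett
    kernel); farther pairs get weight zero."""
    u = y - y.mean()
    kernel = np.clip(1.0 - dist / bandwidth, 0.0, None)
    lrv = (u @ kernel @ u) / len(y)   # long-run variance estimate
    return np.sqrt(lrv / len(y))

naive_se = y.std(ddof=1) / np.sqrt(n)   # pretends nodes are independent
se = hac_se(y, dist, bandwidth=8)
print(f"naive SE {naive_se:.4f} vs network-HAC SE {se:.4f}")
```

The naive standard error understates the uncertainty because it ignores the positive correlation between neighbors; the kernel-weighted version adds those neighbor covariances back in, which is the role the "mathematical smoothing tool" plays in the paper's inference procedure.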

The Big Picture Analogy

Think of the KMS framework as a high-speed train that can travel very fast on a straight track.

  • The Limitation: The train couldn't turn corners or climb steep hills (non-linear models).
  • Sasaki's Contribution: He didn't build a new train; he built a new set of tracks and a suspension system that allows that same high-speed train to safely navigate sharp turns and steep hills.

Why This Matters

Before this paper, if a researcher wanted to study complex behaviors in a network (like how a rumor spreads, or how stock prices influence each other in a complex way), they were stuck. They either had to use overly simple models or guess that the math would work without proof.

Sasaki's paper says: "You can now use the most advanced, complex models on network data, and we have the mathematical proof that they won't crash."

He emphasizes that while he built the suspension system, the engine (the foundational theory) belongs to KMS. He is essentially the mechanic who made the engine usable for the rest of us.