arXiv: econ.EM, stat.AP

Is Productivity Advantage of Cities Really Down To Mean and Variance?

1. Problem Statement and Motivation

The central economic puzzle addressed is the systematic productivity advantage of firms located in dense urban areas compared to those in less dense regions. Two primary mechanisms are hypothesized to drive this:

Agglomeration Economies: Dense environments inherently make firms more productive (shifting the entire productivity distribution).
Firm Selection: Intense competition in dense areas culls less productive firms, truncating the left tail of the productivity distribution.

The Combes et al. (2012) Framework:
To disentangle these mechanisms, Combes et al. (2012) (hereafter CDGPR12) proposed a decomposition method that rests on a critical assumption: the Total Factor Productivity (TFP) distributions in dense (Above-Median Density, AMD) and less dense (Below-Median Density, BMD) areas are identical up to three parameters:

Location ( $\mu$ ): Reflects agglomeration.
Scale ( $\sigma$ ): Reflects agglomeration.
Left-tail truncation ( $\xi$ ): Reflects selection.

The Gap:
While CDGPR12's framework is widely used for policy evaluation (e.g., tax incentives, infrastructure), its core assumption of distributional equality (up to these parameters) had not previously been tested directly against data.

Challenge 1: TFP is unobserved and must be estimated, introducing measurement noise.
Challenge 2: Naive comparisons of noisy estimates can conflate true heterogeneity with estimation error, leading to invalid inference (Jochmans and Weidner, 2024).
Consequence: If the assumption fails (e.g., if distributions differ in shape or higher moments), the CDGPR12 decomposition is biased, potentially misguiding policy regarding agglomeration vs. competition.

2. Data and Estimation Strategy

Data Source:

Source: Spanish administrative firm-level data (Banco de España's CBI dataset).
Period: 2000–2019.
Coverage: ~80% of incorporated non-financial Spanish firms.
Classification: Firms are categorized into AMD and BMD areas based on the "experienced density" measure (de la Roca and Puga, 2017).

TFP Estimation:
The authors follow the CDGPR12 two-step procedure:

Production Function Estimation: For each sector $s$ and area $a$ , they estimate a Cobb-Douglas production function using the Ackerberg, Caves, and Frazer (2015) control function approach to handle simultaneity bias.
$V_{it} = \exp(\theta_i) K_{it}^{\beta_1} L_{it}^{\beta_2} \exp(U_{it} + \beta_{0,t})$
TFP Calculation: Firm-level TFP ( $\hat{\theta}_i$ ) is calculated as the time average of the firm's residuals from the estimated production function.
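
The second step can be sketched in a few lines. Taking logs of the production function gives $\log V_{it} = \theta_i + \beta_1 \log K_{it} + \beta_2 \log L_{it} + \beta_{0,t} + U_{it}$, so averaging the log residuals over time recovers $\theta_i$ up to the average shock. The sketch below is a minimal illustration, not the authors' code: it assumes a balanced panel and treats the coefficients as already estimated (the control function step is not shown), and the function name `firm_tfp` and the toy values are hypothetical.

```python
import numpy as np

def firm_tfp(log_v, log_k, log_l, beta1, beta2, beta0_t):
    """Time-averaged log residual per firm.

    log_v, log_k, log_l: arrays of shape (n_firms, n_periods)
    beta0_t: array of shape (n_periods,) holding the period intercepts
    The coefficients are assumed to come from a first-step estimator.
    """
    resid = log_v - beta1 * log_k - beta2 * log_l - beta0_t[None, :]
    return resid.mean(axis=1)

# Toy balanced panel with made-up coefficients (illustration only).
rng = np.random.default_rng(0)
n_firms, n_periods = 5, 15
theta = rng.normal(size=n_firms)
log_k = rng.normal(size=(n_firms, n_periods))
log_l = rng.normal(size=(n_firms, n_periods))
beta0_t = np.linspace(0.0, 0.3, n_periods)

# Shocks are set to zero here so that theta is recovered exactly.
log_v = theta[:, None] + 0.3 * log_k + 0.6 * log_l + beta0_t[None, :]
theta_hat = firm_tfp(log_v, log_k, log_l, 0.3, 0.6, beta0_t)
```

With non-zero shocks, `theta_hat` would equal `theta` plus the firm's average shock, which is the estimation noise the testing procedure has to account for.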

Sample Restriction (Crucial for Validity):
To ensure valid inference and control for estimation noise, the authors restrict the sample to firms observed for at least 15 periods.

Economic Rationale: Aligns with the long-run theoretical framework where selection effects manifest; excludes recently entered, potentially unproductive firms not yet culled.
Statistical Rationale: Reduces the variance of the TFP estimator ( $\hat{\theta}$ ), making the noise manageable for distributional testing.
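
A back-of-the-envelope calculation illustrates the statistical rationale. Assume, purely for illustration, that the production function coefficients are known and that the shocks $U_{it}$ are uncorrelated over time with common variance $\sigma_U^2$ (a symbol introduced here, distinct from the scale parameter $\sigma$). For a firm observed for $T_i$ periods, the estimation error and its variance are then:

$\hat{\theta}_i - \theta_i = \frac{1}{T_i}\sum_{t=1}^{T_i} U_{it}, \qquad \mathrm{Var}(\hat{\theta}_i - \theta_i) = \frac{\sigma_U^2}{T_i} \le \frac{\sigma_U^2}{15} \quad \text{for } T_i \ge 15$

Under these assumptions the noise variance falls in proportion to $1/T_i$, so requiring at least 15 periods caps it at one fifteenth of the per-period shock variance.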

3. Methodology: Testing Distribution Equality

The authors propose a novel nonparametric framework to test the null hypothesis ( $H_0$ ) that the TFP distributions in AMD and BMD areas differ only by location ( $\mu$ ) and scale ( $\sigma$ ), with no need for a truncation parameter ( $\xi$ ).

The Null Hypothesis:
$H_0: F_{s,AMD}\left(\frac{\theta - \mu_{s,AMD}}{\sigma_{s,AMD}}\right) = F_{s,BMD}\left(\frac{\theta - \mu_{s,BMD}}{\sigma_{s,BMD}}\right) \quad \forall \theta \in \mathbb{R}$
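
To make the hypothesis concrete, the sketch below standardizes two samples by their own mean and standard deviation and computes the largest gap between their empirical CDFs. It is a naive illustration that treats the inputs as directly observed, so it omits the noise corrections the authors apply; the function names and simulated data are hypothetical.

```python
import numpy as np

def standardize(x):
    """Remove location and scale using the sample mean and std."""
    return (x - x.mean()) / x.std(ddof=1)

def ks_statistic(x, y):
    """Largest absolute gap between the empirical CDFs of x and y."""
    grid = np.sort(np.concatenate([x, y]))
    fx = np.searchsorted(np.sort(x), grid, side="right") / x.size
    fy = np.searchsorted(np.sort(y), grid, side="right") / y.size
    return np.abs(fx - fy).max()

rng = np.random.default_rng(0)
x = rng.normal(size=5000)

# Case 1: y differs from x only by location and scale, as under H0.
y = 2.0 + 3.0 * x
ks_same = ks_statistic(standardize(x), standardize(y))

# Case 2: left-tail truncation changes the shape, violating H0.
truncated = x[x > 0.0]
ks_trunc = ks_statistic(standardize(x), standardize(truncated))
```

In the first case the standardized samples coincide up to rounding, so the statistic is essentially zero. In the second case standardization cannot undo the truncation, and the statistic is clearly larger.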

Technical Innovations in Testing:
Standard Kolmogorov-Smirnov (KS) tests fail here because the $\hat{\theta}_i$ are noisy estimates rather than direct observations of $\theta_i$. The authors employ a tailored two-sample KS test with two specific corrections:

Half-Panel Jackknife (HPJ) Debiasing: Following Dhaene and Jochmans (2015) and Jochmans and Weidner (2024), this corrects the bias introduced by the noise in the TFP estimates.
Firm-Level Bootstrap: Used to account for all sources of sampling variability, including the uncertainty in estimating the location and scale parameters.

This approach allows for a valid goodness-of-fit test even when the underlying variable (TFP) is measured with error.
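
The two corrections can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the common split-panel form of the jackknife (twice the full-panel estimate minus the average of the two half-panel estimates), assumes a balanced panel with an even number of periods, and takes the residuals as given. The function names and simulated data are hypothetical, and the standardization and test statistic are left out.

```python
import math
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    return np.searchsorted(np.sort(sample), grid, side="right") / sample.size

def hpj_cdf(resid, grid):
    """Half-panel jackknife CDF of the firm-level time averages.

    resid: array of shape (n_firms, n_periods), n_periods assumed even.
    """
    half = resid.shape[1] // 2
    full = ecdf(resid.mean(axis=1), grid)
    first = ecdf(resid[:, :half].mean(axis=1), grid)
    second = ecdf(resid[:, half:].mean(axis=1), grid)
    return 2.0 * full - 0.5 * (first + second)

def resample_firms(resid, rng):
    """One firm-level bootstrap draw: resample whole rows with replacement."""
    idx = rng.integers(0, resid.shape[0], size=resid.shape[0])
    return resid[idx]

# Simulated panel: standard normal theta plus noisy per-period shocks.
rng = np.random.default_rng(1)
n_firms, n_periods = 50_000, 16
theta = rng.normal(size=n_firms)
resid = theta[:, None] + rng.normal(scale=2.0, size=(n_firms, n_periods))

grid = np.array([1.0])
true_cdf = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))  # normal CDF at 1
naive_err = abs(ecdf(resid.mean(axis=1), grid)[0] - true_cdf)
hpj_err = abs(hpj_cdf(resid, grid)[0] - true_cdf)
boot = resample_firms(resid, rng)
```

In this simulation the noise spreads out the naive CDF of the time averages, and the jackknife combination brings the estimate closer to the true CDF of `theta`. Resampling whole firms keeps each firm's time series intact, so each bootstrap draw can repeat the full estimation and debiasing pipeline.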

4. Key Results

The study tests the hypothesis across 10 major sectors (Manufacturing, Construction, Wholesale/Retail, Transport, Hospitality, ICT, Real Estate, Professional Services, Admin Services, Arts/Entertainment).

Statistical Findings:
- For all sectors, the authors fail to reject the null hypothesis.
- The bootstrap $p$ -values are uniformly high (ranging from 0.499 to 0.974), indicating no statistically significant difference in the shape of the distributions after standardizing for mean and variance.
- Visual inspection of the debiased Cumulative Distribution Functions (CDFs) is consistent with this: the curves for AMD and BMD areas are nearly identical once location and scale are aligned.

Implication for Parameters:
- The distributions can be aligned using only location ( $\mu$ ) and scale ( $\sigma$ ).
- The left-tail truncation parameter ( $\xi$ ) is not required. Strong selection effects are therefore not needed to explain urban productivity gaps.

5. Key Contributions

Empirical Validation: Provides the first direct, nonparametric test of the CDGPR12 distributional equality assumption using noisy TFP estimates. It finds no evidence against the assumption in any of the Spanish sectors examined.
Methodological Advancement: Develops a robust framework for testing distributional equality with noisy estimates (combining HPJ debiasing and bootstrapping). This methodology is applicable to other fields involving unobserved heterogeneity with measurement error (e.g., worker skill distributions, mutual fund manager skills).
Refinement of Mechanism: Indicates that the productivity advantage of cities is driven almost entirely by agglomeration economies (shifts in location and scale) rather than selection (tail truncation).

6. Significance and Policy Implications

Policy Focus: Since selection (culling weak firms) does not appear to be the primary driver of the productivity gap, policies aimed at increasing competition or reducing entry barriers may be less effective than previously thought for closing these gaps.
Recommended Interventions: Policymakers should prioritize agglomeration-enhancing policies, such as:
- Improving local infrastructure.
- Fostering labor market thickness.
- Encouraging knowledge spillovers.
Theoretical Confidence: The results validate the use of CDGPR12-style decompositions in future research and policy analysis, ensuring that interpretations of sorting patterns and place-based interventions are not confounded by invalid distributional assumptions.

In conclusion, the paper establishes that the "productivity advantage of cities" is a matter of scale and location shifts in the productivity distribution, not a matter of selection filtering out the bottom tail. This shifts the policy paradigm from competition-focused interventions to agglomeration-focused investments.