Random Utility with Aggregation

Imagine you are a detective trying to figure out what people really want to eat for breakfast. You have a dataset showing that people buy "Cereal," "Toast," or "The Outside Option."

In the world of economics, "The Outside Option" is a catch-all bucket. It represents everything else people could eat if they didn't buy cereal or toast: pancakes, omelets, a bagel, or even just skipping breakfast entirely.

This paper, written by Yuexin Liao, Kota Saito, and Alec Sandroni, tackles a tricky problem: What happens when that "Outside Option" bucket is a mystery?

The Core Problem: The "Black Box" Bucket

Usually, economists use a tool called an Aggregated Random Utility Model (ARUM). Think of this as a simple map. It assumes that "The Outside Option" is just one single, boring item on the menu, like a generic "Other."

But in reality, "The Outside Option" is a Black Box.

For one person, it might be a delicious omelet.
For another, it might be a sad, dry piece of toast.
For a third, it might be nothing at all because they are fasting.

The economist (the detective) doesn't know what's inside the box for any specific person. They only see the final choice: "Cereal" or "Outside Option."

The authors ask: If we treat this mystery box as a single, simple item (like the standard ARUM does), will we get the wrong answer?

The Big Discovery: The Map is Wrong

The paper says: Yes, you will get the wrong answer, and the error can be huge.

Here is the analogy:
Imagine you are trying to guess the average height of a group of people.

The Standard Method (ARUM): You assume everyone in the "Outside Option" group is exactly 5 feet tall. You measure the group and calculate an average.
The Real World (RU with Aggregation): The "Outside Option" group actually contains a mix of 3-foot children and 7-foot basketball players.

If you use the standard method, your math breaks. You might conclude that "Cereal" is more popular than it really is, or that "Toast" is less popular. You might even get the ranking backwards, thinking people prefer Toast over Cereal when they actually prefer Cereal.

The Three Rules of the Game

The authors break down exactly how this "mystery box" changes the rules of the game in three ways:

The "Limited" Rule: In the real world, adding a new item to the menu (like a fancy new cereal) doesn't always make the "Outside Option" less popular. Sometimes, seeing a fancy cereal tells you, "Oh, this is a rich neighborhood!" Suddenly, the "Outside Option" might actually become more attractive because it now includes fancy things like smoked salmon. The standard model can't handle this; it thinks adding options always makes the "outside" less likely.
The "Default" Behavior: Sometimes, when people are faced with a confusing menu, they just grab the "Outside Option" (the default) without thinking. The standard model assumes people always think hard about every option. The new model admits that people sometimes just default to the bucket.
The Shape of the Solution: Mathematically, the set of possible answers the new model allows is a giant, complex shape (a polytope). The standard model is just a tiny, simple triangle inside that giant shape. The standard model is missing almost all the possible realities.

When Can We Trust the Simple Model?

The authors don't just say "the simple model is bad." They say, "It's bad unless two specific conditions are met."

Think of these as the Safety Checks:

The "Neighbors" Check (Non-overlapping Preferences):
Imagine the "Outside Option" bucket contains Pancakes and Omelets.
- Bad: If some people rank Cereal between the two (they love Omelets, think Cereal is okay, and hate Pancakes), the bucket's contents "overlap" with the rest of the menu. The bucket is messy.
- Good: If Omelets and Pancakes always sit right next to each other in everyone's ranking, with nothing from outside the bucket wedged between them, the bucket is safe to treat as a single item.
- Analogy: It's safe to group "Red Cars" and "Blue Cars" together if nobody ranks a Green Car between them. It's dangerous to group them if some people would take Red over Green but Green over Blue.
The "Stable Menu" Check (Menu Independence):
Do the contents of the bucket change depending on what else is on the menu?
- Bad: If the "Outside Option" changes from "Omelets" to "Pancakes" just because you added a new brand of cereal to the store, the bucket is unstable.
- Good: If the "Outside Option" is always the same mix of items, regardless of what else is for sale, the bucket is stable.
- Analogy: If you go to a coffee shop, the "Other Drinks" bucket should always contain the same teas and juices. If the bucket suddenly changes to "Soup" just because you ordered a latte, the model breaks.

The Simulation: How Bad is the Error?

The authors ran computer simulations to see how bad the error gets when they ignore these rules.

The Result: The errors were massive.
The Twist: In some cases, the error was so big that it flipped the ranking. The model would tell you that people prefer "Toast" over "Cereal," even though their true preferences favored "Cereal."

The Takeaway for Real Life

If you are an economist, a marketer, or a policy maker:

Don't just lump everything into "Other." If you are grouping things together (like "All Beef Products" or "All Breakfast Options"), make sure the things inside the group are very similar to each other.
Check your context. Make sure the things inside that group don't change just because the menu changed.
If you can't do that, be careful. Your standard calculations might be leading you to the wrong conclusion, potentially costing you money or leading to bad policies.

In short: The world is messy, and "The Outside Option" is a mystery box. If you pretend that mystery box is a simple, single item, you might end up solving the wrong puzzle entirely.

Here is a detailed technical summary of the paper "Random Utility with Aggregation" by Yuexin Liao, Kota Saito, and Alec Sandroni.

1. Problem Statement

In empirical economics (particularly Industrial Organization), researchers frequently aggregate distinct alternatives into single categories to simplify data analysis. A canonical example is the "outside option," which aggregates all unlisted choices (e.g., all breakfast foods other than specific cereal brands) into a single alternative.

The standard empirical approach assumes an Aggregated Random Utility Model (ARUM), where preferences are defined directly over these aggregates as if they were atomic goods. However, this simplification ignores a critical reality:

Heterogeneity: The underlying alternatives composing an aggregate (e.g., omelettes vs. pancakes within the "outside option") vary significantly in quality and price.
Unobservability: The exact composition of an aggregate often varies across consumers and markets (e.g., due to local availability or supply shocks) and is unobserved by the analyst.
Data Generation: The true data-generating process is a Random Utility Model (RUM) over the underlying alternatives, not the aggregates.

The Core Question: Under what conditions do choice frequencies generated by a RUM over heterogeneous, unobserved underlying alternatives remain consistent with an ARUM? If these conditions fail, what are the testable implications of the true model, and how severe is the estimation bias when an ARUM is incorrectly imposed?

2. Methodology and Framework

2.1 Setup

Aggregates ( $A$ ): A set of choices partitioned into Atomic Aggregates ( $AA$ , single underlying goods) and Non-Atomic Aggregates ( $AN$ , e.g., the outside option, containing multiple underlying goods).
Underlying Alternatives ( $X$ ): The set of actual goods available to consumers.
Aggregation Correspondence ( $\mathcal{X}$ ): A correspondence mapping each aggregate $a \in A$ to a non-empty subset of underlying alternatives $\mathcal{X}(a) \subseteq X$ .
Composition Distribution ( $\lambda$ ): A probability distribution over the possible subsets of underlying alternatives that constitute an aggregate within a specific choice set (menu). Crucially, $\lambda$ can be menu-dependent (the composition of the outside option changes based on which other goods are available).
Preference Distribution ( $\mu_X$ ): A distribution over linear orderings (rankings) of the underlying alternatives $X$ .

2.2 Definitions of Rationality

The paper defines three levels of rationality to compare the true process with the empirical tool:

ARU-Rationality: The observed choice frequencies $\rho$ are generated by a RUM directly over the aggregates $A$ .
RU-Rationality: The observed choice frequencies $\rho$ are generated by a RUM over the underlying alternatives $X$ , integrated over the unknown composition distribution $\lambda$ .
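- Equation (schematic), for a menu $D \subseteq A$ and an aggregate $a \in D$ : $\rho(D, a) = \sum_{S} \lambda_D(S) \cdot \mu_X(\text{choice of } a \text{ given composition } S)$ , where the sum runs over the possible compositions $S$ of the non-atomic aggregates on menu $D$ .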
Partial RU-Rationality: The choice frequencies satisfy standard RUM restrictions only on menus containing atomic aggregates (no non-atomic aggregates present).
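
The RU-rationality formula above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' code: it assumes a single non-atomic aggregate labeled "outside", represents rankings as tuples (best first), and uses hypothetical toy data. The function names `best_available` and `aggregate_shares` are invented for this illustration.

```python
def best_available(ranking, available):
    """Return the top-ranked alternative in `ranking` (best first) that is available."""
    return next(x for x in ranking if x in available)

def aggregate_shares(menu, atomic, mu, lam):
    """Choice frequencies over the aggregates in `menu`.

    menu   : set of aggregate labels; "outside" is the one non-atomic aggregate
    atomic : dict mapping each atomic aggregate to its single underlying good
    mu     : dict mapping a ranking of underlying goods (tuple) to its probability
    lam    : dict mapping a composition of "outside" (frozenset) to its
             probability on this menu
    """
    label_of = {good: agg for agg, good in atomic.items()}
    shares = {a: 0.0 for a in menu}
    for composition, p_comp in lam.items():
        # Underlying goods actually available under this composition.
        available = {atomic[a] for a in menu if a in atomic}
        if "outside" in menu:
            available |= composition
        for ranking, p_rank in mu.items():
            pick = best_available(ranking, available)
            # Any pick that is not an atomic good is recorded as "outside".
            shares[label_of.get(pick, "outside")] += p_comp * p_rank
    return shares

# Hypothetical toy data: "c" = cereal, "o" = omelet, "p" = pancake.
atomic = {"cereal": "c"}
mu = {("o", "c", "p"): 0.5, ("c", "p", "o"): 0.5}
lam = {frozenset({"o"}): 0.5, frozenset({"p"}): 0.5}

shares = aggregate_shares({"cereal", "outside"}, atomic, mu, lam)
print(shares)  # cereal: 0.75, outside: 0.25
```

In this toy case the analyst sees only the two shares; the omelet-versus-pancake composition that produced them stays hidden.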

3. Key Contributions and Theoretical Results

The authors characterize the testable implications of RU-rationality across three frameworks, demonstrating that it is substantially weaker than ARU-rationality.

3.1 Characterization via Axioms (Theorem 3.1)

Assuming the set of underlying alternatives is sufficiently rich, a stochastic choice function is RU-rational if and only if it satisfies:

Limited Monotonicity: Adding an atomic alternative to a menu cannot increase the choice share of another atomic alternative. However, the share of a non-atomic aggregate (like the outside option) is not restricted in the same way as in standard RUM: it may rise when the menu expands.
- Key Insight: Unlike ARUM, RU-rationality does not require monotonicity for non-atomic aggregates. The presence of a new good might signal a change in the composition of the outside option (e.g., a premium cereal signals a high-income market where the outside option includes more attractive goods like smoked salmon), potentially increasing the outside option's share.
Partial RU-Rationality: Choices on menus with only atomic alternatives must satisfy standard RUM restrictions (e.g., Block-Marschak polynomials).
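
As a rough illustration of how Limited Monotonicity could be checked on observed shares, here is a minimal sketch. It encodes only the statement given in this summary (adding an atomic alternative must not raise the share of another atomic alternative); the exact form of the axiom in the paper may differ. The function name and the toy data are hypothetical.

```python
def limited_monotonicity_violations(rho, atomic, tol=1e-12):
    """List (menu, added, alternative) triples where adding the atomic
    alternative `added` raises the share of another atomic alternative."""
    violations = []
    for menu, shares in rho.items():
        for added in atomic - menu:
            larger = menu | {added}
            if larger not in rho:
                continue  # this expanded menu is not observed
            for a in menu & atomic:
                if rho[larger][a] > shares[a] + tol:
                    violations.append((menu, added, a))
    return violations

ATOMIC = frozenset({"cereal", "toast"})
small = frozenset({"cereal", "outside"})
large = frozenset({"cereal", "toast", "outside"})

# The outside option's share rises from 0.4 to 0.5: allowed under this check.
rho_ok = {small: {"cereal": 0.6, "outside": 0.4},
          large: {"cereal": 0.4, "toast": 0.1, "outside": 0.5}}

# Cereal's share rises from 0.6 to 0.7 after toast is added: a violation.
rho_bad = {small: {"cereal": 0.6, "outside": 0.4},
           large: {"cereal": 0.7, "toast": 0.1, "outside": 0.2}}

print(limited_monotonicity_violations(rho_ok, ATOMIC))   # []
print(limited_monotonicity_violations(rho_bad, ATOMIC))  # one violation
```

Note that `rho_ok` would fail the full monotonicity requirement of a standard RUM over aggregates, because the outside option gains share when the menu grows.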

3.2 Vertex Characterization (Theorem 3.2)

The set of RU-rational choice functions forms a polytope (the RU Polytope).

Vertices: The vertices are "menu-effect" deterministic choice functions. An agent maximizes a linear order on a specific subset of menus $E$ but defaults to the outside option on all other menus ( $E^c$ ).
Comparison: The ARU polytope is a strict sub-polytope of the RU polytope.
Complexity: The number of vertices in the RU polytope is double-exponentially larger than in the ARU polytope. This implies the set of behaviors consistent with RU-rationality is vastly larger, making ARUM a much more restrictive (and potentially incorrect) assumption.
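
The vertex description can be made concrete with a small sketch. This is an illustrative reading of the sentence above, not the paper's formal definition: it assumes the linear order is over aggregate labels, that the outside option (labeled "outside" here) belongs to every menu considered, and that the set $E$ is given as a collection of menus. The function name `menu_effect_choice` is hypothetical.

```python
def menu_effect_choice(menu, order, maximizing_menus, default="outside"):
    """Deterministic choice with a menu effect.

    On menus in `maximizing_menus` (the set E), pick the best item according
    to `order` (best first); on every other menu, pick the default.
    """
    menu = frozenset(menu)
    if menu in maximizing_menus:
        return next(a for a in order if a in menu)
    return default

order = ("toast", "cereal", "outside")
E = {frozenset({"cereal", "toast", "outside"})}

print(menu_effect_choice({"cereal", "toast", "outside"}, order, E))  # toast
print(menu_effect_choice({"cereal", "outside"}, order, E))           # outside
```

A standard deterministic maximizer is the special case where $E$ contains every menu, which is one way to see why the ARU vertices are only a subset of the RU vertices.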

3.3 Finite Underlying Sets (Theorems 3.3 & 3.4)

When the number of underlying alternatives in the non-atomic aggregate ( $n$ ) is finite:

The set of RU-rationalizable functions $RU(n)$ expands as $n$ increases.
Stabilization: Once $n \ge |AA| + 1$ , the set stabilizes and is fully characterized by Limited Monotonicity and Partial RU-rationality.
Approximation: Even below this threshold, if $n$ is moderately large, the set of RU-rational functions is dense enough that any function satisfying the two axioms is arbitrarily close to a valid RU-rational function.

3.4 Conditions for Equivalence (Section 4)

The paper identifies two independent conditions that are necessary and sufficient for RU-rationality to imply ARU-rationality (justifying the use of ARUM):

Non-Overlapping Preferences (Proposition 4.1): For every aggregate, the underlying alternatives it represents must occupy adjacent positions in every consumer's preference ranking. (e.g., if the outside option contains A and B, no consumer should prefer an atomic good C over A but prefer B over C).
Menu-Independent Composition (Proposition 4.2): The distribution $\lambda$ of the aggregate's composition must be invariant to the menu. (e.g., the mix of goods in the outside option does not change based on which cereals are on the shelf).
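
Both conditions are simple to check when the rankings and composition distributions are known, which is rarely the case in practice. The sketch below is only meant to make the definitions concrete; the function names and data layout (rankings as tuples, best first; one composition distribution per menu) are assumptions made for this illustration.

```python
def is_adjacent(ranking, group):
    """True if the members of `group` occupy consecutive positions in `ranking`."""
    if not group:
        return True
    positions = [i for i, x in enumerate(ranking) if x in group]
    return (len(positions) == len(group)
            and positions[-1] - positions[0] + 1 == len(group))

def non_overlapping(rankings, group):
    """True if `group` is adjacent in every ranking in the support."""
    return all(is_adjacent(r, group) for r in rankings)

def menu_independent(lam_by_menu, tol=1e-12):
    """True if every menu has the same composition distribution."""
    lams = list(lam_by_menu.values())
    first = lams[0]
    return all(
        set(lam) == set(first)
        and all(abs(lam[s] - first[s]) <= tol for s in first)
        for lam in lams
    )

outside = {"A", "B"}  # underlying goods in the outside option; "C" is atomic
print(is_adjacent(("C", "A", "B"), outside))  # True: A and B are neighbors
print(is_adjacent(("B", "C", "A"), outside))  # False: C sits between B and A

lam_by_menu = {
    "menu_1": {frozenset({"A"}): 0.5, frozenset({"B"}): 0.5},
    "menu_2": {frozenset({"A"}): 0.9, frozenset({"B"}): 0.1},
}
print(menu_independent(lam_by_menu))  # False: the mix shifts with the menu
```

The ranking `("B", "C", "A")` is the pattern described in Proposition 4.1's example: the consumer prefers C over A but B over C.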

4. Simulation Results (Section 5)

The authors simulate data using a true Logit model over underlying alternatives with menu-dependent composition and overlapping preferences. They then estimate an ARUM (Logit over aggregates).

Bias Magnitude: When the two equivalence conditions are violated, estimation bias is substantial.
Preference Reversal: In many cases, the bias is large enough to reverse the inferred preference ordering. For example, if the true utility $u(x) > u(y)$ , the ARUM estimate may yield $\hat{u}(y) > \hat{u}(x)$ .
Distance Metric: The Euclidean distance between the observed data and the ARU polytope increases significantly as the composition distribution becomes more menu-dependent or preferences become more overlapping.
Implication: Standard empirical practices that ignore composition heterogeneity can lead to qualitatively wrong conclusions about consumer preferences and market dynamics.
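
A toy calculation shows how a preference reversal of this kind can arise. It is not the authors' simulation design: the utilities, the two binary menus, and the menu-dependent compositions are hypothetical, and the "ARUM estimate" is simply the log share ratio against the outside option, with the outside option's utility normalized to zero.

```python
import math

# Hypothetical true logit utilities over underlying alternatives.
U = {"cereal": 1.0, "toast": 0.5, "omelet": 2.0, "pancake": 0.0}

def logit_shares(atomic_items, outside_composition):
    """Exact logit shares when the outside option pools `outside_composition`."""
    weights = {a: math.exp(U[a]) for a in atomic_items}
    weights["outside"] = sum(math.exp(U[x]) for x in outside_composition)
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Menu-dependent composition: the outside option is an omelet when cereal is
# offered, and a pancake when toast is offered.
composition = {"cereal": {"omelet"}, "toast": {"pancake"}}

def implied_utility(item):
    """Utility implied by a logit over aggregates on the menu {item, outside}."""
    s = logit_shares([item], composition[item])
    return math.log(s[item] / s["outside"])

u_hat = {item: implied_utility(item) for item in ("cereal", "toast")}
print(u_hat)  # approximately cereal: -1.0, toast: 0.5
```

Here the true utilities rank cereal above toast, but the aggregate-level logit ranks toast above cereal, because each item was compared against a different hidden outside good.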

5. Significance and Practical Implications

Theoretical: This is the first paper to formally characterize the testable implications of RUM when aggregates have heterogeneous and unknown compositions. It establishes that the "standard" ARUM is a very strong assumption that is rarely satisfied in the presence of realistic outside options.
Empirical Guidance: The paper provides concrete rules for constructing aggregated datasets to minimize bias:
- Grouping: Only group alternatives that are close substitutes (non-overlapping preferences).
- Stability: Ensure the composition of the aggregate (especially the outside option) does not vary systematically with the menu (e.g., avoid aggregating goods whose availability depends on the specific market context).
Policy/Forecasting: Ignoring these aggregation effects can lead to severe errors in predicting market shares, welfare analysis, and the impact of policy interventions (e.g., introducing a new product).

In summary, the paper argues that while ARUM is a convenient tool, it is theoretically fragile. Researchers should either verify the non-overlapping and menu-independence conditions or rely on the weaker, more robust implications of RU-rationality (Limited Monotonicity and Partial RU-Rationality) to avoid significant estimation bias.