Demand Estimation with Text and Image Data

This paper proposes a demand estimation method that utilizes deep learning embeddings of product images and text to infer substitution patterns, demonstrating superior counterfactual prediction accuracy compared to traditional attribute-based models, particularly in scenarios involving unobserved or hard-to-quantify product attributes.

Giovanni Compiani, Ilya Morozov, Stephan Seiler

Published 2026-02-19

Imagine you are trying to figure out why people buy the things they do. In the world of economics, this is called demand estimation. Traditionally, economists have tried to solve this puzzle by looking at a product's "ID card": its specific, measurable features like price, weight, color, or screen size. They assume that when two products have similar ID cards, people will readily swap one for the other if their first pick goes out of stock.

But here's the problem: Products are more than just their ID cards.

Think about buying a book. You might choose a thriller not just because it's 300 pages long (a measurable fact), but because the cover art looks mysterious, or because the reviews say the plot twists are "mind-blowing." These are unstructured data—things like images, text descriptions, and customer reviews. They are messy, hard to count, and usually ignored by standard economic models because they are too difficult to quantify.

This paper introduces a new tool called DeepLogit that acts like a "super-spy" for economists. It uses Artificial Intelligence (AI) to read the messy stuff and turn it into a clear map of what people actually want.

Here is how it works, broken down into simple steps:

1. The "Magic Translator" (Embeddings)

Imagine you have a giant library of books, but you don't know what any of them are about. You can't read them all.

  • The Old Way: You ask a librarian to write down a few facts for each book (e.g., "Genre: Mystery," "Pages: 300").
  • The New Way (This Paper): You use a super-smart AI robot (a pre-trained deep learning model). You show the robot the book cover and the back-cover blurb. The robot doesn't just read the words; it "feels" the vibe. It turns the image and text into a secret code (a long list of numbers called an embedding) that captures the essence of the book.
    • Analogy: It's like the robot gives every book a unique "scent profile." Even if two books have different titles, if they smell like "spooky forest," the robot knows they are similar.
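
To make the "scent profile" idea concrete, here is a minimal sketch of how similarity between embeddings can be measured. Everything in it is an assumption for illustration: the book names and the tiny 4-number vectors are made up, and cosine similarity is one common way to compare embeddings, not necessarily the measure the authors use. Real embeddings would come from a pre-trained image or text model and would be far longer.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms          # scale every row to length 1
    return unit @ unit.T               # dot products of unit vectors

# Hypothetical toy embeddings (4 numbers each) for three made-up books.
books = ["spooky_forest_a", "spooky_forest_b", "beach_romance"]
embeddings = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.8, 0.2, 0.1, 0.3],
    [0.0, 0.9, 0.8, 0.1],
])

sim = cosine_similarity_matrix(embeddings)

# For each book, find the most similar *other* book.
masked = sim.copy()
np.fill_diagonal(masked, -np.inf)      # ignore self-similarity
nearest = [books[j] for j in masked.argmax(axis=1)]

for name, match in zip(books, nearest):
    print(f"{name} is closest to {match}")
```

In this toy example the two "spooky forest" books end up as each other's nearest neighbor, which is the kind of closeness the analogy describes.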

2. The "Compressor" (PCA)

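The robot's secret code is huge and complicated (often hundreds or thousands of numbers per product). You can't put all that into a simple math equation, because there would be far more numbers to weigh than the data could pin down.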

  • The Solution: The authors use a technique called Principal Component Analysis (PCA). Think of this as a high-tech sieve that keeps the directions along which products differ the most and lets the rest fall through.
  • It condenses the thousands of numbers into just a few key dimensions. These dimensions don't come with labels, but you can loosely picture them as contrasts like "Scary vs. Funny" or "Action vs. Romance," which the demand model can then check against what buyers actually choose.
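
The compression step can be sketched in a few lines. This is a generic PCA written with NumPy's singular value decomposition, not the authors' code. The data is simulated: 50 hypothetical products with 512-number embeddings built from 3 hidden factors plus noise, so the sizes and the function name `pca_reduce` are assumptions.

```python
import numpy as np

def pca_reduce(embeddings: np.ndarray, n_components: int):
    """Project embeddings onto their top principal components.

    Returns the component scores (one row per product) and the share
    of total variance explained by each kept component.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Simulated data: 3 hidden "flavors" drive 512-dimensional embeddings.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
loadings = rng.normal(size=(3, 512))
embeddings = latent @ loadings + 0.1 * rng.normal(size=(50, 512))

scores, explained = pca_reduce(embeddings, n_components=3)
print(scores.shape)                    # each product is now 3 numbers
print(explained.round(3))              # variance share per component
```

Because the simulated embeddings really are driven by three factors, three components recover almost all of the variation. With real embeddings the number of components to keep is a modeling choice.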

3. The "Crystal Ball" (Prediction)

Now, the economists plug these "flavors" into a standard demand model, treating them like any other product feature. They test it with a clever experiment:

  • They ask people to pick their favorite book from a list.
  • Then, they pretend that favorite book is sold out and ask, "What would you buy instead?"
  • The Result: The AI-powered model predicted the "second choice" much better than the old models that only looked at facts like page count. It knew that if you loved a specific fantasy book, you'd likely switch to another fantasy book with a similar "vibe," even if the page counts were different.
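
The second-choice idea can be sketched with a small simulation. The summary does not spell out the exact model, so this assumes a random-coefficients (mixed) logit in which each simulated shopper has their own taste for a single compressed "flavor" dimension. The four products, their flavor values, and the taste spread `sigma` are all hypothetical.

```python
import numpy as np

def second_choice_probs(x, delta, sigma, n_draws=5000, seed=0):
    """Simulated P(second choice = k | first choice = j) in a mixed logit.

    x:     (J, K) product characteristics (e.g. principal components)
    delta: (J,)   baseline appeal of each product
    sigma: spread of shoppers' tastes for the characteristics
    """
    rng = np.random.default_rng(seed)
    n_products, n_chars = x.shape
    beta = rng.normal(size=(n_draws, n_chars)) * sigma   # one taste per shopper
    util = delta + beta @ x.T                            # (n_draws, J)
    util -= util.max(axis=1, keepdims=True)              # numerical stability
    p = np.exp(util)
    p /= p.sum(axis=1, keepdims=True)                    # first-choice probs
    # With j removed, a shopper picks k with probability p_k / (1 - p_j).
    weights = p / (1.0 - p)
    joint = weights.T @ p / n_draws                      # (J, J)
    np.fill_diagonal(joint, 0.0)
    return joint / joint.sum(axis=1, keepdims=True)

# Products 0 and 1 share a "vibe"; so do products 2 and 3.
x = np.array([[1.0], [0.9], [-1.0], [-0.9]])
delta = np.zeros(4)

plain = second_choice_probs(x, delta, sigma=0.0)   # everyone has the same taste
mixed = second_choice_probs(x, delta, sigma=3.0)   # tastes differ across shoppers

print(plain[0].round(3))   # fallback spread evenly over the other products
print(mixed[0].round(3))   # fallback concentrated on the similar product
```

When tastes are identical, a shopper who loses product 0 is equally likely to fall back on any other product. Once tastes vary along the flavor dimension, the fallback concentrates on product 1, the one with the similar vibe.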

Why Does This Matter?

This isn't just about books. The authors tested this on 40 different categories on Amazon, from pet food to video games to clothing.

  • The Surprise: Sometimes, text is more important than pictures. For example, when buying clothes, you might think the photo is everything. But the authors found that for some items, the reviews and descriptions revealed more about which products people switch between than the photos did.
  • The Benefit: Before this, if a company wanted to know how a price hike on one cereal would affect sales of another, they had to guess the "substitution" based on limited data. Now, they can use the AI to read the product descriptions and reviews to estimate how similar the products feel to consumers.
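
The price-hike question can be illustrated the same way. This is a sketch under stated assumptions, not the authors' procedure: a mixed logit with one "flavor" dimension, four hypothetical products with no outside option, and made-up values for the price sensitivity `alpha` and the taste spread `sigma`.

```python
import numpy as np

def market_shares(prices, x, alpha, sigma, draws):
    """Average choice probabilities across simulated shoppers."""
    beta = draws * sigma                          # shopper-specific tastes
    util = -alpha * prices + beta @ x.T           # (n_draws, J)
    util -= util.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(util)
    p /= p.sum(axis=1, keepdims=True)
    return p.mean(axis=0)

rng = np.random.default_rng(1)
draws = rng.normal(size=(5000, 1))                # reuse the same shoppers
x = np.array([[1.0], [0.9], [-1.0], [-0.9]])      # products 0 and 1 are alike
prices = np.array([4.0, 4.0, 4.0, 4.0])

base = market_shares(prices, x, alpha=0.5, sigma=3.0, draws=draws)

hiked = prices.copy()
hiked[0] *= 1.10                                  # 10% price hike on product 0
new = market_shares(hiked, x, alpha=0.5, sigma=3.0, draws=draws)

gain = new - base
diversion = gain[1:] / -gain[0]                   # where lost sales go
print(diversion.round(3))
```

Here most of the sales that product 0 loses flow to product 1, its close substitute. With real data, the characteristics `x` would be the compressed embeddings and the parameters would be estimated rather than chosen by hand.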

The Bottom Line

Think of this paper as giving economists a pair of X-ray glasses.

  • Old Glasses: You could only see the surface features (price, size, color).
  • New Glasses (DeepLogit): You can see the hidden "soul" of the product—the design, the story, and the feeling—that actually drives people to swap one item for another.

This allows businesses and policymakers to make better decisions about pricing, mergers, and new products because they have a clearer picture of why people choose what they choose.
