A Neural Topic Method Using a Large-Language-Model-in-the-Loop for Business Research

The paper introduces LX Topic, a novel neural topic modeling method that integrates large language model refinement with FASTopic to produce standardized, interpretable, and high-quality document-level topic proportions, thereby establishing a robust and reproducible measurement instrument for business research.

Stephan Ludwig, Peter J. Danaher, Xiaohao Yang

Published 2026-03-05

Imagine you are a detective trying to solve a mystery, but instead of a few clues, you have a mountain of 100,000 handwritten letters from customers. Some are angry, some are happy, some are about the food, and some are about the service. If you tried to read them all one by one, you'd go crazy. You need a way to sort these letters into neat piles so you can see the big picture.

This is exactly what LX Topic does, but for business researchers. It's a smart digital assistant that reads thousands of reviews, social media posts, or survey answers and organizes them into clear, understandable themes.

Here is how it works, broken down into simple concepts:

1. The Problem: The "Noisy Room"

Imagine a crowded room where everyone is shouting at once. You can hear words like "pizza," "waiter," "loud," "delicious," and "rude," but it's just a chaotic mess.

  • Old methods (traditional topic models) tried to sort this noise just by looking at which words tend to appear together. They often ended up putting "pizza" and "rude" in the same pile simply because the two words happened to show up in the same sentence, even though they don't really belong together. The results were confusing and hard to use for real decisions.
  • Newer methods (using big AI models) are like hiring a super-smart translator who can summarize the room, but they are expensive, slow, and sometimes make up stories that aren't actually in the data.

2. The Solution: The "Smart Librarian" (LX Topic)

LX Topic is like a super-smart librarian who has two special skills working together:

  1. The Statistician: First, it uses a powerful math engine (a "Neural Topic Model"; the paper builds on one called FASTopic) to quickly scan the whole library and group similar books together based on their actual content. It creates the "piles" based on hard data, not guesses.
  2. The Editor: Then, it brings in a Large Language Model (LLM). Think of this as a brilliant editor. The editor doesn't rewrite the books; instead, the editor looks at the label on each pile and the list of words inside it. If the pile says "Food, Angry, Waiter," the editor might refine the label to "Poor Service & Rude Staff" and swap out confusing words for clearer ones.

The Magic Trick: The editor is very careful. They only polish the labels and the word lists. They do not change the actual piles or how many books are in them. This ensures the math stays accurate while the names become easy for humans to understand.
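For readers who like to see the idea in code, here is a minimal Python sketch of that "polish the labels, leave the piles alone" rule. It is not the LX Topic implementation: the `Topic` class, `refine_topics`, and `toy_editor` are hypothetical names invented for this illustration, the numbers are made up, and the "editor" is a simple lookup standing in for a real LLM call.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Topic:
    """One pile: a human-readable label plus its most typical words."""
    label: str
    top_words: Tuple[str, ...]


def refine_topics(topics: List[Topic], editor: Callable[[Topic], Topic]) -> List[Topic]:
    """Pass each topic's label and word list to the editor.

    The document-topic proportions are never handed to the editor,
    so this step cannot change them.
    """
    return [editor(topic) for topic in topics]


def toy_editor(topic: Topic) -> Topic:
    # Hypothetical stand-in for an LLM call: rewrite a label if we know a clearer one.
    clearer = {"Food, Angry, Waiter": "Poor Service & Rude Staff"}
    return replace(topic, label=clearer.get(topic.label, topic.label))


# Output of the statistical step (toy values): two topics, two documents.
topics = [
    Topic("Food, Angry, Waiter", ("waiter", "rude", "slow")),
    Topic("Pizza, Delicious", ("pizza", "delicious", "crust")),
]
doc_topic = [
    [0.8, 0.2],  # document 1: mostly topic 1
    [0.1, 0.9],  # document 2: mostly topic 2
]

refined = refine_topics(topics, toy_editor)

for topic in refined:
    print(topic.label, "|", ", ".join(topic.top_words))
print(doc_topic)  # same numbers as before the refinement step
```

The design point is that the editor only ever receives labels and word lists, so the measurements stay exactly as the statistical step produced them.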

3. How It Helps Researchers (The "Recipe")

Business researchers use this tool to turn messy text into measurable ingredients for their studies.

  • Before: A researcher might say, "I think people are unhappy."
  • With LX Topic: The researcher can say, "30% of the reviews this month were about 'Slow Delivery', up 10 percentage points from last month."

Because the tool gives a specific percentage for every document (e.g., "This review is 80% about 'Food Quality' and 20% about 'Price'"), researchers can plug these numbers into spreadsheets and run statistical tests, just like they would with sales numbers or survey scores.
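To make that concrete, here is a small Python sketch of one way a researcher might turn per-document topic proportions into a month-over-month comparison. Everything in it is an assumption made for illustration: the column names, the `monthly_share` helper, and the toy numbers are not taken from the paper or from LX Topic's actual output file.

```python
from collections import defaultdict
from typing import Dict, List

# Toy per-document output: each row is one review with its topic proportions.
rows = [
    {"month": "2025-01", "Slow Delivery": 0.1, "Food Quality": 0.9},
    {"month": "2025-01", "Slow Delivery": 0.3, "Food Quality": 0.7},
    {"month": "2025-02", "Slow Delivery": 0.2, "Food Quality": 0.8},
    {"month": "2025-02", "Slow Delivery": 0.4, "Food Quality": 0.6},
]


def monthly_share(rows: List[dict], topic: str) -> Dict[str, float]:
    """Average a topic's proportion over all documents in each month."""
    by_month = defaultdict(list)
    for row in rows:
        by_month[row["month"]].append(row[topic])
    return {month: sum(values) / len(values) for month, values in by_month.items()}


share = monthly_share(rows, "Slow Delivery")

# Month-over-month change, expressed in percentage points.
change_pp = (share["2025-02"] - share["2025-01"]) * 100

for month, value in share.items():
    print(f"{month}: {value:.0%} of review content is about 'Slow Delivery'")
print(f"Change: {change_pp:+.0f} percentage points")
```

Averaging the proportions is only one possible summary. The same per-document numbers could just as well be fed into a regression or any other standard statistical test.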

4. The "One-Click" Magic

The best part is that you don't need to be a computer programmer to use it.

  • The Old Way: You needed to know a programming language such as Python or R to build your own sorting machine.
  • The LX Way: It's a web app that runs in your browser. You drag and drop your Excel file, click a button, and wait a few minutes. The system cleans the data, sorts it, labels it, and gives you back a new file with all the answers ready to use. It even deletes your data after a week to keep your secrets safe.

Summary Analogy

Think of LX Topic as a smart juicer for ideas.

  • The Fruit: Your messy, unstructured text (reviews, tweets, comments).
  • The Machine: The LX Topic algorithm.
  • The Juice: Clean, measurable, and labeled "topic proportions" that you can drink (analyze) to understand what your customers are really thinking.

It takes the chaos of human language, filters out the noise, and hands you a clear, organized report that helps businesses make better decisions.