Large language models can disambiguate opioid slang on social media

This paper demonstrates that large language models significantly outperform traditional lexicon-based strategies at disambiguating ambiguous opioid slang and identifying relevant social media posts. The advantage holds across lexicon-based, lexicon-free, and emergent-slang scenarios, strengthening monitoring of the opioid overdose crisis.

Kristy A. Carpenter, Issah A. Samori, Mathew V. Kiang, Keith Humphreys, Anna Lembke, Johannes C. Eichstaedt, Russ B. Altman

Published Thu, 12 Ma

Imagine you are trying to find a few specific, rare needles hidden inside a massive, chaotic haystack the size of a city. This is the challenge researchers face when trying to monitor the opioid crisis on social media.

Every day, millions of people post on Twitter, Reddit, and other platforms. Most of these posts are about cats, weather, or dinner; only a tiny fraction mention drugs. To find the "needles" (posts about opioids), researchers have traditionally used a keyword list (a lexicon): if a post contains the word "heroin" or "fentanyl," keep it.

But here's the problem: Drug users often use slang to hide from police and algorithms. They might call heroin "smack" or fentanyl "fenty." The trouble is, these words have innocent meanings too. "Smack" can mean hitting a drum; "fenty" can mean a makeup brand; "lean" can mean being skinny.

If you just search for those words, you get thousands of false alarms (haystack noise) and miss the real needles because the slang changes faster than your list can be updated.
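The failure mode of keyword filtering is easy to see in code. Below is a minimal sketch of lexicon-based matching, with a toy keyword list invented for illustration (not the actual lexicon used in the paper): it flags innocent posts that happen to contain a listed word, and misses drug posts whose slang isn't on the list yet.

```python
# Toy lexicon for illustration only -- not the paper's actual keyword list.
LEXICON = {"heroin", "fentanyl", "smack", "fenty"}

def lexicon_match(post: str) -> bool:
    """Keep a post if any lexicon keyword appears as a word in it."""
    words = post.lower().split()
    return any(word.strip(".,!?") in LEXICON for word in words)

posts = [
    "You really have to smack the snare drum on this track",  # false alarm: drumming
    "Loving the new fenty gloss shade",                       # false alarm: makeup
    "Anyone know where to get fetty around here?",            # missed: slang not on the list
]

for post in posts:
    print(lexicon_match(post), "-", post)
```

The first two posts are kept even though they are about drumming and makeup, while the third slips through because "fetty" was never added to the list. That is exactly the noise-plus-blind-spots problem described above.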

The New Solution: The "Super-Reader" AI

This paper introduces a new tool: Large Language Models (LLMs). Think of these AI models not as simple search engines, but as super-readers who have read almost everything on the internet. They don't just look for a specific word; they understand the context and the vibe of a sentence.

The researchers tested four of the smartest AI models available (GPT-4, GPT-5, Gemini, and Claude) to see if they could act as these super-readers. They set up three different "games" to test them:

Game 1: The "Ambiguous Word" Challenge

The Setup: The AI was given a list of posts containing tricky words like "fenty" or "smack."
The Task: Decide whether the person is talking about drugs, or just about makeup or hitting a drum.
The Result: The old keyword lists were terrible at this. They either missed almost everything or flagged thousands of innocent posts. The AI, however, acted like a detective. It looked at the whole sentence and said, "Ah, this 'smack' is about hitting a drum, but this other 'smack' is definitely about heroin." The AI was vastly more accurate than the old lists.
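In practice, this kind of disambiguation is done by framing each post as a question for the model. The paper's exact prompts are not reproduced here; the template below is a hypothetical sketch of how a post containing an ambiguous keyword might be presented to an LLM classifier.

```python
# Hypothetical prompt template -- the wording is illustrative,
# not the actual prompt used in the study.
PROMPT_TEMPLATE = (
    "The following social media post contains the word '{keyword}'. "
    "Based on the surrounding context, answer YES if the post refers "
    "to opioid use and NO otherwise.\n\n"
    "Post: {post}\n"
    "Answer:"
)

def build_prompt(post: str, keyword: str) -> str:
    """Fill the template with one post and the ambiguous keyword it contains."""
    return PROMPT_TEMPLATE.format(keyword=keyword, post=post)

prompt = build_prompt("Gotta smack the snare harder on the chorus", "smack")
print(prompt)
```

The key design point is that the model sees the whole sentence, not just the keyword, so "smack the snare" and "scored some smack" produce different answers even though both trip the same lexicon entry.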

Game 2: The "No Clues" Challenge

The Setup: The AI was given a huge pile of random posts from New York and California. It wasn't told to look for any specific words.
The Task: Find the opioid posts from scratch.
The Result: This is where the AI really shone. The old keyword lists missed about 60-90% of the drug posts because they didn't have the right slang words on their list. The AI, using its "common sense" and understanding of how people talk, found almost all of them. It cast a wider net without getting confused by the noise.

Game 3: The "Fake Slang" Challenge

The Setup: The researchers took real drug posts and replaced the slang words with names of Pokémon (like "Charizard" or "Pikachu").
The Task: Could the AI figure out that even though the word was fake, the context still meant it was about drugs?
The Result: Yes! The AI realized that even if someone said, "I'm feeling the Charizard," in a context that usually describes being high, it was likely about drugs. This proves the AI isn't just memorizing a dictionary; it understands the situation.
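The substitution step itself is simple string replacement. The sketch below shows one way to build such a stress test; the slang-to-Pokémon mapping is invented for illustration, not taken from the paper.

```python
import re

# Invented mapping for illustration -- the paper's actual substitutions
# may differ. Swapping slang for Pokemon names leaves only the context,
# not the vocabulary, as a signal of drug content.
SUBSTITUTIONS = {"fenty": "Charizard", "smack": "Pikachu"}

def mask_slang(post: str) -> str:
    """Replace each slang term (whole word, case-insensitive) with a Pokemon name."""
    for slang, pokemon in SUBSTITUTIONS.items():
        post = re.sub(rf"\b{slang}\b", pokemon, post, flags=re.IGNORECASE)
    return post

original = "I'm feeling the fenty tonight, totally nodding off"
print(mask_slang(original))
```

After masking, a lexicon has nothing left to match on, but the surrounding context ("feeling the ... nodding off") still reads like a description of opioid effects, which is what the model is being tested on.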

Why This Matters

Think of the old method (keyword lists) as a bouncer with a strict guest list. If your name isn't on the list, you don't get in. But drug users just change their names (slang) to sneak in, or the bouncer accidentally lets in thousands of people who look like they're on the list but aren't.

The new method (AI) is like a very observant security guard. They don't need a list. They watch how people act, what they say, and who they are with. They can spot the suspicious behavior even if the person is wearing a disguise.

The Bottom Line

The study shows that these AI models are incredibly good at finding the "needles" in the haystack. They make fewer mistakes than the old keyword lists and catch many more drug-related posts that were previously invisible.

Important Note: The authors are very careful to say this tool is for public health monitoring (like a weather radar for a storm), not for spying on individuals. They want to use this data to help communities prevent overdoses and offer treatment, not to arrest people or censor their posts.

In short: AI is becoming a powerful new microscope for understanding the opioid crisis, helping us see the problem clearly so we can fix it.