Imagine the world of academic research as a massive, bustling library where millions of scholars write papers every year. For decades, the "voice" of these papers was consistent: a specific style of English, with certain words appearing frequently and others rarely.
Then, a new group of invisible scribes arrived: Large Language Models (LLMs) like ChatGPT, Claude, and Gemini. These AI tools are incredibly smart, but they have their own unique "accents" and habits.
This paper, titled "Beyond Via," is like a linguistic detective story. The authors went into the library (specifically, the arXiv preprint server) to answer two big questions:
- How are these AI scribes changing the way scholars write?
- Can we tell which specific AI wrote a sentence, or are they all starting to sound the same?
Here is the breakdown of their findings, explained with some everyday analogies.
1. The "Accent" of the AI Scribes
Just as a person from New York might say "bodega" and someone from London might say "corner shop," different AI models have developed distinct linguistic habits.
- The "Via" and "Beyond" Craze: The authors noticed that newer AI models love using the words "via" and "beyond" in paper titles. It's as if every new chef suddenly started garnishing every dish with the same fancy herb that no one used before.
- The Analogy: Imagine a fashion trend where suddenly, every single person in town starts wearing a specific type of hat. The researchers saw this happening with words in academic titles.
- The Disappearing "The" and "Of": Conversely, the most common words in English, like "the" and "of," are becoming less frequent in abstracts. The AI seems to be trying to sound more "efficient" or "dense," skipping the little connecting words humans use naturally.
- The Analogy: It's like a text message where you drop all the vowels and just send "Wnt2g2st." The AI is doing something similar with academic writing, stripping away the "glue" words.
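The kind of word-frequency tracking behind these observations can be sketched in a few lines. The titles and years below are invented for illustration, not data from the paper:

```python
from collections import Counter

# Toy corpora of paper titles by year (invented examples, not real data)
titles_by_year = {
    2019: ["A Study of Graph Networks",
           "The Analysis of Protein Folding"],
    2025: ["Robust Learning via Contrastive Objectives",
           "Beyond Attention: Scaling via Sparse Mixtures"],
}

def word_rate(titles, word):
    """Fraction of all title words that equal `word` (case-insensitive)."""
    counts = Counter(w.lower() for t in titles for w in t.split())
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

for year, titles in titles_by_year.items():
    print(year, round(word_rate(titles, "via"), 3))
```

Run over millions of real titles instead of four toy ones, the same count-and-divide logic is what surfaces a word like "via" suddenly trending.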
2. The Chameleon Effect (Models Changing Over Time)
The paper highlights that AI isn't static; it evolves. The "voice" of an AI model from 2023 is different from the "voice" of a model in 2025.
- The "Delve" vs. "Together" Shift: In the early days of ChatGPT, the word "delve" (as in "delve into the data") was a huge red flag for AI writing. But newer models have stopped using it so much. Meanwhile, the word "together" was once rare in AI text but has recently spiked in popularity.
- The Analogy: Think of it like pop music. In 2020, everyone was singing a specific type of ballad. By 2024, the trend shifted to upbeat pop. If you hear a song, you can guess the year it was made based on the style. The authors found that AI writing styles shift just as fast as pop music trends.
3. The "Whodunit" Problem (Can We Detect the AI?)
The researchers tried to build a "detector" to figure out which specific AI wrote a text. They set up a game: "Was this paragraph written by GPT-4, DeepSeek, or a human?"
- The Result: The detectors were good at spotting whether AI was used, but terrible at guessing which one.
- The Analogy: Imagine a police lineup where the suspects are all wearing identical grey jumpsuits. You can easily tell they aren't the regular townspeople (humans), but you can't tell one suspect from another. The different AI models are becoming so similar that they are blurring together.
- The Homogenization: The paper suggests that as AI gets better, it's becoming more "human-like," but in a way that makes all the AIs sound like the same generic "super-human." This makes it harder to distinguish between them.
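One way to see why per-model detection fails is to compare word-frequency "fingerprints" directly. In this sketch, the numbers are invented for illustration (not measurements from the paper), but they show the pattern the authors describe: two AI fingerprints sit nearly on top of each other, while both sit apart from the human one.

```python
import math

# Hypothetical relative frequencies of a few marker words
# (all numbers invented for illustration, not taken from the paper)
fingerprints = {
    "human":   {"the": 0.060, "of": 0.035, "via": 0.001, "beyond": 0.001},
    "model_a": {"the": 0.045, "of": 0.025, "via": 0.006, "beyond": 0.004},
    "model_b": {"the": 0.046, "of": 0.026, "via": 0.005, "beyond": 0.005},
}

def cosine(u, v):
    """Cosine similarity between two word-frequency dicts (1.0 = identical direction)."""
    keys = u.keys() & v.keys()
    dot = sum(u[k] * v[k] for k in keys)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm(u) * norm(v))

# The two AI fingerprints are nearly parallel, so a classifier that easily
# separates them from human text still struggles to tell them apart.
print("model_a vs model_b:", cosine(fingerprints["model_a"], fingerprints["model_b"]))
print("model_a vs human:  ", cosine(fingerprints["model_a"], fingerprints["human"]))
```

The gap between "AI vs AI" similarity and "AI vs human" similarity is exactly the signal the detectors exploit, and it is the first of those gaps that is shrinking.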
4. The "Crystal Ball" Method (Estimating Impact)
Since the "detective" approach (classifiers) is getting confused, the authors used a simpler, more transparent method: Word Counting.
They treated the academic library like a garden.
- The Baseline: They looked at how often certain weeds (words like "the" or "furthermore") grew in the garden before the AI scribes arrived (2015–2021). They drew a straight line predicting how many weeds should be there in 2025 if humans were still writing alone.
- The Deviation: Then, they looked at the actual garden in 2025. They saw that the "weeds" (specific words) were growing way faster or slower than the line predicted.
- The Conclusion: By measuring this "overgrowth" or "undergrowth," they could estimate how much of the garden was being tended to by AI. They found that by 2025, a significant portion of academic writing has been touched by AI, and this number is growing fast.
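The baseline-and-deviation idea above can be sketched as a simple least-squares line fit. All numbers here are invented for illustration (the paper reports its own frequencies); the structure of the calculation is the point:

```python
# Fit a straight line to pre-LLM word frequencies, extrapolate it to 2025,
# and measure the gap between prediction and observation.
# All frequencies below are invented for illustration.
years = [2015, 2016, 2017, 2018, 2019, 2020, 2021]
freq  = [0.0010, 0.0011, 0.0010, 0.0012, 0.0011, 0.0012, 0.0013]  # e.g. "via"

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    slope = num / den
    return slope, my - slope * mx

slope, intercept = fit_line(years, freq)
expected_2025 = slope * 2025 + intercept   # what the human-only trend predicts
observed_2025 = 0.0030                     # invented "actual" 2025 frequency

# The excess over the baseline is what gets attributed to LLM influence.
print(f"expected {expected_2025:.4f}, observed {observed_2025:.4f}, "
      f"excess {observed_2025 - expected_2025:+.4f}")
```

The appeal of this method is its transparency: anyone can re-fit the line and re-measure the gap, with no black-box classifier in the loop.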
The Big Takeaway
The paper concludes that AI is reshaping the landscape of academic writing, not just by writing the papers, but by subtly changing the vocabulary and style of the entire field.
- The Warning: If we rely only on complex "black box" detectors, we might miss the nuance because the AI models are becoming too similar to each other.
- The Insight: Simple tools—like watching which words are becoming trendy or disappearing—are actually very powerful for understanding how technology is changing human communication.
In short: The authors are telling us that the "sound" of academic research is changing. It's becoming slightly more efficient, slightly more "AI-accented," and the models are all starting to sound like the same person. We need to keep our eyes on the little details (like the word "via") to understand the big picture.