Manipulating language models' training data to study syntactic constraint learning: the case of English passivization

This study demonstrates that neural network language models can learn English passivization exceptions by leveraging both frequency-based entrenchment and semantic affectedness from their training data, validating the utility of manipulating training corpora to investigate the sources of linguistic evidence in acquisition.

Cara Su-Yi Leong, Tal Linzen

Published 2026-03-05

Imagine you are teaching a robot how to speak English. You give it a massive library of books, articles, and websites to read. The robot is smart; it learns that most sentences follow a pattern. For example, if you can say "The dog chased the cat," the robot learns you can usually flip it around to say "The cat was chased by the dog." This is called the passive voice.

But English is tricky. Sometimes, you can flip the sentence, and sometimes you can't.

  • Good: "The writer defenestrated the editor" → "The editor was defenestrated by the writer." (Weird word, but it works!)
  • Bad: "The meeting lasted one hour" → *"One hour was lasted by the meeting." (This sounds wrong to us.)

The big mystery is: How does a learner (a child or a robot) know which verbs are "flip-able" and which ones aren't? They never get a teacher to say, "Hey, you can't use 'last' in the passive voice." They have to figure it out just by listening.

This paper is like a detective story where the researchers use a neural network (a type of AI) as a test subject to solve this mystery. Here's the breakdown of their investigation using simple analogies:

The Two Suspects: "Frequency" vs. "Meaning"

The researchers had two main theories about how the robot (and humans) figure out these rules:

  1. The "Frequency" Suspect (Entrenchment):

    • The Theory: If you hear a word used in the active voice (e.g., "The meeting lasted...") a million times, but you never hear it in the passive voice ("One hour was lasted..."), your brain eventually says, "Okay, I guess that's just not how we do things." It's like a path in a park. If everyone walks the main path, but no one ever walks the side path, eventually the side path disappears into the grass. You assume it's not a real path.
    • The Metaphor: Imagine a restaurant with no written menu. Day after day, you hear everyone around you order "Chicken" and get it, but you never once hear anyone order "Steak." Eventually you conclude the kitchen simply doesn't make steak, even though nobody ever told you so.
  2. The "Meaning" Suspect (Affectedness):

    • The Theory: The passive voice usually happens when something gets changed or gets hurt by an action. If you "break" a vase, the vase is affected. If you "last" an hour, the hour doesn't really get changed or hurt; it just exists. The theory says the robot learns that if a verb doesn't involve "affecting" something, it can't be passive.
    • The Metaphor: Think of a paintball game. If you get hit (affected), you are the target. If you just stand there and watch the game (not affected), you aren't the target. The passive voice is like saying, "I was hit." You can't say, "I was watched" in the same way if the "watching" didn't change you.

The Experiment: Cooking with Different Ingredients

To test which suspect was guilty, the researchers didn't just watch the robot; they rewrote the robot's library (its training data) to see how it reacted.

Experiment 1: Does the robot sound like a human?
First, they checked if the robot actually understood English grammar. They asked it to judge sentences.

  • Result: Yes! The robot's judgments were almost identical to human judgments. It knew that "The cat was chased" is good but "One hour was lasted" is bad, even though nothing in its library ever said so directly. This confirmed the robot was a fair stand-in for a human learner.
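How do you "ask" a language model whether a sentence sounds good? A standard trick is to compare the probabilities it assigns to a minimal pair of sentences. The paper uses a neural network; as a toy stand-in, here is a tiny add-one-smoothed bigram language model (all corpus sentences and function names are illustrative, not the paper's actual setup) showing how an attested pattern ends up scoring higher than an unattested one:

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def logprob(sentence, uni, bi, vocab_size):
    """Add-one-smoothed bigram log-probability of a sentence."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bi[(a, b)] + 1) / (uni[a] + vocab_size))
        for a, b in zip(toks, toks[1:])
    )

# A toy "library": 'chased' appears in both voices, 'lasted' only in the active.
corpus = [
    "the dog chased the cat",
    "the cat was chased by the dog",
    "the meeting lasted one hour",
    "the party lasted one hour",
]
uni, bi = train_bigram(corpus)
V = len(uni)

good = logprob("the cat was chased by the dog", uni, bi, V)
bad = logprob("one hour was lasted by the meeting", uni, bi, V)
```

Because bigrams like "was lasted" never occur in the training data, the unattested passive receives a much lower score; the paper's judgments work on the same principle, only with a neural model instead of counts.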

Experiment 2A: The "Frequency" Test
They took a verb that humans can use in the passive voice (like "drop") and edited the library so that the robot saw "drop" in the active voice 100 times for every 1 time it saw it in the passive voice (mimicking the pattern of "last").

  • Result: The robot started thinking "drop" was weird in the passive voice, just like "last."
  • Conclusion: Frequency matters. Hearing a verb constantly in the active voice while never hearing its passive is itself evidence that the passive is off-limits.
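"Editing the library" here means filtering the training corpus so the target verb's passive uses are drastically downsampled relative to its active uses. A minimal sketch of that manipulation (the function name, the 100:1 ratio from the text above, and the string cues for detecting voice are all illustrative; the paper's pipeline would use proper syntactic annotation):

```python
import random

def enforce_ratio(sentences, passive_cue, active_cue, ratio=100, seed=0):
    """Downsample sentences matching `passive_cue` so that sentences
    matching `active_cue` outnumber them at least `ratio`:1.
    Cues are naive substrings; a real pipeline would parse for voice."""
    actives = [s for s in sentences if active_cue in s]
    passives = [s for s in sentences if passive_cue in s]
    others = [s for s in sentences if active_cue not in s and passive_cue not in s]
    n_keep = min(len(actives) // ratio, len(passives))
    rng = random.Random(seed)
    kept_passives = rng.sample(passives, n_keep)
    return others + actives + kept_passives

# Make "drop" look like "last": 200 actives, passives cut down to 2.
corpus = (["the boy dropped the ball"] * 200
          + ["the ball was dropped by the boy"] * 50)
filtered = enforce_ratio(corpus, "was dropped", "dropped the")
```

Training on `filtered` instead of `corpus` is what pushes the model toward treating "drop" as a passive-resistant verb.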

Experiment 2B: The "Meaning" Test
They took a verb that humans can't use in the passive voice (like "last") and edited the library so that "last" appeared in sentences where it was "affecting" things (like "The storm lasted the house down"—a made-up sentence where the house gets damaged).

  • Result: The robot started thinking "last" was more acceptable in the passive voice when it was used in these "damaging" contexts.
  • Conclusion: Meaning matters too. If the context suggests the object is being affected, the robot is more willing to use the passive voice.

Experiment 3: The "New Word" Test
To be super sure, they invented a brand-new verb (let's call it "Zorp") that didn't exist in the library at all, then varied two things about the sentences it appeared in:

  1. Frequency: How often the robot saw "Zorp" in active sentences (up to 1,000 times), with zero passive uses.
  2. Meaning: Whether "Zorp" appeared in sentences where it "hurt" things (High Affectedness) or just "existed" near things (Low Affectedness).
  • Result:
    • The more often the robot saw "Zorp" in the active voice, the less acceptable it found "Zorp" in the passive (Frequency wins).
    • The more often "Zorp" appeared "hurting" things, the more acceptable the robot found its passive (Meaning wins).
    • Crucially: these two factors worked independently. They didn't cancel each other out; their effects simply added up.
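Building training data for a made-up verb comes down to generating active-voice sentences from templates that either frame the object as changed (high affectedness) or as merely present (low affectedness). A hypothetical sketch ("zorp" is this post's own placeholder; the subjects, objects, and sentence frames are invented for illustration):

```python
import random

SUBJECTS = ["the storm", "the machine", "the crowd"]
OBJECTS = ["the fence", "the window", "the bridge"]

def make_novel_verb_data(verb, n_active, affected, seed=0):
    """Generate `n_active` active-voice sentences for a novel verb.
    `affected=True` frames the object as damaged by the event (high
    affectedness); False frames it as merely nearby (low affectedness)."""
    rng = random.Random(seed)
    frame = ("{s} {v}ed {o} to pieces" if affected
             else "{s} {v}ed near {o}")
    return [frame.format(s=rng.choice(SUBJECTS), v=verb, o=rng.choice(OBJECTS))
            for _ in range(n_active)]

# One high-frequency, high-affectedness condition; one low/low condition.
high = make_novel_verb_data("zorp", 1000, affected=True)
low = make_novel_verb_data("zorp", 10, affected=False)
```

Crossing the frequency counts with the two frame types yields the experimental conditions; after training on each mix, the model's willingness to accept "was zorped" can be measured as in Experiment 1.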

The Big Takeaway

The researchers found that both suspects are guilty.

  • Frequency (how often you see a pattern) helps you learn the rules.
  • Meaning (whether the action changes the object) helps you understand why the rule exists.

It's not just one or the other. It's like learning to drive: you learn the rules because you see other cars following them (frequency), but you also understand the logic behind the rules (safety/meaning).

Why Does This Matter?

This study is a big deal because it shows that neural networks (AI) can learn complex language rules from the same kinds of "clues" in the environment that humans have access to. It also gives scientists a new way to study human learning: instead of trying to control what a human child hears (which is impossible), we can control what a robot hears, see how it learns, and use that to understand how our own brains might be working.

In short: We learn language by counting patterns and understanding meanings, and AI is finally smart enough to show us how that works.