One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations

This paper demonstrates that Sparse Autoencoder (SAE) features in Gemma models capture abstract semantics rather than surface orthography: identical Serbian sentences written in two completely different scripts (Latin and Cyrillic), which tokenize into entirely different sequences, activate highly overlapping features, and this script invariance grows stronger with model scale.

Sripad Karne

Published Wed, 11 Ma

Imagine you have a super-smart robot that reads books. You want to know: Does this robot understand the story, or is it just memorizing the letters?

To find out, the researchers in this paper set up a clever experiment using the Serbian language. Here is the breakdown of what they did and what they found, explained simply.

1. The Perfect Test: Two Ways to Write the Same Thing

Serbian is unusual: it is written in two different scripts (alphabets) interchangeably: Latin (like English: A, B, C) and Cyrillic (like Russian: А, Б, В).

  • The Magic: You can translate a sentence from Latin to Cyrillic perfectly. The meaning stays 100% the same, but the letters look completely different.
  • The Catch: To a computer (specifically a Large Language Model), these two scripts look like two totally different languages. The computer breaks them down into different "chunks" (tokens) and has no idea that "Hello" in Latin is the same as "Hello" in Cyrillic.

The Analogy: Imagine you have a song.

  • Script A is the sheet music written in standard notes.
  • Script B is the same song written in a secret code of emojis.
  • To a human, it's the same song. To a robot that only reads notes, the emoji version looks like gibberish.
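You can see the "two different languages" problem directly in code. Here is a tiny sketch (not from the paper) comparing the raw UTF-8 bytes of the same Serbian word, "hello", in both scripts:

```python
# The same Serbian word ("hello") in the two scripts.
latin = "zdravo"
cyrillic = "здраво"

# To a computer these are unrelated byte sequences: Latin letters
# take 1 byte each in UTF-8, while Cyrillic letters take 2.
print(len(latin.encode("utf-8")))     # 6 bytes
print(len(cyrillic.encode("utf-8")))  # 12 bytes
print(latin.encode("utf-8") == cyrillic.encode("utf-8"))  # False
```

A tokenizer built on top of those bytes therefore produces completely different token sequences for the two versions, even though a Serbian reader sees the same word.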

2. The Robot's "Brain" (SAEs)

The researchers didn't just ask the robot what it thought; they looked inside its brain using a tool called a Sparse Autoencoder (SAE).

Think of the robot's brain as a massive room with 65,000 light switches. When the robot reads a sentence, certain switches flip on.

  • If the robot is thinking about "cats," a specific set of switches lights up.
  • If it's thinking about "running," a different set lights up.

The question was: If we show the robot the same sentence in Latin and then in Cyrillic, will the same light switches turn on?
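One simple way to make that question numerical (a hypothetical sketch; the paper's exact metric may differ) is to treat each "switch" as one SAE feature and measure the overlap between the sets of features that fire for the two inputs, for example with Jaccard similarity:

```python
def active_features(activations, threshold=0.0):
    """Indices of features whose activation exceeds the threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}

def overlap(acts_a, acts_b):
    """Jaccard similarity between two sets of active features."""
    a, b = active_features(acts_a), active_features(acts_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy activations for the "same sentence, two scripts" case:
latin_acts    = [0.0, 1.2, 0.0, 0.8, 0.0, 0.3]  # switches 1, 3, 5 on
cyrillic_acts = [0.0, 0.9, 0.0, 1.1, 0.2, 0.0]  # switches 1, 3, 4 on
print(overlap(latin_acts, cyrillic_acts))  # shared {1, 3} of {1, 3, 4, 5} → 0.5
```

An overlap of 1.0 would mean the exact same switches fire for both scripts; 0.0 would mean the model treats them as unrelated inputs.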

3. The Experiment

They fed the robot sentences in three ways:

  1. The Same Sentence: "The cat sat on the mat" in Latin vs. "The cat sat on the mat" in Cyrillic.
  2. The Same Meaning, Different Words: "The cat sat on the mat" vs. "The feline rested on the rug" (both in Latin).
  3. Random Nonsense: Totally different sentences.

4. The Big Discovery

The results were surprising and exciting:

  • The Robot Cares More About Meaning Than Spelling: When the robot read the same sentence in Latin and Cyrillic, the same light switches flipped on 58% of the time. This is huge! Even though the computer saw two totally different sets of symbols, it recognized the underlying idea.
  • It's Better Than Paraphrasing: Interestingly, the robot recognized the same sentence in two scripts better than it recognized two different sentences that meant the same thing (paraphrases).
    • Analogy: The robot is more confused by you changing your vocabulary ("feline" vs. "cat") than by you changing the alphabet (Latin vs. Cyrillic). It cares more about what you said than how you spelled it.
  • Bigger Brains = Better Understanding: As they tested bigger and smarter versions of the robot (from small to massive), this ability got even stronger. The biggest robots were almost perfect at ignoring the script and focusing on the meaning.

5. Why This Matters

This proves that these AI models aren't just pattern-matching machines that memorize specific words. They are building abstract concepts.

  • The "Ghost" in the Machine: The researchers found that the AI has built a "ghost" version of meaning that floats above the actual letters. Whether you write "pas" or "пас" (Serbian for "dog" in the two scripts), the AI's internal concept of "dog" is the same.
  • No Cheating: They checked to make sure the robot wasn't just memorizing the training data. Since the specific mix of "Latin Original" and "Cyrillic Paraphrase" likely never appeared together in the robot's training books, the fact that it still recognized the connection proves it's actually understanding, not just remembering.

The Takeaway

This paper shows that modern AI is learning to see the forest, not just the trees. Even when the "trees" (the letters) look completely different, the AI can still see the "forest" (the meaning). This is a massive step forward in understanding how machines learn to think like humans, regardless of the language or alphabet they are using.