A Dataset for Probing Translationese Preferences in English-to-Swedish Translation

This paper introduces the first freely available English-to-Swedish dataset designed to benchmark language models' tendency to prefer literal "translationese" over idiomatic phrasing, revealing that exposure to source text biases models toward unnatural translations even when context is removed.

Jenny Kunz, Anja Jarochenko, Marcel Bollmann

Published Tue, 10 Ma

Imagine you are learning a new language, say Swedish, by watching thousands of movies and reading subtitles. You start to speak, but you sound a bit like a robot who just learned the dictionary but never hung out with locals. You use words that are technically correct but sound stiff, awkward, or like a direct copy-paste from English. In the world of linguistics, this "robot accent" is called Translationese.

This paper is like a detective story where researchers built a special tool to catch AI models doing exactly this. Here is the breakdown in simple terms:

1. The Problem: The "Robot Accent"

When computers translate English to Swedish, they often produce text that is grammatically correct but feels unnatural. It's like ordering a coffee and saying, "I would like a liquid bean beverage," instead of "I'd like a coffee." It works, but it sounds weird.

The researchers found that even the smartest AI models (Large Language Models) are guilty of this. They tend to stick too closely to the English source text, resulting in Swedish that feels stiff and "translated" rather than natural and "native."

2. The Solution: A "Spot the Difference" Game

To fix this, the team created a new dataset (a collection of data) called a Minimal Pair Probe. Think of this as a "Spot the Difference" game for sentences.

For every sentence, they created two versions:

  • The "Robot" Version: A literal, stiff translation that sounds like Translationese.
  • The "Human" Version: A natural, idiomatic Swedish sentence that a native speaker would actually say.

They also added "error tags," which are like red flags the researchers put on the sentences to explain why the robot version was weird. Was it missing a word? Did it use the wrong slang? Did it translate an idiom too literally?
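To make the "spot the difference" idea concrete, one entry in such a minimal-pair dataset might look roughly like the sketch below. The field names and the Swedish example are illustrative guesses, not the paper's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class MinimalPair:
    """One 'spot the difference' item: two Swedish candidates for one English
    source sentence. Field names are illustrative, not the real dataset schema."""
    source_en: str          # original English sentence
    translationese_sv: str  # literal, "robot accent" version
    idiomatic_sv: str       # natural version a native speaker would say
    error_tags: list = field(default_factory=list)  # why the literal one sounds off

# Illustrative example: an English idiom translated word for word vs. naturally.
pair = MinimalPair(
    source_en="It's raining cats and dogs.",
    translationese_sv="Det regnar katter och hundar.",  # idiom copied literally
    idiomatic_sv="Det ösregnar.",                       # what a Swede would say
    error_tags=["literal_idiom"],
)
print(pair.error_tags)  # → ['literal_idiom']
```

The error tags are what turn the dataset from a simple A/B test into a diagnostic tool: they let researchers break results down by the *kind* of mistake the robot version makes.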

3. The Experiment: Testing the AI's Taste Buds

The researchers then fed these sentence pairs to various AI models and asked a simple question: "Which one sounds better?"

They tested the AI in two different scenarios:

  • Scenario A (The Blind Taste Test): They showed the AI just the Swedish sentences, without revealing what the original English was.
  • Scenario B (The Translation Task): They showed the AI the English sentence and said, "Translate this into Swedish," giving it the source as context.

4. The Results: The AI's Bad Habits

The findings were quite revealing:

  • The AI Loves the Robot Accent: Even when the AI wasn't forced to translate, it often preferred the stiff, "robot" version of the Swedish sentence. It seems the AI has a built-in bias toward literal, word-for-word phrasing.
  • The "Source Language" Trap: When the AI was given the English source sentence (Scenario B), it became even more likely to choose the stiff translation. It's as if the English sentence acts like a magnet, pulling the AI toward a literal translation and away from natural Swedish.
  • Context Helps, But Not Enough: Giving the AI more background story (like the sentences before the target sentence) helped it choose the natural version more often. It's like giving a translator a whole chapter of a book instead of just one sentence; they get the vibe better. However, even with a lot of context, the AI still struggled to fully ditch the "robot accent."
  • Bigger Isn't Always Better: Interestingly, making the AI models bigger and smarter didn't always fix the problem. Sometimes, the bigger models actually got worse at spotting the natural Swedish when they were trying to translate from English.

5. Why This Matters

Think of this dataset as a gym for AI. Just as a weightlifter needs specific weights to build muscle, AI models need specific, high-quality data to learn how to speak naturally.

Currently, many AI models are trained on internet data that is full of these "robot translations." This paper provides a free, open tool (a dataset) that researchers can use to measure how robotic their models sound, a first step toward models that stop sounding like robots and start sounding like real Swedes.

In a nutshell:
The authors built a "taste test" to prove that AI models sound like stiff robots when translating. They found that showing the AI the original English text makes it sound even more robotic. Their new dataset is a benchmark designed to help researchers build future AI models that speak with a natural, human voice, rather than a literal, translated one.