The Influence of Iconicity in Transfer Learning for Sign Language Recognition

This study shows that exploiting the iconicity of signs in transfer learning, from Chinese Sign Language to Arabic Sign Language and from Greek Sign Language to Flemish Sign Language, significantly improves sign language recognition performance, with a 7.02% gain for Arabic. The models use MediaPipe-extracted spatial and temporal features processed through MLP and GRU architectures.

Keren Artiaga, Conor Lynch, Haithem Afli, Mohammed Hasanuzzaman

Published 2026-03-05

Imagine you are trying to learn a new language, but you only have a tiny dictionary and a few hours to study. That is the daily reality for researchers trying to teach computers to understand sign language. Unlike spoken languages, which have massive libraries of text and audio, sign language datasets are often small, so models tend to memorize the few examples they see instead of truly learning (a problem called "overfitting").

To solve this, researchers usually use a trick called Transfer Learning. Think of this like a student who has already mastered French trying to learn Spanish. Because the languages share similar roots and words, the student doesn't start from zero; they just need to learn the differences.
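In code, transfer learning often boils down to reusing the learned "feature extractor" weights from the source language and re-training only the final classification layer on the target language. Here is a minimal toy sketch of that idea in plain Python; the layer names, sizes, and dictionary-of-weights representation are illustrative assumptions, not the paper's actual implementation:

```python
import random

# Toy "network": a feature extractor plus a classifier head, each modeled
# here as a flat list of weights. All names and sizes are made up.
def init_weights(n):
    return [random.uniform(-0.1, 0.1) for _ in range(n)]

# 1. Pretend this model was already trained on the source language
#    (e.g. Chinese signs), which has plenty of data.
source_model = {
    "feature_extractor": init_weights(128),  # general motion/shape features
    "classifier_head": init_weights(50),     # source-language sign classes
}

# 2. Transfer: copy the shared feature extractor, but give the model a
#    fresh head sized for the target language's classes (e.g. Arabic signs).
#    Only this new head (and optionally the extractor) is then fine-tuned
#    on the small target dataset.
target_model = {
    "feature_extractor": list(source_model["feature_extractor"]),  # reused
    "classifier_head": init_weights(30),                           # fresh
}
```

The point of the sketch: the expensive general knowledge (the extractor) is carried over, so the target model starts far from zero, just like the French-speaking student learning Spanish.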

This paper asks a fascinating question: Does it matter how the signs look?

The Big Idea: "Iconicity"

In sign language, some signs are iconic. This means the hand movement looks exactly like the thing it represents.

  • Example: The sign for "Think" usually involves tapping your forehead. The sign for "Drink" mimics holding a cup. These are "pantomimes."
  • The Theory: If the sign for "Love" looks the same in Chinese Sign Language and Arabic Sign Language (because everyone draws a heart shape with their hands), maybe a computer that learned "Love" in Chinese can instantly understand "Love" in Arabic, even if the two languages are otherwise totally different.

The Experiment: A Cross-Language Swap

The researchers set up a "language swap" test using two pairs of sign languages:

  1. Chinese to Arabic: They took a computer trained on Chinese signs and tried to teach it Arabic.
    • The Match: They focused on signs that were iconic (like "Head," "Hair," "Love").
    • The Result: The computer got about 7% better at recognizing Arabic signs (a 7.02% gain). It was like giving the student a cheat sheet that matched perfectly.
  2. Greek to Flemish: They did the same with Greek and Flemish signs.
    • The Match: Again, they used iconic signs (like "Anatomy" and "Food").
    • The Result: The computer got 1% better. It was a small win, but still a win.

The "Magic" Ingredients

To make this work, the researchers didn't feed the computer raw video (which is messy and heavy). Instead, they used Google's MediaPipe.

  • The Analogy: Imagine watching a dance. Instead of recording the whole stage with lights and costumes, you just track the skeleton of the dancer (the joints and bones).
  • The computer only looked at the "skeleton" coordinates of the hands and face. This made the learning much faster and less confused by things like the signer's shirt color or body size.
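To make the "skeleton" idea concrete, here is a minimal sketch of turning per-frame joint coordinates into a feature vector. The joint names and numbers are made up for illustration (real MediaPipe output has many more landmarks per hand, face, and pose); the only point is the normalization step, which removes where the signer stands in the frame:

```python
# Hypothetical per-frame landmarks: (x, y) coordinates for a few joints,
# mimicking the kind of normalized output a pose tracker produces.
frame = {
    "wrist": (0.52, 0.60),
    "index_tip": (0.55, 0.48),
    "thumb_tip": (0.49, 0.50),
}

def to_feature_vector(landmarks, anchor="wrist"):
    """Flatten landmarks into one vector, translated so the anchor joint
    sits at the origin. This strips out the signer's position in the frame,
    leaving only the hand's shape."""
    ax, ay = landmarks[anchor]
    vec = []
    for name in sorted(landmarks):   # fixed joint order across frames
        x, y = landmarks[name]
        vec.extend([x - ax, y - ay])
    return vec

features = to_feature_vector(frame)
```

A sequence of such vectors, one per video frame, is the kind of input a temporal model like a GRU can then consume to recognize the whole sign.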

The Twist: What if the signs don't match?

The researchers also tested what happens if you try to transfer knowledge between languages that don't share many iconic signs.

  • They tried to teach a computer using Iranian signs to recognize French-Belgian signs.
  • These two languages only shared two similar concepts (Anatomy and Sound).
  • The Disaster: The computer actually got worse at its job. This is called "Negative Transfer."
  • The Lesson: It's like trying to teach someone to drive a car by first teaching them how to ride a horse. If the skills are too different, you confuse the learner. You need enough shared "iconic" ground to make the transfer work.

Why Does This Matter?

  1. For Low-Resource Languages: Many sign languages don't have enough data to train computers. This study proves that if you can find "iconic" signs that look similar across languages, you can "borrow" knowledge from a language with lots of data to help the one with little data.
  2. Speed: Even when the accuracy didn't skyrocket, the computer learned faster. It reached the same level of skill in fewer passes through the training data (epochs).
  3. Better than Standard Methods: In one case, using these "iconic" signs worked better than the standard method of using generic image training (like teaching a computer to recognize cats and dogs first).

The Bottom Line

This paper provides evidence that visual similarity matters. If two sign languages use the same hand gestures to represent the same ideas (like "eating" or "thinking"), a computer can jump from one language to the other much more easily. It's a powerful tool for making sign language technology accessible to more people, especially users of rare or under-represented sign languages.

In short: If you want to teach a computer a new sign language, start with the signs that look like the things they represent. It's the universal shortcut.