AILS-NTUA at SemEval-2026 Task 3: Efficient Dimensional Aspect-Based Sentiment Analysis

The AILS-NTUA system addresses the three subtasks of SemEval-2026 Task 3's Dimensional Aspect-Based Sentiment Analysis by combining fine-tuned encoder backbones for sentiment regression with parameter-efficient LoRA-tuned large language models for structured triplet and quadruplet extraction, achieving competitive performance across multilingual and multi-domain settings.

Stavros Gazetas, Giorgos Filandrianos, Maria Lymperaiou, Paraskevi Tzouveli, Athanasios Voulodimos, Giorgos Stamou

Published 2026-03-06

Imagine you are a restaurant critic, but instead of just saying "The food was good" or "The service was bad," you are asked to describe the exact feeling of every single thing you experienced. Did the pasta make you feel mildly happy or ecstatically joyful? Did the slow waiter make you feel slightly annoyed or furious?

This paper is about a team of researchers (AILS-NTUA) who built a super-smart computer system to do exactly that, but for six different languages (like English, Chinese, Russian, and Ukrainian) and four different worlds (restaurants, laptops, hotels, and finance).

Here is the breakdown of their work, explained simply:

1. The Big Challenge: "Dimensional" Sentiment

Most sentiment systems are like simple switches: they sort every feeling into a handful of buckets, such as Positive, Negative, or Neutral.

  • Old Way: "The laptop battery is good." (Positive)
  • New Way (This Paper): "The laptop battery is good." (Positive, but how positive? Is it a calm, steady satisfaction, or a high-energy excitement?)

The researchers call this Dimensional Sentiment. They use two "dials" to measure feelings:

  • Valence: How positive or negative is it? (Like a scale from 1 to 9).
  • Arousal: How intense or calm is the feeling? (Is it a whisper of joy or a scream of excitement?).
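The two dials above can be sketched as a tiny data structure. This is an illustrative sketch only (the class and method names are made up, not from the paper), assuming the 1–9 scales described above with 5 as the midpoint:

```python
from dataclasses import dataclass

@dataclass
class DimScore:
    """A dimensional sentiment score on the 1-9 scales described above."""
    valence: float  # 1 = very negative, 5 = neutral,  9 = very positive
    arousal: float  # 1 = very calm,     5 = moderate, 9 = very intense

    def describe(self) -> str:
        """Map the two dials to a rough verbal label (illustrative only)."""
        tone = ("positive" if self.valence > 5
                else "negative" if self.valence < 5 else "neutral")
        energy = "high-energy" if self.arousal > 5 else "calm"
        return f"{energy} {tone}"

# "The pasta was delightful!" -> clearly positive, and fairly excited about it
print(DimScore(valence=7.5, arousal=6.0).describe())  # high-energy positive
```

The point of the two dials is that "calm positive" (a quiet satisfaction) and "high-energy positive" (ecstatic joy) are different feelings, even though a plain Positive/Negative switch would label them identically.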

2. The Three Jobs the System Had to Do

The system had to tackle three different puzzles, all at once:

  • Job A (The Scorekeeper): Read a review and give a specific number score for the "Valence" and "Arousal" of a specific item.
    • Analogy: You point at a picture of a burger and say, "That burger is a 7.5 on the happiness scale and a 6.0 on the excitement scale."
  • Job B (The Detective): Read a review and find the Triplet: What was talked about (Aspect), what was said about it (Opinion), and how it felt (The Score).
    • Analogy: Finding the sentence "The pizza was cold" and realizing: Pizza = Aspect, Cold = Opinion, Score = Negative/High Arousal.
  • Job C (The Architect): Find the Quadruplet: do the same as Job B, but also guess the Category.
    • Analogy: Not just knowing "Pizza was cold," but knowing it belongs to the "Food Quality" category.
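The three jobs differ mainly in the shape of the answer the system must produce. A minimal sketch of those shapes (the field names and the "FOOD#QUALITY" category label are illustrative placeholders, not the task's exact schema):

```python
from dataclasses import dataclass

@dataclass
class Triplet:
    """Job B: what the 'Detective' extracts from a review sentence."""
    aspect: str     # what was talked about
    opinion: str    # what was said about it
    valence: float  # how positive or negative (1-9)
    arousal: float  # how calm or intense (1-9)

@dataclass
class Quadruplet(Triplet):
    """Job C: everything from Job B, plus one more field."""
    category: str   # e.g. a label like "FOOD#QUALITY" (placeholder format)

# "The pizza was cold." -> negative, and with some real annoyance behind it
t = Triplet(aspect="pizza", opinion="cold", valence=2.5, arousal=6.0)
q = Quadruplet(aspect="pizza", opinion="cold", valence=2.5, arousal=6.0,
               category="FOOD#QUALITY")
```

Job A is simpler still: the aspect is given, and only the two numbers need to be predicted.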

3. The Secret Sauce: Two Different Tools for Two Different Jobs

The researchers realized that using one giant brain for everything was inefficient. So, they built a hybrid team:

Team 1: The "Specialized Scouts" (For Job A)

For the scoring task, they used small, specialized language models, one per language.

  • How it works: They picked a different "expert" for each language. For English, they used a model trained specifically on English nuances. For Russian, a Russian expert.
  • The Metaphor: Imagine you have a team of local guides. If you are in Paris, you hire a French guide. If you are in Tokyo, you hire a Japanese guide. They are small and fast, but they know their specific city better than anyone. This allowed them to get very accurate scores without needing a massive, slow computer.
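The "local guides" idea boils down to a routing table: pick the specialist for each language, and fall back to a general multilingual model otherwise. The model names below are hypothetical placeholders, not the paper's actual checkpoints:

```python
# Hypothetical routing table: each language gets its own compact expert model.
# These names are placeholders for illustration, not real checkpoints.
EXPERTS = {
    "en": "english-encoder",
    "ru": "russian-encoder",
    "zh": "chinese-encoder",
    "uk": "ukrainian-encoder",
}
FALLBACK = "multilingual-encoder"  # used when no local "guide" exists

def pick_expert(language_code: str) -> str:
    """Return the specialist for a language, or the multilingual fallback."""
    return EXPERTS.get(language_code, FALLBACK)

print(pick_expert("ru"))  # russian-encoder
print(pick_expert("tt"))  # multilingual-encoder (no Tatar specialist here)
```

The design trade-off: many small experts cost more to maintain than one big model, but each one is fast, cheap to run, and fluent in its own "city."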

Team 2: The "Creative Writers" (For Jobs B & C)

For finding the triplets and quadruplets, they used Large Language Models (LLMs)—the kind of AI that writes stories and answers questions.

  • The Trick: Instead of retraining the whole giant brain (which takes forever and costs a fortune), they used a technique called LoRA (Low-Rank Adaptation).
  • The Metaphor: Imagine a famous novelist (the giant AI) who knows everything about the world. Instead of rewriting their entire life story to learn about restaurants, you just give them a small, sticky note (LoRA adapter) that says, "For this specific job, remember these rules about restaurants."
  • They wrote these "sticky notes" for each language. This made the giant brain smart enough to extract the complex data without needing to be retrained from the ground up.
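The "sticky note" has a precise mathematical form: LoRA freezes the big weight matrix W and trains only two thin matrices A and B, adding a scaled low-rank update (alpha/r) * B @ A on top. A minimal NumPy sketch of the idea (dimensions and scaling factor are illustrative, not the paper's settings):

```python
import numpy as np

d, r = 1024, 8                   # full layer width vs. tiny "sticky note" rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pretrained weight (the novelist)
A = rng.standard_normal((r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # LoRA init: B starts at zero, so the
alpha = 16                               # adapter changes nothing at first

def adapted(x):
    """Forward pass with the LoRA update W + (alpha/r) * B @ A applied."""
    return x @ (W + (alpha / r) * B @ A).T

full_params = W.size             # what full fine-tuning would have to train
lora_params = A.size + B.size    # what LoRA actually trains
print(f"{lora_params / full_params:.1%} of the parameters")  # 1.6% of the parameters
```

Training roughly 1.6% of the weights per adapter is what makes it cheap to write a separate "sticky note" for every language.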

4. The Results: Small is Beautiful

The team tested their system against competitors that relied on much larger models and far more computing power.

  • The Surprise: Their system, which used smaller, more efficient models, performed just as well (and sometimes better) than the giants.
  • Why it matters: It's like winning a race with a compact sports car instead of a massive, fuel-guzzling truck. You get the same speed, but you use less gas (computing power) and it's easier to park (deploy).

5. The Hiccups (Limitations)

Even superheroes have weaknesses:

  • The "Lost in Translation" Problem: When they tried to translate reviews from Russian to English to help the AI understand them better, the AI sometimes got confused by idioms or lost the original "flavor" of the text.
  • The "Silent" Problem: Sometimes reviews say things like "The service was terrible" without actually naming the "service." The AI sometimes struggled to find these hidden clues, especially in languages with fewer training examples (like Tatar).

Summary

The AILS-NTUA team built a smart, efficient, multi-lingual system that doesn't just tell you if a review is "good" or "bad." It tells you how good or bad it is, and how intense that feeling is. They did this by using a mix of specialized local guides for scoring and giant brains with sticky notes for finding details, proving that you don't always need the biggest computer to get the best results.