Signal in the Noise: Decoding the Reality of Airline Service Quality with Large Language Models

This study validates a Large Language Model framework that analyzes over 16,000 unstructured TripAdvisor reviews to uncover critical service quality drivers. It reveals a stark post-2022 decline in satisfaction for EgyptAir that traditional metrics failed to detect, showing that the framework is better suited than those metrics to transforming passenger feedback into actionable strategic intelligence.

Ahmed Dawoud, Osama El-Shamy, Ahmed Habashy · 2026-03-06 · cs

Probing Memes in LLMs: A Paradigm for the Entangled Evaluation World

This paper introduces the "Probing Memes" paradigm, which conceptualizes large language models as collections of cultural genes (memes). In place of traditional separate evaluations, it proposes an entangled framework that uses a Perception Matrix to analyze model-item interactions, revealing hidden capability structures and enabling population-based behavioral analysis across thousands of models and datasets.

Luzhou Peng, Zhengxin Yang, Honglu Ji + 6 more · 2026-03-06 · cs

Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Meaning in LLM Text Summaries

This paper introduces the Inductive Conceptual Rating (ICR), a semiotic-hermeneutic qualitative metric. It reveals that large language models often achieve high lexical similarity but fail to capture the contextually grounded, emergent meaning of human-generated text summaries. The authors advocate interpretive evaluation frameworks over traditional statistical metrics.

Natalie Perez, Sreyoshi Bhaduri, Aman Chadha · 2026-03-06 · cs