CEI: A Benchmark for Evaluating Pragmatic Reasoning in Language Models
This paper introduces the Contextual Emotional Inference (CEI) Benchmark, a dataset of 300 human-validated scenarios designed to evaluate large language models' ability to infer intended meaning beyond literal semantics. The scenarios require navigating ambiguous utterances across diverse power dynamics and pragmatic subtypes.
Jon Chun, Hannah Sussman, Adrian Mangine, Murathan Kocaman, Kirill Sidorko, Abhigya Koirala, Andre McCloud, Gwen Eisenbeis, Wisdom Akanwe, Moustapha Gassama, Eliezer Gonzalez Chirinos, Anne-Duncan Enright, Peter Dunson, Tiffanie Ng, Anna von Rosenstiel, Godwin Idowu
Thu, 12 Ma
cs.CL