The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

This systematic literature review critiques the "ground truth" paradigm in machine learning as a positivistic fallacy that misinterprets human disagreement as noise, arguing instead for pluralistic annotation infrastructures that treat diverse subjective perspectives as high-fidelity signals essential for building culturally competent models.
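
As an illustration of the paradigm being critiqued, the sketch below contrasts majority-vote aggregation with a pluralistic soft-label aggregation that retains disagreement as signal; the task, labels, and annotator responses are invented for illustration and are not from the paper.

```python
from collections import Counter

# Toy annotations for one item, e.g. "is this comment toxic?",
# gathered from annotators with different backgrounds (invented data).
annotations = ["toxic", "not_toxic", "toxic", "not_toxic", "not_toxic"]

# Majority-vote aggregation: disagreement is discarded as "noise".
majority_label, _ = Counter(annotations).most_common(1)[0]

# Pluralistic aggregation: keep the full label distribution as the signal.
counts = Counter(annotations)
soft_label = {label: n / len(annotations) for label, n in counts.items()}

print("majority vote :", majority_label)   # not_toxic
print("soft label    :", soft_label)       # {'toxic': 0.4, 'not_toxic': 0.6}
```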

Sheza Munir, Benjamin Mah, Krisha Kalsi, Shivani Kapania, Julian Posada, Edith Law, Ding Wang, Syed Ishtiaque Ahmed · 2026-03-09 · 🤖 cs.AI

IntelliAsk: Learning to Ask High-Quality Research Questions via RLVR

This paper introduces IntelliAsk, a question-generation model trained via RLVR with a novel reward model (IntelliReward) and DAPO optimization; in expert evaluations, its evidence-based research questions outperform those posed by human reviewers and by strong baselines, and the training also strengthens broader reasoning and writing capabilities.
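
For readers unfamiliar with RLVR-style optimization, here is a minimal sketch of the group-relative advantage computation used in DAPO/GRPO-style training; the `reward` heuristic is an invented stand-in for the learned IntelliReward model, not the paper's implementation.

```python
import statistics

def reward(question: str) -> float:
    """Placeholder standing in for a learned reward model such as IntelliReward.
    Here: a toy heuristic that prefers longer, evidence-referencing questions."""
    return len(question.split()) / 20.0 + (1.0 if "evidence" in question.lower() else 0.0)

# A group of candidate questions sampled from the policy for one paper (invented).
group = [
    "What evidence supports the claimed gain over the strongest baseline?",
    "Is the method good?",
    "How does the ablation isolate the contribution of the reward model?",
]

rewards = [reward(q) for q in group]
mu, sigma = statistics.mean(rewards), statistics.pstdev(rewards) or 1.0

# Group-relative advantages, as used in GRPO/DAPO-style RLVR objectives:
# each sample is scored against its own group rather than by a learned critic.
advantages = [(r - mu) / sigma for r in rewards]
for q, a in zip(group, advantages):
    print(f"{a:+.2f}  {q}")
```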

Karun Sharma, Vidushee Vats, Shengzhi Li, Yuxiang Wang, Zhongtian Sun, Prayag Tiwari · 2026-03-09 · 🤖 cs.AI

Diverse Word Choices, Same Reference: Annotating Lexically-Rich Cross-Document Coreference

This paper proposes a revised annotation scheme for cross-document coreference resolution that treats coreference chains as discourse elements in order to better capture lexical diversity and framing variation in news media, and demonstrates through reannotation of the NewsWCL50 and ECB+ datasets that the approach enables more balanced, discourse-aware analysis.
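
A minimal sketch of how a cross-document coreference chain with lexically diverse mentions might be represented as a discourse element; the field names and example mentions are invented and are not the paper's actual scheme.

```python
from dataclasses import dataclass, field

@dataclass
class Mention:
    doc_id: str          # which news article the mention appears in
    surface: str         # surface form, which may vary widely across outlets
    sentence_idx: int

@dataclass
class CoreferenceChain:
    """A chain treated as a discourse element rather than a strict identity class."""
    label: str
    mentions: list = field(default_factory=list)

# Invented example: the same referent framed differently across outlets.
chain = CoreferenceChain(label="border policy actors")
chain.mentions += [
    Mention("outlet_A_001", "the migrant caravan", 3),
    Mention("outlet_B_014", "thousands of asylum seekers", 1),
    Mention("outlet_C_007", "the approaching group", 6),
]

print({m.surface for m in chain.mentions})
```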

Anastasia Zhukova, Felix Hamborg, Karsten Donnay, Norman Meuschke, Bela Gipp · 2026-03-09 · 💬 cs.CL

CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning

The paper proposes CoME, a novel mobile agent architecture that employs four specialized experts with a progressive training strategy and an InfoGain-Driven DPO method to achieve balanced, decoupled enhancement of hybrid reasoning capabilities, outperforming existing dense and MoE approaches on the AITZ and AMEX datasets.
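
As a hedged illustration, the snippet below shows the standard DPO loss for one preference pair with a scalar weight standing in for the InfoGain-driven term; the paper's actual weighting scheme and expert routing are not reproduced here.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected,
             beta=0.1, info_gain_weight=1.0):
    """Standard DPO loss for one preference pair, optionally rescaled by a weight.

    info_gain_weight is a stand-in for CoME's InfoGain-driven weighting, whose
    exact form is defined in the paper; here it simply rescales the pair.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)) written in a numerically stable form
    return info_gain_weight * math.log1p(math.exp(-beta * margin))

# Invented log-probabilities for a chosen vs. rejected action trace.
print(dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
               ref_logp_chosen=-13.0, ref_logp_rejected=-14.0))
```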

Yuxuan Liu, Weikai Xu, Kun Huang, Changyu Chen, Jiankun Zhao, Pengzhi Gao, Wei Liu, Jian Luan, Shuo Shang, Bo Du, Ji-Rong Wen, Rui Yan · 2026-03-09 · 🤖 cs.AI

Omni-C: Compressing Heterogeneous Modalities into a Single Dense Encoder

The paper introduces Omni-C, a single dense Transformer encoder that compresses heterogeneous modalities (text, audio, and image) into shared representations via unimodal contrastive pretraining, thereby eliminating the parameter overhead and routing complexity of Mixture-of-Experts architectures while achieving comparable performance with significantly reduced memory usage.
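
To ground the contrastive-pretraining idea, here is a minimal CLIP-style symmetric InfoNCE sketch in NumPy, assuming both views are embedded by one shared dense encoder; it illustrates the generic objective, not Omni-C's training recipe.

```python
import numpy as np

def symmetric_contrastive_loss(z_a, z_b, temperature=0.07):
    """CLIP-style symmetric InfoNCE over a batch of paired embeddings.

    z_a, z_b: (batch, dim) embeddings of two views (e.g. image/text), assumed
    here to come from one shared dense encoder rather than per-modality experts."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                 # (batch, batch) similarities
    log_softmax_rows = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_softmax_cols = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    diag = np.arange(len(z_a))
    # Matched pairs sit on the diagonal; average both directions.
    return -0.5 * (log_softmax_rows[diag, diag].mean()
                   + log_softmax_cols[diag, diag].mean())

rng = np.random.default_rng(0)
print(symmetric_contrastive_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))))
```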

Kin Wai Lau, Yasar Abbas Ur Rehman, Lai-Man Po, Pedro Porto Buarque de Gusmão · 2026-03-09 · 🤖 cs.AI

Attention Meets Reachability: Structural Equivalence and Efficiency in Grammar-Constrained LLM Decoding

This paper establishes that while language-equivalent context-free grammars yield identical token masks in grammar-constrained decoding, their structural differences significantly affect computational efficiency by introducing variable state-space blowups and ambiguity costs, which yields fundamental lower bounds on decoding work and motivates new distortion metrics for masked sampling.
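
The core equivalence claim can be illustrated concretely: two structurally different grammars for balanced parentheses generate the same language, so brute-force enumeration up to a length bound yields identical next-token masks for any prefix. The toy enumerator below is an illustration only and says nothing about the efficiency differences the paper analyzes.

```python
def close_under(rules, max_len):
    """Enumerate all strings of length <= max_len generated by a CFG with a
    single nonterminal S, given its productions as string-building rules."""
    strings = {""}                      # S -> epsilon is assumed in both grammars
    changed = True
    while changed:
        changed = False
        for build in rules:
            for new in build(strings):
                if len(new) <= max_len and new not in strings:
                    strings.add(new)
                    changed = True
    return strings

# Grammar A:  S -> ( S ) S | epsilon
rules_a = [lambda ss: {f"({x}){y}" for x in ss for y in ss}]
# Grammar B:  S -> S S | ( S ) | epsilon   (same language, different structure)
rules_b = [lambda ss: {x + y for x in ss for y in ss},
           lambda ss: {f"({x})" for x in ss}]

VOCAB, MAX_LEN = ["(", ")"], 8
lang_a, lang_b = close_under(rules_a, MAX_LEN), close_under(rules_b, MAX_LEN)

def token_mask(prefix, language):
    """Allowed next tokens: those keeping the prefix extendable within the language."""
    return {t: any(s.startswith(prefix + t) for s in language) for t in VOCAB}

prefix = "(()"
print(token_mask(prefix, lang_a))   # identical masks for both grammars...
print(token_mask(prefix, lang_b))   # ...despite their different rule structure
```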

Faruk Alpay, Bilge Senturk · 2026-03-09 · 🤖 cs.LG

EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair

The paper introduces EigenData, a self-evolving multi-agent platform that automates the synthesis, auditing, and repair of high-quality function-calling training data, demonstrating its effectiveness by systematically correcting the Berkeley Function-Calling Leaderboard (BFCL-V3) to achieve model rankings that better correlate with human judgments of functional correctness.
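
As a hedged sketch of what auditing and repairing a function-calling example can mean mechanically, the snippet below checks one call against a tool schema and applies a trivial repair rule; the schema format and repair policy are invented here, whereas EigenData's auditing and repair agents are LLM-based.

```python
def audit_call(call: dict, schema: dict) -> list[str]:
    """Return a list of issues found in a single function-call training example."""
    issues = []
    if call["name"] != schema["name"]:
        issues.append(f"unknown function: {call['name']}")
    for p, spec in schema["parameters"].items():
        if spec.get("required") and p not in call["arguments"]:
            issues.append(f"missing required argument: {p}")
    for p in call["arguments"]:
        if p not in schema["parameters"]:
            issues.append(f"unexpected argument: {p}")
    return issues

def repair_call(call: dict, schema: dict) -> dict:
    """A deliberately simple repair rule: drop arguments the schema does not define."""
    kept = {p: v for p, v in call["arguments"].items() if p in schema["parameters"]}
    return {**call, "arguments": kept}

schema = {"name": "get_weather",
          "parameters": {"city": {"required": True}, "unit": {"required": False}}}
bad_call = {"name": "get_weather", "arguments": {"city": "Jeddah", "zipcode": "00000"}}

print(audit_call(bad_call, schema))    # ['unexpected argument: zipcode']
print(repair_call(bad_call, schema))   # zipcode dropped, city kept
```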

Jiaao Chen, Jingyuan Qi, Mingye Gao, Wei-Chen Wang, Hanrui Wang, Di Jin · 2026-03-09 · 🤖 cs.AI

Safer Reasoning Traces: Measuring and Mitigating Chain-of-Thought Leakage in LLMs

This paper investigates how Chain-of-Thought prompting exacerbates the leakage of personally identifiable information from large language models, demonstrating that leakage varies substantially across model families and reasoning budgets, and evaluates lightweight inference-time gatekeepers to derive hybrid policies that balance reasoning utility against privacy protection.
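
A minimal sketch of a lightweight inference-time gatekeeper in the spirit described above: a regex-based PII check that redacts or suppresses a reasoning trace before release. The patterns and policy names are illustrative assumptions, not the paper's gatekeepers.

```python
import re

# Very small pattern set standing in for a real PII detector (illustrative only).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),
}

def gatekeep_trace(chain_of_thought: str, policy: str = "redact") -> str:
    """Inference-time gatekeeper: redact or suppress a reasoning trace that leaks PII."""
    leaked = any(p.search(chain_of_thought) for p in PII_PATTERNS.values())
    if not leaked:
        return chain_of_thought
    if policy == "suppress":
        return "[reasoning trace withheld: potential PII leakage]"
    redacted = chain_of_thought
    for name, pattern in PII_PATTERNS.items():
        redacted = pattern.sub(f"[{name} redacted]", redacted)
    return redacted

trace = "The user mentioned their address and can be reached at jane.doe@example.com."
print(gatekeep_trace(trace))
print(gatekeep_trace(trace, policy="suppress"))
```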

Patrick Ahrend, Tobias Eder, Xiyang Yang, Zhiyi Pan, Georg Groh · 2026-03-09 · 💬 cs.CL

RACAS: Controlling Diverse Robots With a Single Agentic System

The paper introduces RACAS, a robot-agnostic agentic system that uses natural-language communication between LLM/VLM-based modules to control diverse robotic platforms without code modifications or retraining, and demonstrates its effectiveness on wheeled, multi-jointed, and underwater robots.
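
A toy sketch of the robot-agnostic idea: if the planner and the robot-facing module exchange only natural-language strings, swapping in a different robot module requires no change to the planner. The module classes and messages below are invented placeholders, not RACAS components.

```python
from typing import Protocol

class RobotModule(Protocol):
    def handle(self, message: str) -> str: ...

class WheeledRobotModule:
    """Stand-in for an LLM/VLM-backed module; here a canned string responder."""
    def handle(self, message: str) -> str:
        return "Acknowledged: driving forward two meters, obstacles clear."

class UnderwaterRobotModule:
    def handle(self, message: str) -> str:
        return "Acknowledged: descending to five meters, thrusters nominal."

def planner(goal: str, robot: RobotModule) -> str:
    # The planner only ever exchanges plain natural-language strings,
    # so swapping the robot module requires no change to this code.
    request = f"Please work toward the goal: {goal}. Report your status."
    return robot.handle(request)

print(planner("inspect the pier", WheeledRobotModule()))
print(planner("inspect the pier", UnderwaterRobotModule()))
```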

Dylan R. Ashley, Jan Przepióra, Yimeng Chen, Ali Abualsaud, Nurzhan Yesmagambet, Shinkyu Park, Eric Feron, Jürgen Schmidhuber · 2026-03-09 · 🤖 cs.AI