cs.CL papers | Gist.Science

Developing an AI Assistant for Knowledge Management and Workforce Training in State DOTs

This paper proposes a multi-agent Retrieval-Augmented Generation framework that integrates open-weight large language models and vision-language models to enhance knowledge management and workforce training in state Departments of Transportation by enabling context-aware, evidence-grounded responses from both textual and visual technical documentation.

Divija Amaram, Lu Gao, Gowtham Reddy Gudla + 1 more | 2026-03-05 | 🤖 cs.AI

HumanLM: Simulating Users with State Alignment Beats Response Imitation

The paper proposes HumanLM, a novel training framework that improves user simulation by aligning psychologically grounded latent states with ground-truth responses via reinforcement learning, outperforming existing response-imitation methods on the comprehensive Humanual benchmark and in real-time human evaluations.

Shirley Wu, Evelyn Choi, Arpandeep Khatua + 7 more | 2026-03-05 | 🤖 cs.AI

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

The paper proposes Draft-Conditioned Constrained Decoding (DCCD), a training-free two-step inference method that decouples semantic planning from structural enforcement to significantly improve the accuracy and parameter efficiency of structured generation in large language models by mitigating the distortions caused by hard constraints.

Avinash Reddy, Thayne T. Walker, James S. Ide + 1 more | 2026-03-05 | 🤖 cs.AI

Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

This benchmark study evaluates Token-Oriented Object Notation (TOON) against JSON for LLM data serialization, finding that TOON offers promising token efficiency for complex structures via in-context learning, but that prompt overhead often negates this advantage in short contexts and that TOON currently underperforms constrained decoding for simple structures, which suggests its potential follows a non-linear scaling curve dependent on task complexity.

Ivan Matveev | 2026-03-05 | 🤖 cs.AI

TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding

This paper introduces TopicENA, a scalable framework that integrates BERTopic with Epistemic Network Analysis to automate concept coding, thereby enabling the structural analysis of large text corpora while providing practical guidance on optimizing topic granularity and inclusion thresholds.

Owen H. T. Lu, Tiffany T. Y. Hsu | 2026-03-05 | 🤖 cs.AI

Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

This paper introduces the History-Echoes framework to demonstrate that conversational history biases large language models by creating a "geometric trap" in latent space, where behavioral persistence is revealed through a strong correlation between probabilistic state consistency and geometric representation alignment.

Adi Simhi, Fazl Barez, Martin Tutek + 2 more | 2026-03-05 | 🤖 cs.AI

Combating data scarcity in recommendation services: Integrating cognitive types of VARK and neural network technologies (LLM)

This paper proposes a hybrid framework that integrates Large Language Models for semantic enrichment with VARK-based cognitive profiling to effectively address cold start challenges in recommendation systems by generating personalized, explainable suggestions from minimal user data.

Nikita Zmanovskii | 2026-03-05 | 💬 cs.CL

Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

This paper proposes "entropic-time inference," a novel paradigm that replaces linear token-based decoding with a self-organizing, entropy-driven architecture to dynamically allocate computational resources, optimize attention sparsification, and adapt sampling temperatures for more efficient and intelligent LLM generation.

Andrew Kiruluta | 2026-03-05 | 🤖 cs.LG

The Logovista English-Japanese Machine Translation System

This paper provides a technical and historical record of the Logovista English-Japanese machine translation system, detailing its rule-based architecture, development practices, and long-term commercial evolution from the early 1990s through 2012, while highlighting preserved artifacts for future study.

Barton D. Wright | 2026-03-05 | 💬 cs.CL

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

The paper introduces SemKey, a novel framework that decouples semantic guidance into four objectives and redesigns EEG-LLM interactions to eliminate hallucinations and the BLEU trap, achieving state-of-the-art performance in EEG-to-text decoding through rigorous signal-grounded evaluation.

Yuchen Wang, Haonan Wang, Yu Guo + 2 more | 2026-03-05 | 🤖 cs.AI

How does fine-tuning improve sensorimotor representations in large language models?

This study demonstrates that task-specific fine-tuning can effectively bridge the "embodiment gap" in Large Language Models by steering their internal representations toward grounded, sensorimotor patterns; these improvements generalize across languages and related dimensions but remain highly sensitive to the specific learning objective.

Minghua Wu, Javier Conde, Pedro Reviriego + 1 more | 2026-03-05 | 🤖 cs.AI

Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

This paper proposes CoIPO, a contrastive learning-based method that enhances the intrinsic robustness of large language models against prompt noise by minimizing the discrepancy between clean and noisy prompt outputs, demonstrating superior performance on the newly introduced NoisyPromptBench benchmark.

Xin Yang, Letian Li, Abudukelimu Wuerkaixi + 5 more | 2026-03-05 | 🤖 cs.AI

M-QUEST -- Meme Question-Understanding Evaluation on Semantics and Toxicity

This paper introduces M-QUEST, a semantic framework and benchmark comprising 609 question-answer pairs across ten interpretive dimensions, designed to evaluate and advance the ability of large language models to perform commonsense reasoning and toxicity detection in internet memes.

Stefano De Giorgis, Ting-Chih Chen, Filip Ilievski | 2026-03-05 | 🤖 cs.AI

The Influence of Iconicity in Transfer Learning for Sign Language Recognition

This study demonstrates that leveraging the iconicity of signs in transfer learning, from Chinese to Arabic and from Greek to Flemish, significantly improves sign language recognition performance, with a 7.02% gain for Arabic in particular, using MediaPipe-extracted spatial and temporal features processed through MLP and GRU architectures.

Keren Artiaga, Conor Lynch, Haithem Afli + 1 more | 2026-03-05 | 🤖 cs.AI

Retcon -- a Prompt-Based Technique for Precise Control of LLMs in Conversations

This paper introduces Retcon, a few-shot prompting technique that enables precise, turn-level control over Large Language Models in multi-turn conversations, demonstrating significantly better performance than zero-shot and traditional few-shot approaches.

David Kogan, Sam Nguyen, Masanori Suzuki + 1 more | 2026-03-05 | 💬 cs.CL

Quantum-Inspired Self-Attention in a Large Language Model

This paper introduces a classical quantum-inspired self-attention mechanism integrated into GPT-1, which significantly outperforms standard self-attention in character error rate, word error rate, and cross-entropy loss while incurring only a modest increase in inference time.

Nikita Kuznetsov, Niyaz Ismagilov, Ernesto Campos | 2026-03-05 | ⚛️ quant-ph

Automated Concept Discovery for LLM-as-a-Judge Preference Analysis

This paper introduces an automated concept discovery framework using sparse autoencoders to analyze LLM-as-a-judge preferences, revealing interpretable drivers of model behavior—such as biases toward concreteness, empathy, and formality—that go beyond predefined bias taxonomies and diverge from human evaluations.

James Wedgwood, Chhavi Yadav, Virginia Smith | 2026-03-05 | 🤖 cs.AI

From We to Me: Theory Informed Narrative Shift with Abductive Reasoning

This paper proposes a neurosymbolic approach that leverages social science theory and abductive reasoning to automatically extract rules for guiding Large Language Models in effectively shifting narratives between individualistic and collectivistic frameworks while preserving the original message's core meaning.

Jaikrishna Manojkumar Patil, Divyagna Bavikadi, Kaustuv Mukherji + 5 more | 2026-03-05 | 🤖 cs.AI

DIALEVAL: Automated Type-Theoretic Evaluation of LLM Instruction Following

The paper introduces DIALEVAL, a type-theoretic framework that employs dual LLM agents to automatically decompose instructions into typed predicates with differentiated satisfaction semantics, achieving significantly higher accuracy and stronger alignment with human judgment than existing baselines, particularly in complex and multi-turn conversational contexts.

Nardine Basta, Dali Kaafar | 2026-03-05 | 🤖 cs.AI

Can Large Language Models Derive New Knowledge? A Dynamic Benchmark for Biological Knowledge Discovery

To address the limitations of static benchmarks and data contamination in evaluating AI's capacity for knowledge discovery, this paper introduces DBench-Bio, a dynamic, fully automated, and monthly-updated benchmark covering 12 biomedical sub-domains that rigorously assesses the ability of Large Language Models to derive new biological knowledge.

Chaoqun Yang, Xinyu Lin, Shulin Li + 4 more | 2026-03-05 | 🤖 cs.AI
