cs.CL papers | Gist.Science

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian2026-03-09🤖 cs.AI

Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling

This paper proposes a Structured Style-Rewrite Framework that combines explicit disentanglement of stylistic dimensions with implicit Chain-of-Thought conditioning to enable small language models to achieve high-fidelity, consistent character role-playing without requiring explicit reasoning tokens during inference.

Chanhui Zhu2026-03-09🤖 cs.LG

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

This paper proposes an interpretable modeling approach that integrates person-level psychological traits with situational context features derived from social media data to predict dynamic mental well-being, demonstrating that theory-driven methods offer competitive performance and greater human-understandable insights compared to standard language model embeddings.

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd2026-03-09🤖 cs.AI

Imagine How To Change: Explicit Procedure Modeling for Change Captioning

The paper introduces ProCap, a novel framework that improves change captioning by reformulating static image comparison into dynamic procedure modeling through a two-stage design that learns latent change dynamics from sparse keyframes and utilizes learnable procedure queries to generate temporally coherent descriptions of how changes occur.

Jiayang Sun, Zixin Guo, Min Cao, Guibo Zhu, Jorma Laaksonen2026-03-09🤖 cs.AI

Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL

Track-SQL is a novel framework that enhances generative language models for multi-turn Text-to-SQL tasks by integrating dual-extractive modules for semantic schema and context tracking, achieving state-of-the-art performance on the SparC and CoSQL datasets.

Bingfeng Chen, Shaobin Shi, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao2026-03-09💬 cs.CL

MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing

This paper introduces MASFactory, a graph-centric framework that utilizes a human-in-the-loop "Vibe Graphing" approach to automatically compile natural language intents into executable multi-agent system workflows, thereby addressing challenges in manual implementation, component reuse, and context integration while demonstrating effectiveness across seven public benchmarks.

Yang Liu, Jinxuan Cai, Yishen Li, Qi Meng, Zedi Liu, Xin Li, Chen Qian, Chuan Shi, Cheng Yang2026-03-09🤖 cs.AI

ViewFusion: Structured Spatial Thinking Chains for Multi-View Reasoning

ViewFusion is a two-stage framework that enhances multi-view spatial reasoning in vision-language models by explicitly separating cross-view spatial pre-alignment from question-driven reasoning, achieving significant accuracy improvements on benchmarks like MMSI-Bench through synthetic supervision and reinforcement learning.

Xingjian Tao, Yiwei Wang, Yujun Cai, Yifan Song, Jing Tang2026-03-09💬 cs.CL

Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring

This paper evaluates the performance of four state-of-the-art open-weight Large Language Models on Austrian A-level German essay grading using rubric-based prompts, finding that while they can apply standardized criteria, their low agreement rates with human experts (maximum 40.6% on sub-dimensions and 32.8% on final grades) render them currently unsuitable for real-world automated scoring.

Jonas Kubesch, Lena Huber, Clemens Havas2026-03-09🤖 cs.AI

Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality

This study demonstrates that exposing Large Language Models to domain-specific texts via continued pre-training shapes distinct machine personalities that influence problem-solving, revealing a "Suppression Advantage" where reduced social traits enhance complex reasoning while identifying a bimodal competence peak between "Expressive Generalists" and "Suppressed Specialists."

Xi Wang, Mengdie Zhuang, Jiqun Liu2026-03-09🤖 cs.AI

DeepSight: Bridging Depth Maps and Language with a Depth-Driven Multimodal Model

DeepSight is the first dedicated depth-driven multimodal large language model that leverages single-channel depth maps, a novel depth instruction dataset, and a modified ViT encoder to significantly enhance 3D scene understanding and spatial reasoning capabilities.

Hao Yang, Hongbo Zhang, Yanyan Zhao, Bing Qin2026-03-09💬 cs.CL

Making Implicit Premises Explicit in Logical Understanding of Enthymemes

This paper proposes a neuro-symbolic pipeline that integrates large language models for generating implicit premises and translating natural language into logical formulas, alongside a SAT-based reasoner, to systematically decode enthymemes and verify logical entailment, demonstrating promising performance on existing datasets.

Xuyao Feng, Anthony Hunter2026-03-09🤖 cs.AI

Diffusion Language Models Are Natively Length-Aware

This paper proposes a zero-shot mechanism that leverages latent prompt representations to dynamically crop the fixed context window of Diffusion Language Models before generation, significantly reducing computational costs while maintaining or improving performance across diverse tasks.

Vittorio Rossi, Giacomo Cirò, Davide Beltrame, Luca Gandolfi, Paul Röttger, Dirk Hovy2026-03-09🤖 cs.LG

A Causal Graph Approach to Oppositional Narrative Analysis

This paper proposes a graph-based framework that models narratives as entity-interaction graphs and employs causal estimation to distill minimal causal subgraphs, thereby achieving superior performance in classifying oppositional narratives while mitigating the biases inherent in traditional black-box models.

Diego Revilla, Martin Fernandez-de-Retana, Lingfeng Chen, Aritz Bilbao-Jayo, Miguel Fernandez-de-Retana2026-03-09🤖 cs.AI

Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR

This paper introduces RAPTOR, a controlled study demonstrating that multilingual HuBERT pre-training, rather than model scale, is the primary driver of cross-domain robustness and reliable calibration in compact audio deepfake detection systems.

Ajinkya Kulkarni, Sandipana Dowerah, Atharva Kulkarni, Tanel Alumäe, Mathew Magimai Doss2026-03-09🤖 cs.AI

Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

This paper proposes a two-stage framework that first trains a contrastive encoder on labeled invented alphabets and then uses teacher-student distillation to learn unsupervised, deformation-invariant embeddings for historically attested scripts, effectively bridging supervised discriminative learning with unsupervised discovery of latent cross-script similarities without requiring ground-truth evolutionary relationships.

Claire Roman, Philippe Meyer2026-03-09🤖 cs.AI

CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation

This paper introduces CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that leverages patient context, guideline-based severity weighting, and a comprehensive error taxonomy to achieve superior alignment with radiologist judgments compared to existing metrics.

Mohammed Baharoon, Thibault Heintz, Siavash Raissi, Mahmoud Alabbad, Mona Alhammad, Hassan AlOmaish, Sung Eun Kim, Oishi Banerjee, Pranav Rajpurkar2026-03-09🤖 cs.AI

MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue

The paper proposes MAPO, a critic-free reinforcement learning algorithm that combines dense process feedback from a judge model with a mixed advantage estimator to enable stable, scalable, and high-performing long-horizon multi-turn dialogue optimization in subjective tasks like emotional support.

Naifan Zhang, Ruihan Sun, Jinwei Su, Hengjie Yang, Zhengyuan Pan, Zhaohan Chen, Xiaofan Zhang2026-03-09🤖 cs.AI

Wisdom of the AI Crowd (AI-CROWD) for Ground Truth Approximation in Content Analysis: A Research Protocol & Validation Using Eleven Large Language Models

This paper introduces the AI-CROWD protocol, which approximates ground truth for large-scale content analysis by aggregating the consensus outputs of an ensemble of large language models to overcome the cost and consistency limitations of human coding.

Luis de-Marcos, Manuel Goyanes, Adrián Domínguez-Díaz2026-03-09💬 cs.CL

LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation

This paper introduces LIT-RAGBench, a comprehensive benchmark dataset and evaluation framework designed to systematically assess Large Language Models' capabilities in Retrieval-Augmented Generation across five critical categories—Integration, Reasoning, Logic, Table, and Abstention—using a mix of human-constructed Japanese questions and curated English translations to guide model selection and development for practical RAG deployments.

Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi, Masaki Otsuki2026-03-09💬 cs.CL

FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

FlashPrefill is a novel framework that achieves ultra-fast long-context prefilling by combining instantaneous block-searching for dynamic sparse patterns with a thresholding mechanism to eliminate long-tail attention scores, delivering up to a 27.78x speedup on 256K sequences while maintaining efficiency on shorter contexts.

Qihang Fan, Huaibo Huang, Zhiying Wu, Juqiu Wang, Bingning Wang, Ran He2026-03-09🤖 cs.AI

← Previous Next →