cs.CL papers | Gist.Science

Supporting Workflow Reproducibility by Linking Bioinformatics Tools across Papers and Executable Code

This paper introduces CoPaLink, an automated approach that enhances bioinformatics workflow reproducibility by integrating Named Entity Recognition and entity linking to connect tool mentions in scientific papers with their corresponding implementations in executable workflow code.

Clémence Sebe, Olivier Ferret, Aurélie Névéol, Mahdi Esmailoghli, Ulf Leser, Sarah Cohen-Boulakia | 2026-03-10 | cs.CL

The Conundrum of Trustworthy Research on Attacking Personally Identifiable Information Removal Techniques

This paper argues that current evaluations of attacks on PII removal techniques are flawed due to unmitigated data leakage and contamination, creating a paradox where trustworthy research requires access to private data that is inherently restricted from public scrutiny.

Sebastian Ochs, Ivan Habernal | 2026-03-10 | cs.CL

DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining

DualTurn is a dual-channel generative speech model that learns natural turn-taking dynamics through unsupervised pretraining on conversational audio and fine-tuning to predict agent actions, outperforming existing methods in both action prediction accuracy and turn-boundary anticipation while enabling tool-calling capabilities.

Shangeth Rajaa | 2026-03-10 | cs.CL

Quantifying Cross-Lingual Transfer in Paralinguistic Speech Tasks

This paper introduces the Cross-Lingual Transfer Matrix (CLTM) to systematically quantify language-dependent performance variations in paralinguistic tasks like gender identification and speaker verification, revealing that despite their acoustic nature, these tasks exhibit distinct cross-lingual transfer patterns when using multilingual HuBERT-based encoders.

Pol Buitrago, Oriol Pareras, Federico Costa, Javier Hernando | 2026-03-10 | cs.CL

Fibration Policy Optimization

This paper introduces Fibration Policy Optimization (FiberPO), a unified framework that bridges trust-region theory and compositional algebraic structures to enable principled, multi-scale stability control in large language model training through the novel Aggregational Policy Censoring Objective and Fiber Bundle Gating mechanism.

Chang Li, Tshihao Tsu, Yaren Zhang, Chao Xue, Xiaodong He | 2026-03-10 | cs.LG

Sensivity of LLMs' Explanations to the Training Randomness: Context, Class & Task Dependencies

This paper investigates how training randomness affects the stability of Transformer model explanations, demonstrating that while syntactic context, target classes, and task types all significantly influence this sensitivity, the impact is smallest for context, moderate for classes, and largest for tasks.

Romain Loncour, Jérémie Bogaert, François-Xavier Standaert | 2026-03-10 | cs.CL

Bootstrapping Audiovisual Speech Recognition in Zero-AV-Resource Scenarios with Synthetic Visual Data

This paper proposes a zero-AV-resource framework for audiovisual speech recognition that generates synthetic talking-head videos by lip-syncing static facial images with real audio, successfully enabling high-performance model training for under-resourced languages like Catalan without the need for labeled video corpora.

Pol Buitrago, Pol Gàlvez, Oriol Pareras, Javier Hernando | 2026-03-10 | cs.CL

Not All Queries Need Deep Thought: CoFiCot for Adaptive Coarse-to-fine Stateful Refinement

The paper proposes CoFiCot, an adaptive coarse-to-fine framework that dynamically allocates test-time computation by triaging queries based on multi-metric difficulty assessment and applying stateful, context-aware refinement to balance efficiency and reasoning accuracy.

Dongxu Zhang, Hongqiang Lin, Yiding Sun, Pengyu Wang, Qirui Wang, Ning Yang, Jihua Zhu | 2026-03-10 | cs.CL

NCL-UoR at SemEval-2026 Task 5: Embedding-Based Methods, Fine-Tuning, and LLMs for Word Sense Plausibility Rating

This paper presents the NCL-UoR system for SemEval-2026 Task 5, demonstrating that a structured prompting strategy with explicit decision rules for Large Language Models outperforms both embedding-based methods and fine-tuned transformers in rating word sense plausibility.

Tong Wu, Thanet Markchom, Huizhi Liang | 2026-03-10 | cs.CL

How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms

This study utilizes a massive 172-billion-token evaluation across diverse models, context lengths, and hardware to reveal that while model selection is the primary determinant of accuracy, hallucination rates in document Q&A rise significantly with context length and vary non-linearly with temperature, highlighting that grounding ability and fabrication resistance are distinct capabilities.

JV Roig | 2026-03-10 | cs.CL

AdaCultureSafe: Adaptive Cultural Safety Grounded by Cultural Knowledge in Large Language Models

The paper proposes AdaCultureSafe, a framework that addresses the lack of correlation between cultural safety and knowledge in Large Language Models by constructing a novel dataset of culturally grounded queries and introducing a knowledge-integrated method to significantly enhance adaptive cultural safety.

Hankun Kang, Di Lin, Zhirong Liao, Pengfei Bai, Xinyi Zeng, Jiawei Jiang, Yuanyuan Zhu, Tieyun Qian | 2026-03-10 | cs.CL

Evaluating LLM-Based Grant Proposal Review via Structured Perturbations

This paper evaluates LLM-based grant proposal reviews using structured perturbations on six quality axes, finding that a section-by-section analysis approach outperforms other architectures but that current models still struggle with clarity detection and holistic assessment, suggesting they are best suited as supplementary tools rather than replacements for human reviewers.

William Thorne, Joseph James, Yang Wang, Chenghua Lin, Diana Maynard | 2026-03-10 | cs.CL

Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization

This paper introduces SBARThez, a novel framework that leverages multimodal and multilingual sentence embeddings alongside a Named Entity Injection mechanism to enhance the factual consistency and cross-lingual capabilities of abstractive summarization for both text and speech inputs.

Chaimae Chellaf, Salima Mdhaffar, Yannick Estève, Stéphane Huet | 2026-03-10 | cs.CL

LAMUS: A Large-Scale Corpus for Legal Argument Mining from U.S. Caselaw using LLMs

This paper introduces LAMUS, a large-scale, high-quality sentence-level legal argument mining corpus for U.S. caselaw constructed via an LLM-driven pipeline with human refinement, which demonstrates that chain-of-thought prompting and LLM-assisted verification significantly enhance annotation quality and model performance for future legal NLP research.

Serene Wang, Lavanya Pobbathi, Haihua Chen | 2026-03-10 | cs.CL

Learning Multiple Utterance-Level Attribute Representations with a Unified Speech Encoder

This paper proposes a unified post-training framework that extends speech foundation models to generate multiple arbitrary utterance-level attribute representations, demonstrating its effectiveness through the joint learning of semantic and speaker embeddings for multilingual retrieval and speaker recognition tasks.

Maryem Bouziane, Salima Mdhaffar, Yannick Estève | 2026-03-10 | cs.CL

SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

This paper introduces SlowBA, a novel backdoor attack against VLM-based GUI agents that utilizes a two-stage reward-level injection strategy and realistic pop-up triggers to induce excessive reasoning chains, thereby significantly increasing response latency while maintaining task accuracy and evading existing defenses.

Junxian Li, Tu Lan, Haozhen Tan, Yan Meng, Haojin Zhu | 2026-03-10 | cs.CL

SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation

SPD-RAG is a hierarchical multi-agent framework that improves scalability and answer quality for complex cross-document queries by assigning dedicated agents to process individual documents and synthesizing their outputs through a token-bounded coordinator, achieving superior performance on the LOONG benchmark with significantly reduced API costs compared to standard RAG and full-context baselines.

Yagiz Can Akay, Muhammed Yusuf Kartal, Esra Alparslan, Faruk Ortakoyluoglu, Arda Akpinar | 2026-03-10 | cs.CL

Rethinking Attention Output Projection: Structured Hadamard Transforms for Efficient Transformers

This paper proposes replacing the dense output projection in multi-head attention with a parameter-free Walsh-Hadamard Transform and lightweight affine rescaling, achieving significant reductions in parameters, memory, and inference latency while maintaining or improving model performance across various benchmarks.

Shubham Aggarwal, Lokendra Kumar | 2026-03-10 | cs.LG
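
As a rough illustration of the idea summarized in this entry, the sketch below applies a fixed, parameter-free Walsh-Hadamard transform to a concatenated head vector and then a per-channel affine rescaling. It is not the authors' implementation: the function names `fwht` and `hadamard_output_projection`, the orthonormal scaling, and the choice of a per-channel `gain` and `bias` as the "lightweight affine rescaling" are all assumptions made for this example.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform of a list.

    The length must be a power of two. Runs in O(n log n) and has no
    learned parameters.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in values]
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h apart within each block of 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform its own inverse
    return [v * scale for v in out]


def hadamard_output_projection(concat_heads, gain, bias):
    """Hypothetical stand-in for a dense output projection.

    Mixes channels with the fixed transform, then applies an assumed
    per-channel affine rescaling (gain * x + bias).
    """
    mixed = fwht(concat_heads)
    return [g * m + b for g, m, b in zip(gain, mixed, bias)]


if __name__ == "__main__":
    x = [1.0, 0.0, 0.0, 0.0]
    print(fwht(x))
    print(hadamard_output_projection(x, [2.0] * 4, [1.0] * 4))
```

Under these assumptions, a width-d layer would carry 2d learned values (gain and bias) in place of the d × d weights of a dense projection, which is the kind of parameter reduction the summary describes.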

Do Language Models Know Theo Has a Wife? Investigating the Proviso Problem

This paper introduces a diagnostic dataset and evaluation framework to investigate how language models handle the proviso problem in pragmatics, revealing that while models align with human judgments, they rely on shallow pattern matching rather than genuine semantic or pragmatic reasoning.

Tara Azin, Daniel Dumitrescu, Diana Inkpen, Raj Singh | 2026-03-10 | cs.CL

Computational modeling of early language learning from acoustic speech and audiovisual input without linguistic priors

This chapter reviews recent computational models demonstrating that self-supervised and visually grounded learning principles can effectively explain early language acquisition from acoustic and audiovisual speech without relying on strong linguistic priors.

Okko Räsänen | 2026-03-10 | cs.CL
