CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research?

This paper introduces CyberThreat-Eval, an expert-annotated benchmark derived from real-world Cyber Threat Intelligence workflows that addresses the limitations of existing evaluations by assessing Large Language Models across the full triage-to-reporting pipeline using analyst-centric metrics, revealing significant gaps in current models' ability to handle nuanced, actionable security insights.

Xiangsen Chen, Xuan Feng, Shuo Chen, Matthieu Maitre, Sudipto Rakshit, Diana Duvieilh, Ashley Picone, Nan Tang · Wed, 11 Ma · cs.CL

TA-Mem: Tool-Augmented Autonomous Memory Retrieval for LLM in Long-Term Conversational QA

This paper introduces TA-Mem, a novel framework that enhances long-term conversational QA by employing tool-augmented autonomous agents to adaptively extract structured memory and dynamically select retrieval strategies, thereby overcoming the limitations of static similarity-based methods and achieving superior performance on the LoCoMo dataset.
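A minimal sketch of the adaptive-retrieval idea, assuming hypothetical tools and a toy routing rule rather than TA-Mem's actual agent policy:

```python
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    turn: int       # position in the conversation history
    text: str       # raw utterance
    keywords: set   # structured facts extracted when the memory was written

def keyword_retrieve(memory, terms, k=3):
    """Exact-match lookup over structured keywords (names, dates, entities)."""
    scored = sorted(((len(terms & m.keywords), m) for m in memory),
                    key=lambda x: -x[0])
    return [m for s, m in scored[:k] if s > 0]

def recency_retrieve(memory, k=3):
    """Most recent turns, for questions about the latest exchanges."""
    return sorted(memory, key=lambda m: -m.turn)[:k]

def similarity_retrieve(memory, query, k=3):
    """Stand-in for embedding search; token overlap keeps the sketch runnable."""
    q = set(query.lower().split())
    scored = sorted(((len(q & set(m.text.lower().split())), m) for m in memory),
                    key=lambda x: -x[0])
    return [m for _, m in scored[:k]]

def retrieve(memory, query):
    """Toy agent policy: route by query shape, fall back to similarity search."""
    if any(w in query.lower() for w in ("just", "last", "recently")):
        return recency_retrieve(memory)
    terms = {t for t in query.lower().split() if len(t) > 3}
    return keyword_retrieve(memory, terms) or similarity_retrieve(memory, query)
```

The contrast with static similarity-based methods is the routing: the strategy is picked per query rather than applying one retriever unconditionally.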

Mengwei Yuan, Jianan Liu, Jing Yang, Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang · Wed, 11 Ma · cs.CL

How Contrastive Decoding Enhances Large Audio Language Models?

This paper systematically evaluates four Contrastive Decoding strategies across diverse Large Audio Language Models, identifying Audio-Aware and Audio Contrastive Decoding as most effective while introducing a Transition Matrix framework to demonstrate that these methods successfully rectify specific error patterns like false audio absence claims but fail to correct flawed reasoning or confident misassertions.
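For readers unfamiliar with the family of methods being compared, here is one generic contrastive-decoding step in the audio-aware style, sketched with NumPy; the exact penalty form and plausibility constraint vary by method and are assumptions here:

```python
import numpy as np

def contrastive_logits(logits_with_audio, logits_no_audio, alpha=1.0, tau=0.1):
    """One generic contrastive-decoding step (a sketch, not the paper's recipe).

    Tokens that are likely even WITHOUT the audio get penalized, steering the
    next token toward audio-grounded evidence. `alpha` scales the penalty and
    `tau` is a plausibility cutoff relative to the best audio-conditioned token.
    """
    log_p_audio = logits_with_audio - np.logaddexp.reduce(logits_with_audio)
    log_p_blank = logits_no_audio - np.logaddexp.reduce(logits_no_audio)
    scores = (1 + alpha) * log_p_audio - alpha * log_p_blank
    # Plausibility constraint: keep only tokens reasonably likely given audio.
    mask = log_p_audio >= np.log(tau) + log_p_audio.max()
    return np.where(mask, scores, -np.inf)

# Greedy step: argmax (or sample) over the adjusted scores.
next_token = int(np.argmax(contrastive_logits(np.array([2.0, 0.5, -1.0]),
                                              np.array([2.2, -1.0, -1.5]))))
```

This mechanism explains the paper's finding: it can suppress tokens the model would emit with no audio at all (false absence claims), but it cannot repair reasoning that is wrong even when audio-conditioned.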

Tzu-Quan Lin, Wei-Ping Huang, Yi-Cheng Lin, Hung-yi Lee · Wed, 11 Ma · cs.CL

Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

This paper reveals that enabling reasoning in large language models significantly enhances the recall of simple factual knowledge through two mechanisms—computational buffering and factual priming—while also demonstrating that hallucinating intermediate facts during this process increases final answer errors, a finding that can be leveraged to improve model accuracy by prioritizing hallucination-free reasoning trajectories.
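The trajectory-selection takeaway could look roughly like the following best-of-n sketch, where `verify_fact` is a hypothetical checker (e.g., a lookup against a trusted source) standing in for whatever hallucination detector is available:

```python
from collections import Counter

def answer_from_trajectories(trajectories, verify_fact):
    """Prefer answers whose reasoning has no hallucinated intermediate facts.

    `trajectories` is a non-empty list of (intermediate_facts, final_answer)
    pairs sampled from the model; `verify_fact` is a hypothetical checker
    returning True for facts it can confirm.
    """
    clean = [ans for facts, ans in trajectories
             if all(verify_fact(f) for f in facts)]
    pool = clean or [ans for _, ans in trajectories]  # fall back if none pass
    return Counter(pool).most_common(1)[0][0]         # majority vote
```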

Zorik Gekhman, Roee Aharoni, Eran Ofek, Mor Geva, Roi Reichart, Jonathan Herzig · Wed, 11 Ma · cs.CL

Do What I Say: A Spoken Prompt Dataset for Instruction-Following

This paper introduces DoWhatISay (DOWIS), a multilingual dataset of human-recorded spoken and written prompts designed to evaluate Speech Large Language Models under realistic spoken instruction conditions, revealing that text prompts generally outperform spoken ones except in tasks requiring speech output.

Maike Züfle, Sara Papi, Fabian Retkowski, Szymon Mazurek, Marek Kasztelnik, Alexander Waibel, Luisa Bentivogli, Jan Niehues · Wed, 11 Ma · cs.CL

Chow-Liu Ordering for Long-Context Reasoning in Chain-of-Agents

This paper proposes using Chow-Liu trees to optimize chunk ordering in Chain-of-Agents frameworks, demonstrating that a breadth-first traversal of the learned dependency structure significantly reduces information loss and improves reasoning accuracy on long-context benchmarks compared to standard ordering methods.
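A sketch of the ordering step: pairwise cosine similarity of chunk embeddings stands in for the mutual-information estimates a true Chow-Liu tree is built from, the maximum spanning tree is grown with Prim's algorithm, and a breadth-first read-out gives the chunk order:

```python
import numpy as np
from collections import deque

def chow_liu_order(chunk_embeddings):
    """Order chunks by BFS over a maximum spanning tree of pairwise affinity.

    Sketch only: cosine similarity approximates the pairwise mutual
    information of the Chow-Liu construction, and chunk 0 is taken as root.
    """
    X = np.asarray(chunk_embeddings, dtype=float)
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    sim = X @ X.T                     # pairwise affinity matrix
    n = len(X)

    # Prim's algorithm for the MAXIMUM spanning tree over `sim`.
    parent = {0: None}
    best = {j: (sim[0, j], 0) for j in range(1, n)}
    while best:
        j = max(best, key=lambda k: best[k][0])
        parent[j] = best.pop(j)[1]
        for k in best:
            if sim[j, k] > best[k][0]:
                best[k] = (sim[j, k], j)

    # Breadth-first traversal of the tree yields the chunk ordering.
    children = {i: [] for i in range(n)}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)
    order, queue = [], deque([0])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(sorted(children[node]))
    return order

print(chow_liu_order([[1, 0], [0.9, 0.1], [0, 1]]))  # e.g. [0, 1, 2]
```

The intuition is that BFS keeps strongly dependent chunks close together in the agent chain, so less information is lost between hand-offs.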

Naman Gupta, Vaibhav Singh, Arun Iyer, Kirankumar Shiragur, Pratham Grover, Ramakrishna B. Bairi, Ritabrata Maiti, Sankarshan Damle, Shachee Mishra Gupta, Rishikesh Maurya, Vageesh D. C · Wed, 11 Ma · cs.CL

One-Eval: An Agentic System for Automated and Traceable LLM Evaluation

One-Eval is an agentic system that automates the end-to-end evaluation of large language models by converting natural-language requests into traceable, customizable workflows through integrated components for benchmark planning, dataset resolution, and decision-oriented reporting, thereby reducing manual effort and enhancing reproducibility.
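In spirit, such a system reduces to a planned, logged pipeline; the sketch below uses hypothetical callables for the planning, dataset-resolution, and execution components, and records a trace at each step for reproducibility:

```python
import json
import time

def run_eval_workflow(request, plan_benchmarks, resolve_dataset, run_model):
    """Sketch of a traceable evaluation workflow (components are stand-ins).

    `plan_benchmarks`, `resolve_dataset`, and `run_model` are hypothetical
    callables; every step is appended to a trace so the final report can be
    audited and reproduced.
    """
    trace = [{"step": "request", "detail": request, "t": time.time()}]
    benchmarks = plan_benchmarks(request)       # NL request -> benchmark list
    trace.append({"step": "plan", "detail": benchmarks, "t": time.time()})
    results = {}
    for b in benchmarks:
        data = resolve_dataset(b)
        trace.append({"step": "resolve", "detail": b, "t": time.time()})
        results[b] = run_model(data)
        trace.append({"step": "run", "detail": {b: results[b]}, "t": time.time()})
    # Decision-oriented report: results plus the full provenance trace.
    return json.dumps({"results": results, "trace": trace}, default=str, indent=2)
```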

Chengyu Shen, Yanheng Hou, Minghui Pan, Runming He, Zhen Hao Wong, Meiyi Qiang, Zhou Liu, Hao Liang, Peichao Lai, Zeang Sheng, Wentao Zhang · Wed, 11 Ma · cs.CL

Evaluation of LLMs in retrieving food and nutritional context for RAG systems

This paper evaluates four Large Language Models within a Retrieval-Augmented Generation system for food and nutrition data, finding that while they effectively translate natural language queries into structured metadata filters to reduce manual effort, their reliability diminishes when handling complex queries involving constraints that exceed the representational scope of the underlying metadata.
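The query-to-filter step might be sketched as follows, with an illustrative schema and a hypothetical `llm` callable; the fallback to an empty filter reflects the reliability issue the paper reports on complex queries:

```python
import json

# Illustrative metadata fields; the paper's actual schema is an assumption here.
FILTER_SCHEMA = {
    "food_group": "string or null, e.g. 'legumes'",
    "max_kcal_per_100g": "number or null",
    "min_protein_g_per_100g": "number or null",
}

def query_to_filter(query, llm):
    """Translate a natural-language food query into a structured filter.

    `llm` is a hypothetical text-completion callable returning a string.
    """
    prompt = ("Translate the user's food query into a JSON object matching "
              "this schema; use null for unconstrained fields.\nSchema: "
              + json.dumps(FILTER_SCHEMA) + "\nQuery: " + query + "\nJSON:")
    try:
        return json.loads(llm(prompt))
    except json.JSONDecodeError:
        return {}  # unreliable output on complex queries: retrieve unfiltered

def apply_filter(records, f):
    """Apply the structured filter to (illustrative) food-composition records."""
    kept = []
    for r in records:
        if f.get("food_group") and r["food_group"] != f["food_group"]:
            continue
        if f.get("max_kcal_per_100g") and r["kcal"] > f["max_kcal_per_100g"]:
            continue
        if f.get("min_protein_g_per_100g") and r["protein_g"] < f["min_protein_g_per_100g"]:
            continue
        kept.append(r)
    return kept
```

Constraints that fall outside the schema (the paper's failure mode) simply cannot be expressed as a filter, no matter how well the translation step works.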

Maks Požarnik Vavken, Matevž Ogrinc, Tome Eftimov, Barbara Koroušić Seljak · Wed, 11 Ma · cs.CL

Fusing Semantic, Lexical, and Domain Perspectives for Recipe Similarity Estimation

This paper proposes a multi-perspective framework for estimating recipe similarity that integrates semantic, lexical, and domain-specific nutritional signals; validation with domain experts identifies the factors that most influence human similarity judgments, supporting applications in personalized nutrition and automated recipe generation.
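A toy fusion of the three perspectives, with illustrative weights and features (the paper derives the influential factors from expert validation, not from these defaults):

```python
import math

def recipe_similarity(a, b, weights=(0.4, 0.3, 0.3)):
    """Fuse semantic, lexical, and nutritional views into one score in [0, 1].

    Sketch only: each recipe is assumed to be a dict with 'embedding' (a
    text-embedding vector), 'ingredients' (a set of names), and 'macros'
    (per-serving values sharing the same keys across recipes).
    """
    # Semantic: cosine similarity of recipe-text embeddings, mapped to [0, 1].
    dot = sum(x * y for x, y in zip(a["embedding"], b["embedding"]))
    norm = (math.sqrt(sum(x * x for x in a["embedding"]))
            * math.sqrt(sum(y * y for y in b["embedding"])))
    semantic = (dot / norm + 1) / 2 if norm else 0.0

    # Lexical: Jaccard overlap of ingredient sets.
    union = a["ingredients"] | b["ingredients"]
    lexical = len(a["ingredients"] & b["ingredients"]) / len(union) if union else 0.0

    # Domain: 1 minus the mean relative gap between nutritional values.
    gaps = [abs(a["macros"][k] - b["macros"][k])
            / max(a["macros"][k], b["macros"][k], 1) for k in a["macros"]]
    nutritional = 1 - sum(gaps) / len(gaps)

    w_sem, w_lex, w_dom = weights
    return w_sem * semantic + w_lex * lexical + w_dom * nutritional
```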

Denica Kjorvezir, Danilo Najkov, Eva Valenčič, Erika Jesenko, Barbara Koroušić Seljak, Tome Eftimov, Riste Stojanov · Wed, 11 Ma · cs.CL