cs.IR papers | Gist.Science

Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

This paper empirically demonstrates that coverage-based retrieval metrics serve as reliable early indicators of information coverage in RAG-generated responses, particularly when retrieval objectives align with generation goals, across diverse text and multimodal benchmarks.

Saron Samuel, Alexander Martin, Eugene Yang, Andrew Yates, Dawn Lawrie, Ian Soboroff, Laura Dietz, Benjamin Van Durme | Wed, 11 Ma | 🤖 cs.AI

PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories -- A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution

PRECEPT is a unified test-time adaptation framework that enhances LLM agent resilience by integrating deterministic exact-match rule retrieval, conflict-aware memory with Bayesian reliability, and the Pareto-guided COMPASS prompt-evolution loop to achieve superior compositional generalization, continuous learning, and robustness against knowledge drift and adversarial inputs.

Arash Shahmansoori | Wed, 11 Ma | 🤖 cs.AI

DataFactory: Collaborative Multi-Agent Framework for Advanced Table Question Answering

This paper introduces DataFactory, a collaborative multi-agent framework that overcomes the context, hallucination, and reasoning limitations of existing TableQA systems by orchestrating specialized agents for structured and relational reasoning, thereby achieving significant accuracy improvements across multiple benchmarks.

Tong Wang, Chi Jin, Yongkang Chen, Huan Deng, Xiaohui Kuang, Gang Zhao | Wed, 11 Ma | 🤖 cs.AI

A Consensus-Driven Multi-LLM Pipeline for Missing-Person Investigations

This paper introduces Guardian, a consensus-driven, multi-LLM pipeline enhanced by QLoRA fine-tuning that coordinates specialized models and a consensus engine to perform auditable, structured information extraction for critical missing-person investigations while avoiding unconstrained decision-making.

Joshua Castillo, Ravi Mukkamala | Wed, 11 Ma | 🤖 cs.AI

Interpretable Markov-Based Spatiotemporal Risk Surfaces for Missing-Child Search Planning with Reinforcement Learning and LLM-Based Quality Assurance

The paper presents "Guardian," an interpretable, three-layer decision-support system that combines Markov chains, reinforcement learning, and LLM-based validation to generate dynamic, probabilistic search plans for missing-child investigations within the critical first 72 hours.

Joshua Castillo, Ravi Mukkamala | Wed, 11 Ma | 🤖 cs.AI

ERASE -- A Real-World Aligned Benchmark for Unlearning in Recommender Systems

This paper introduces ERASE, a large-scale benchmark designed to align machine unlearning evaluation with real-world recommender system constraints by covering diverse tasks, datasets, and algorithms to systematically analyze the effectiveness and robustness of current unlearning methods.

Pierre Lubitzsch, Maarten de Rijke, Sebastian Schelter | Tue, 10 Ma | 💻 cs

UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking

This paper identifies the critical limitation of current LLM-based agents in accessing unindexed information, introduces the first dedicated UIS-QA benchmark to quantify this challenge, and proposes UIS-Digger, a multi-agent framework that significantly outperforms state-of-the-art models by effectively combining dual-mode browsing and file parsing to retrieve vital unindexed data.

Chang Liu, Chuqiao Kuang, Tianyi Zhuang, Yuxin Cheng, Huichi Zhou, Xiaoguang Li, Lifeng Shang | Tue, 10 Ma | 💻 cs

Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval

The study finds that Large Language Model-based relevance judgment systems do not outperform embedding-based retrieval on the standard TREC-DL 2019 benchmark, which it attributes to the short-sightedness inherent in human annotations. It nevertheless argues that these models have the theoretical capability to surpass embedding methods by understanding relevance through reasoning.

Matei Benescu, Ivo Pascal de Jong | Tue, 10 Ma | 💻 cs

Structure-Preserving Graph Contrastive Learning for Mathematical Information Retrieval

This paper proposes Variable Substitution, a domain-specific graph augmentation technique that preserves the structural and semantic integrity of mathematical formulas, significantly enhancing the performance of graph contrastive learning models for mathematical information retrieval compared to generic strategies.

Chun-Hsi Ku, Hung-Hsuan Chen | Tue, 10 Ma | 💻 cs

Verifiable Reasoning for LLM-based Generative Recommendation

This paper proposes VRec, a novel "reason-verify-recommend" paradigm that interleaves reasoning with multi-dimensional verification to mitigate reasoning degradation and enhance the effectiveness and scalability of LLM-based generative recommendation.

Xinyu Lin, Hanqing Zeng, Hanchao Yu, Yinglong Xia, Jiang Zhang, Aashu Singh, Fei Liu, Wenjie Wang, Fuli Feng, Tat-Seng Chua, Qifan Wang | Tue, 10 Ma | 💻 cs

Deep Research for Recommender Systems

This paper introduces RecPilot, a multi-agent framework that shifts the recommender system paradigm from passive item filtering to proactive, user-centric assistance by generating comprehensive, synthesized reports that significantly reduce user effort in item evaluation.

Kesha Ou, Chenghao Wu, Xiaolei Wang, Bowen Zheng, Wayne Xin Zhao, Weitao Li, Long Zhang, Sheng Chen, Ji-Rong Wen | Tue, 10 Ma | 💻 cs

GP-Tree: An in-memory spatial index combining adaptive grid cells with a prefix tree for efficient spatial querying

The paper proposes GP-Tree, a novel in-memory spatial index that combines adaptive grid cells with a prefix tree structure to replace coarse minimum bounding rectangles with fine-grained approximations, thereby significantly improving filtering accuracy and query performance for complex spatial objects compared to traditional indexes.

Xiangyang Yang, Xuefeng Guan, Lanxue Dang, Yi Xie, Qingyang Xu, Huayi Wu, Jiayao Wang | Tue, 10 Ma | 💻 cs

SeDa: A Unified System for Dataset Discovery and Multi-Entity Augmented Semantic Exploration

SeDa is a unified framework that aggregates over 7.6 million datasets from more than 200 platforms to enable trustworthy, semantically enriched, and multi-entity augmented exploration through standardized metadata, a dynamic tag graph, and provenance assurance.

Kan Ling, Zhen Qin, Yichi Zhu, Hengrun Zhang, Huiqun Yu, Guisheng Fan | Tue, 10 Ma | 💻 cs

Do Deployment Constraints Make LLMs Hallucinate Citations? An Empirical Study across Four Models and Five Prompting Regimes

This empirical study demonstrates that deployment-motivated prompting constraints significantly exacerbate citation hallucinations across four large language models, with no model achieving a citation existence rate above 47.5% and a substantial portion of unverifiable outputs being fabricated, thereby underscoring the critical need for post-hoc verification in academic and software engineering contexts.

Chen Zhao, Yuan Tang, Yitian Qian | Tue, 10 Ma | 💻 cs

AutoDataset: A Lightweight System for Continuous Dataset Discovery and Search

AutoDataset is a lightweight, automated system that continuously monitors arXiv to detect, extract, and index newly released datasets from research papers, enabling real-time discovery and improving search efficiency by up to 80%.

Junzhe Yang, Xinghao Chen, Yunuo Liu, Zhijing Sun, Wenjin Guo, Xiaoyu Shen | Tue, 10 Ma | 💻 cs

Detecting Cryptographically Relevant Software Packages with Collaborative LLMs

This paper proposes and evaluates an on-premises collaborative framework utilizing multiple large language models with majority voting to efficiently and privately identify cryptographically relevant software packages, thereby addressing the challenges of manual inventory and static analysis limitations in the transition to post-quantum cryptography.

Eduard Hirsch, Kristina Raab, Tobias J. Bauer, Daniel Loebenberger | Tue, 10 Ma | 💻 cs

Retrieving Minimal and Sufficient Reasoning Subgraphs with Graph Foundation Models for Path-aware GraphRAG

This paper introduces GFM-Retriever, a novel GraphRAG framework that leverages a pre-trained Graph Foundation Model for cross-domain subgraph retrieval and an Information Bottleneck-based selector to extract minimal, sufficient reasoning paths, thereby achieving state-of-the-art performance in multi-hop question answering without relying on domain-specific heuristics.

Haonan Yuan, Qingyun Sun, Junhua Shi, Mingjun Liu, Jiaqi Yuan, Ziwei Zhang, Xingcheng Fu, Jianxin Li | Tue, 10 Ma | 💻 cs

Efficient Personalized Reranking with Semi-Autoregressive Generation and Online Knowledge Distillation

This paper proposes the Personalized Semi-Autoregressive with online knowledge Distillation (PSAD) framework, which utilizes a semi-autoregressive teacher model and a User Profile Network to balance generation quality with low-latency inference while enhancing user-item interactions, thereby outperforming state-of-the-art baselines in both ranking performance and efficiency.

Kai Cheng, Hao Wang, Wei Guo, Weiwen Liu, Yong Liu, Yawen Li, Enhong Chen | Tue, 10 Ma | 💻 cs

Multi-TAP: Multi-criteria Target Adaptive Persona Modeling for Cross-Domain Recommendation

The paper proposes Multi-TAP, a multi-criteria target-adaptive persona framework that addresses data sparsity and intra-domain heterogeneity in cross-domain recommendation by explicitly modeling semantic personas and selectively transferring relevant source-domain signals, thereby outperforming state-of-the-art methods on real-world datasets.

Daehee Kang, Yeon-Chang Lee | Tue, 10 Ma | 💻 cs

Leveraging Large Language Models for Automated Scalable Development of Open Scientific Databases

This paper introduces a scalable, domain-agnostic web-based framework that leverages Large Language Models to automate the collection, filtering, and construction of open scientific databases, achieving 90% overlap with expert-curated datasets while significantly reducing manual workload.

Nikita Gautam, Doina Caragea, Ignacio Ciampitti, Federico Gomez | Tue, 10 Ma | 💻 cs
