PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories: A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution

PRECEPT is a unified test-time adaptation framework that enhances LLM agent resilience by integrating deterministic exact-match rule retrieval, conflict-aware memory with Bayesian reliability, and the Pareto-guided COMPASS prompt-evolution loop to achieve superior compositional generalization, continuous learning, and robustness against knowledge drift and adversarial inputs.

Arash Shahmansoori | Wed, 11 Ma | cs.AI

DataFactory: Collaborative Multi-Agent Framework for Advanced Table Question Answering

This paper introduces DataFactory, a collaborative multi-agent framework that overcomes the context, hallucination, and reasoning limitations of existing TableQA systems by orchestrating specialized agents for structured and relational reasoning, thereby achieving significant accuracy improvements across multiple benchmarks.

Tong Wang, Chi Jin, Yongkang Chen, Huan Deng, Xiaohui Kuang, Gang Zhao | Wed, 11 Ma | cs.AI

UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking

This paper identifies the critical limitation of current LLM-based agents in accessing unindexed information, introduces the first dedicated UIS-QA benchmark to quantify this challenge, and proposes UIS-Digger, a multi-agent framework that significantly outperforms state-of-the-art models by effectively combining dual-mode browsing and file parsing to retrieve vital unindexed data.

Chang Liu, Chuqiao Kuang, Tianyi Zhuang, Yuxin Cheng, Huichi Zhou, Xiaoguang Li, Lifeng Shang | Tue, 10 Ma | cs

Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval

The study finds that Large Language Model-based relevance judgment systems do not outperform embedding-based retrieval on standard TREC-DL 2019 benchmarks, which it attributes to the short-sightedness inherent in human annotations. It nonetheless argues that these models have the theoretical capability to surpass embedding methods by reasoning about relevance.

Matei Benescu, Ivo Pascal de Jong | Tue, 10 Ma | cs

GP-Tree: An in-memory spatial index combining adaptive grid cells with a prefix tree for efficient spatial querying

The paper proposes GP-Tree, a novel in-memory spatial index that combines adaptive grid cells with a prefix tree structure to replace coarse minimum bounding rectangles with fine-grained approximations, thereby significantly improving filtering accuracy and query performance for complex spatial objects compared to traditional indexes.

Xiangyang Yang, Xuefeng Guan, Lanxue Dang, Yi Xie, Qingyang Xu, Huayi Wu, Jiayao Wang | Tue, 10 Ma | cs

Do Deployment Constraints Make LLMs Hallucinate Citations? An Empirical Study across Four Models and Five Prompting Regimes

This empirical study demonstrates that deployment-motivated prompting constraints significantly exacerbate citation hallucinations across four large language models, with no model achieving a citation existence rate above 47.5% and a substantial portion of unverifiable outputs being fabricated, thereby underscoring the critical need for post-hoc verification in academic and software engineering contexts.

Chen Zhao, Yuan Tang, Yitian Qian | Tue, 10 Ma | cs

Detecting Cryptographically Relevant Software Packages with Collaborative LLMs

This paper proposes and evaluates an on-premises collaborative framework utilizing multiple large language models with majority voting to efficiently and privately identify cryptographically relevant software packages, thereby addressing the challenges of manual inventory and static analysis limitations in the transition to post-quantum cryptography.

Eduard Hirsch, Kristina Raab, Tobias J. Bauer, Daniel Loebenberger | Tue, 10 Ma | cs

Retrieving Minimal and Sufficient Reasoning Subgraphs with Graph Foundation Models for Path-aware GraphRAG

This paper introduces GFM-Retriever, a novel GraphRAG framework that leverages a pre-trained Graph Foundation Model for cross-domain subgraph retrieval and an Information Bottleneck-based selector to extract minimal, sufficient reasoning paths, thereby achieving state-of-the-art performance in multi-hop question answering without relying on domain-specific heuristics.

Haonan Yuan, Qingyun Sun, Junhua Shi, Mingjun Liu, Jiaqi Yuan, Ziwei Zhang, Xingcheng Fu, Jianxin Li | Tue, 10 Ma | cs

Efficient Personalized Reranking with Semi-Autoregressive Generation and Online Knowledge Distillation

This paper proposes the Personalized Semi-Autoregressive with online knowledge Distillation (PSAD) framework, which utilizes a semi-autoregressive teacher model and a User Profile Network to balance generation quality with low-latency inference while enhancing user-item interactions, thereby outperforming state-of-the-art baselines in both ranking performance and efficiency.

Kai Cheng, Hao Wang, Wei Guo, Weiwen Liu, Yong Liu, Yawen Li, Enhong Chen | Tue, 10 Ma | cs