Towards Robust Retrieval-Augmented Generation Based on Knowledge Graph: A Comparative Analysis

This paper presents a comparative analysis demonstrating that GraphRAG, a knowledge graph-based retrieval system with specific customizations, outperforms the standard RGB baseline in robustness across noise, integration, negative rejection, and counterfactual scenarios, offering valuable insights for building more reliable Retrieval-Augmented Generation systems.

Hazem Amamou, Stéphane Gagnon, Alan Davoust, Anderson R. Avila · 2026-03-09 · cs.CL

Cultural Perspectives and Expectations for Generative AI: A Global Survey Approach

This paper presents findings from a large-scale global survey that explores diverse cultural perspectives on Generative AI, distilling community-defined understandings of culture to propose recommendations for more inclusive and sensitive AI development, including participatory approaches and frameworks for addressing cultural boundaries.

Erin van Liemt, Renee Shelby, Andrew Smart, Sinchana Kumbale, Richard Zhang, Neha Dixit, Qazi Mamunur Rashid, Jamila Smith-Loud · 2026-03-09 · cs.AI

Structured Multidimensional Representation Learning for Large Language Models

This paper introduces the L-Transformer, a novel architecture that utilizes structured spectral factorization via the L-product to decompose the embedding space into independent spectral sub-transformers, achieving significant parameter reduction (up to 75%) while maintaining competitive performance and introducing beneficial frequency-based inductive biases.

Alaa El Ichi, Khalide Jbilou, Mohamed El Guide, Franck Dufrenois · 2026-03-09 · cs.CL
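The "up to 75%" parameter reduction in the L-Transformer summary is consistent with a simple counting argument: replacing one dense d×d projection with k independent sub-projections over d/k-dimensional sub-spaces cuts parameters by a factor of k. The sketch below is only that back-of-envelope arithmetic, not the paper's actual L-product spectral factorization; the function names are mine.

```python
def dense_params(d: int) -> int:
    # A full d x d projection matrix over the whole embedding space.
    return d * d

def blockwise_params(d: int, k: int) -> int:
    # Split the embedding into k independent sub-spaces of size d/k;
    # each sub-transformer gets its own (d/k) x (d/k) projection.
    assert d % k == 0, "embedding dim must divide evenly into sub-spaces"
    sub = d // k
    return k * sub * sub

d = 512
full = dense_params(d)
split = blockwise_params(d, 4)
reduction = 1 - split / full
print(full, split, reduction)  # reduction is 0.75 for k = 4
```

With k = 4 the block-diagonal layout keeps one quarter of the dense parameter count, matching the headline figure; larger k would reduce further at the cost of less cross-sub-space mixing.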

CodeScout: Contextual Problem Statement Enhancement for Software Agents

The paper introduces CodeScout, a framework that enhances software agent performance by performing lightweight pre-exploration of codebases to convert underspecified user requests into comprehensive, actionable problem statements, resulting in a 20% improvement in resolution rates on the SWEBench-Verified benchmark.

Manan Suri, Xiangci Li, Mehdi Shojaie, Songyang Han, Chao-Chun Hsu, Shweta Garg, Aniket Anand Deshmukh, Varun Kumar · 2026-03-09 · cs.CL

NERdME: a Named Entity Recognition Dataset for Indexing Research Artifacts in Code Repositories

The paper introduces NERdME, a new dataset of 200 manually annotated README files containing over 10,000 labeled spans across 10 entity types, designed to bridge the gap in scholarly information extraction by enabling the automatic indexing of implementation-level research artifacts in code repositories.

Genet Asefa Gesese, Zongxiong Chen, Shufan Jiang, Mary Ann Tan, Zhaotai Liu, Sonja Schimmler, Harald Sack · 2026-03-09 · cs.CL

PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models

This paper introduces PVminer, a benchmark for structured extraction of patient voice from patient-generated text, and presents PVminerLLM, a supervised-fine-tuned large language model that significantly outperforms prompt-based baselines in extracting codes, sub-codes, and evidence spans, enabling scalable analysis of non-clinical health drivers.

Samah Fodeh, Linhai Ma, Ganesh Puthiaraju, Srivani Talakokkul, Afshan Khan, Ashley Hagaman, Sarah Lowe, Aimee Roundtree · 2026-03-09 · cs.AI

RouteGoT: Node-Adaptive Routing for Cost-Efficient Graph of Thoughts Reasoning

RouteGoT is a budget-controllable, node-adaptive routing framework that optimizes Graph of Thoughts reasoning by dynamically assigning strong models to critical planning and synthesis tasks while utilizing lightweight models for easier subtasks, thereby significantly improving accuracy and reducing token consumption compared to existing methods.

Yuhang Liu, Ruijie Wang, Yunlong Chu, Bing Hao, Yumeng Lin, Shengzhong Liu, Minglai Shao · 2026-03-09 · cs.CL
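The routing idea in the RouteGoT summary can be illustrated with a toy dispatcher: planning and synthesis nodes go to a strong model while routine expansion nodes use a cheap one, subject to a running token budget. This is a hypothetical sketch of the general pattern; the node-type names, costs, and `route` function are illustrative and not RouteGoT's actual API.

```python
# Node types treated as critical in this sketch (assumed, per the summary).
CRITICAL_TYPES = {"plan", "synthesize"}

def route(node_type: str, remaining_budget: int,
          strong_cost: int = 10, light_cost: int = 1):
    """Pick a model tier for one reasoning node, honoring the budget."""
    if node_type in CRITICAL_TYPES and remaining_budget >= strong_cost:
        return "strong", strong_cost
    return "light", light_cost

budget = 25
assignments = []
for node in ["plan", "expand", "expand", "synthesize", "expand"]:
    model, cost = route(node, budget)
    budget -= cost
    assignments.append((node, model))
print(assignments, "budget left:", budget)
```

Even this crude policy shows the cost shape the paper targets: most nodes run on the light model, and the expensive model is reserved for the few nodes whose quality dominates the final answer.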

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

This paper empirically evaluates the effectiveness and limitations of many-shot prompting for test-time adaptation in large language models, finding that while it benefits structured tasks with high information gain, its performance is highly sensitive to selection strategies and often yields limited improvements for open-ended generation.

Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Changran Hu, Qizheng Zhang, Urmish Thakker · 2026-03-09 · cs.LG
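Mechanically, many-shot test-time adaptation amounts to assembling a long prompt from a pool of labeled examples before the query; the summary's point about sensitivity to selection strategies shows up in which examples get packed in. Below is a minimal sketch of such a prompt builder; the strategy names and format are my own stand-ins, not the paper's.

```python
import random

def build_many_shot_prompt(pool, query, n_shots, strategy="random", seed=0):
    """Assemble a many-shot prompt from a pool of (input, label) pairs.

    'random' and 'first_n' are placeholder selection strategies; real
    selectors (e.g. similarity-based) are what the paper finds matters.
    """
    if strategy == "random":
        shots = random.Random(seed).sample(pool, n_shots)
    else:  # "first_n"
        shots = pool[:n_shots]
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in shots)
    return f"{demos}\nQ: {query}\nA:"

pool = [(f"question {i}", f"answer {i}") for i in range(100)]
prompt = build_many_shot_prompt(pool, "new question", n_shots=3,
                                strategy="first_n")
print(prompt)
```

Swapping the `strategy` argument changes only which demonstrations are packed, yet per the summary that choice can swing downstream accuracy substantially, especially outside structured tasks.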

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

ReflexiCoder is a novel reinforcement learning framework that internalizes structured self-reflection and self-correction capabilities into an LLM's weights, enabling it to autonomously generate, debug, and optimize code without external feedback while achieving state-of-the-art performance and improved token efficiency across multiple benchmarks.

Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim · 2026-03-09 · cs.LG

Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation

This paper introduces CoCA, a reinforcement learning framework that shifts the paradigm from answer-first to confidence-first by jointly optimizing a model's pre-answer confidence calibration and answer accuracy through segmented credit assignment, thereby enabling more reliable uncertainty estimation without compromising performance.

Changcheng Li, Jiancan Wu, Hengheng Zhang, Zhengsu Chen, Guo An, Junxiang Qiu, Xiang Wang, Qi Tian · 2026-03-09 · cs.CL

Learning Next Action Predictors from Human-Computer Interaction

This paper introduces LongNAP, a user model that leverages a large-scale dataset of 360K annotated multimodal interactions and a hybrid parametric/in-context learning approach to significantly outperform existing baselines at predicting a user's next action by reasoning over their full interaction history.

Omar Shaikh, Valentin Teutschbein, Kanishk Gandhi, Yikun Chi, Nick Haber, Thomas Robinson, Nilam Ram, Byron Reeves, Sherry Yang, Michael S. Bernstein, Diyi Yang · 2026-03-09 · cs.CL