cs.AI papers | Gist.Science

$\textbf{Re}^{2}$ : Unlocking LLM Reasoning via Reinforcement Learning with Re-solving

The paper introduces Re $^2$ , a reinforcement learning framework that enables large language models to dynamically abandon unproductive reasoning paths and restart their solution process, thereby significantly improving reasoning efficiency and performance without requiring preliminary supervised fine-tuning.

Pinzheng Wang, Shuli Xu, Juntao Li, Yu Luo, Dong Li, Jianye Hao, Min Zhang2026-03-10💻 cs

A Miniature Brain Transformer: Thalamic Gating, Hippocampal Lateralization, Amygdaloid Salience, and Prefrontal Working Memory in Attention-Coupled Latent Memory

This paper introduces a miniature brain transformer architecture that demonstrates a novel, falsifiable prediction: functional lateralization of hippocampal banks requires the synergistic interaction of a prefrontal working-memory buffer (acting as a symmetry-breaker) and inhibitory callosal coupling, a mechanism that triggers a sharp phase transition in memory performance while a cerebellar fast-path merely accelerates convergence.

Hong Jeong2026-03-10💻 cs

VINO: Video-driven Invariance for Non-contextual Objects via Structural Prior Guided De-contextualization

VINO is a self-supervised learning framework that overcomes the "co-occurrence trap" in dense video by using a teacher-student distillation approach with structural priors to force representations to focus on foreground objects rather than background context, achieving state-of-the-art unsupervised object discovery performance.

Seul-Ki Yeom, Marcel Simon, Eunbin Lee, Tae-Ho Kim2026-03-10💻 cs

A Hybrid LTR-based System via Social Context Embedding for Recommending Solutions of Software Bugs in Developer Communities

This paper proposes a hybrid Learning-to-Rank recommender system that leverages deep learning and social context embeddings from Stack Overflow to effectively recommend software bug solutions to developers, achieving nearly 78% accuracy in the top 10 results.

Fouzi Harrag, Mokdad Khemliche2026-03-10💻 cs

LEPA: Learning Geometric Equivariance in Satellite Remote Sensing Data with a Predictive Architecture

This paper introduces LEPA, a learned architecture that conditions on geometric augmentations to accurately predict transformed satellite image embeddings, effectively overcoming the limitations of standard interpolation in non-convex geospatial foundation model manifolds and significantly improving geometric adjustment performance.

Erik Scheurer, Rocco Sedona, Stefan Kesselheim, Gabriele Cavallaro2026-03-10💻 cs

Learning When to Cooperate Under Heterogeneous Goals

This paper addresses the challenge of agents with heterogeneous goals deciding when to cooperate or act alone by introducing a hierarchical learning framework that combines imitation and reinforcement learning, demonstrating superior performance over baselines and revealing that modeling teammates is most beneficial when their goals are less observable.

Max Taylor-Davies, Neil Bramley, Christopher G. Lucas2026-03-10💻 cs

Kinematics-Aware Latent World Models for Data-Efficient Autonomous Driving

This paper proposes a kinematics-aware latent world model that integrates vehicle kinematic information and geometry-aware supervision into the Recurrent State-Space Model (RSSM) to enhance spatial representation and long-horizon imagination fidelity, thereby achieving more data-efficient and stable autonomous driving policy learning compared to existing baselines.

Jiazhuo Li, Linjiang Cao, Qi Liu, Xi Xiong2026-03-10💻 cs

VisualDeltas: Learning Preferences from Visual Quality Perturbations

VisualDeltas is a lightweight, label-free preference-learning framework that leverages systematic visual quality perturbations to generate informative supervision signals, thereby improving multimodal model performance and generalization without relying on human annotations.

Hailiang Huang, Yihao Liu, Shengyue Guan, Haoze Li, Sujian Li2026-03-10💻 cs

Do Deployment Constraints Make LLMs Hallucinate Citations? An Empirical Study across Four Models and Five Prompting Regimes

This empirical study demonstrates that deployment-motivated prompting constraints significantly exacerbate citation hallucinations across four large language models, with no model achieving a citation existence rate above 47.5% and a substantial portion of unverifiable outputs being fabricated, thereby underscoring the critical need for post-hoc verification in academic and software engineering contexts.

Chen Zhao, Yuan Tang, Yitian Qian2026-03-10💻 cs

MAviS: A Multimodal Conversational Assistant For Avian Species

This paper introduces MAviS, a domain-adaptive multimodal conversational assistant for avian species that leverages the newly created MAviS-Dataset and is evaluated on the MAviS-Bench to achieve state-of-the-art performance in fine-grained bird species understanding and multimodal question answering.

Yevheniia Kryklyvets, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jinxing Zhou, Fahad Shabzan Khan, Rao Anwer, Salman Khan, Hisham Cholakkal2026-03-10💻 cs

A Cortically Inspired Architecture for Modular Perceptual AI

This paper proposes a modular, cortically inspired architecture for perceptual AI that leverages neuroscientific principles like predictive processing and specialized modules to overcome the interpretability and generalization limitations of current monolithic models, thereby enabling more transparent and human-aligned reasoning.

Prerna Luthra2026-03-10💻 cs

Spectral Discovery of Continuous Symmetries via Generalized Fourier Transforms

This paper proposes a novel framework for discovering continuous one-parameter symmetries by leveraging the Generalized Fourier Transform to detect structured sparsity patterns in the spectral domain, offering a principled and interpretable alternative to existing generator-based optimization methods.

Pavan Karjol, Kumar Shubham, Prathosh AP2026-03-10🤖 cs.LG

Data-Driven Hints in Intelligent Tutoring Systems

This chapter reviews the evolution of data-driven hint generation in intelligent tutoring systems, highlighting how historical student data enables the creation of next-step hints, strategic subgoals, and timely interventions, while also exploring future adaptations involving behavioral data and Large Language Models.

Sutapa Dey Tithi, Kimia Fazeli, Dmitri Droujkov, Tahreem Yasir, Xiaoyi Tian, Tiffany Barnes2026-03-10💻 cs

Adversarial Latent-State Training for Robust Policies in Partially Observable Domains

This paper introduces an adversarial latent-initial-state POMDP framework that theoretically establishes a minimax principle and finite-sample guarantees, while empirically demonstrating that targeted adversarial training significantly reduces robustness gaps in partially observable reinforcement learning.

Angad Singh Ahuja2026-03-10🤖 cs.LG

Shutdown Safety Valves for Advanced AI

This paper explores the unorthodox proposal of programming advanced AI systems with a primary goal of being turned off to mitigate the risk of them resisting shutdown, while analyzing the conditions under which such an approach would be effective.

Vincent Conitzer2026-03-10🤖 cs.LG

FinSheet-Bench: From Simple Lookups to Complex Reasoning, Where LLMs Break on Financial Spreadsheets

FinSheet-Bench introduces a synthetic benchmark modeled on real private equity fund structures to evaluate LLMs on financial spreadsheet tasks, revealing that even the best-performing models currently lack the accuracy required for unsupervised professional use, particularly on complex, large-scale documents, and suggesting that reliable extraction will require separating document understanding from deterministic computation.

Jan Ravnik, Matjaž Ličen, Felix Bührmann, Bithiah Yuan, Felix Stinson, Tanvi Singh2026-03-10💻 cs

Norm-Hierarchy Transitions in Representation Learning: When and Why Neural Networks Abandon Shortcuts

This paper introduces the Norm-Hierarchy Transition (NHT) framework, which explains that neural networks delay learning structured representations in favor of spurious shortcuts because weight decay slowly drives the model from high-norm solutions to lower-norm ones, with the transition delay logarithmically scaling to the ratio between these norms.

Truong Xuan Khanh, Truong Quynh Hoa2026-03-10🤖 cs.LG

The Third Ambition: Artificial Intelligence and the Science of Human Behavior

This paper proposes a "third ambition" for artificial intelligence research, advocating for the use of large language models as scientific instruments to study human behavior, culture, and moral reasoning by treating them as computationally accessible condensates of collective discourse while addressing their methodological and epistemic limitations.

W. Russell Neuman, Chad Coleman2026-03-10💬 cs.CL

VisualScratchpad: Inference-time Visual Concepts Analysis in Vision Language Models

This paper introduces VisualScratchpad, an interactive inference-time analysis tool that leverages sparse autoencoders and attention mechanisms to visualize and debug vision language models by linking visual concepts to text tokens, thereby revealing previously underexplored failure modes such as limited cross-modal alignment and misleading visual concepts.

Hyesu Lim, Jinho Choi, Taekyung Kim, Byeongho Heo, Jaegul Choo, Dongyoon Han2026-03-10💻 cs

Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

The paper introduces Agora, an AI-powered platform that leverages LLMs to simulate diverse human perspectives on policy issues, enabling users to practice consensus-building and demonstrating through a preliminary study that access to authentic voice explanations significantly enhances problem-solving skills and the quality of collective decisions compared to viewing aggregate data alone.

Suyash Fulay, Prerna Ravi, Emily Kubin, Shrestha Mohanty, Michiel Bakker, Deb Roy2026-03-10💻 cs

← Previous Next →

cs.AI

Re2\textbf{Re}^{2}Re2: Unlocking LLM Reasoning via Reinforcement Learning with Re-solving