cs.AI papers | Gist.Science

Toward Closed-loop Molecular Discovery via Language Model, Property Alignment and Strategic Search

The paper introduces Trio, a closed-loop molecular generation framework that integrates fragment-based language modeling, reinforcement learning, and Monte Carlo tree search to produce chemically valid, diverse, and pharmacologically optimized ligands with significantly improved binding affinity, drug-likeness, and synthetic accessibility compared to state-of-the-art methods.

Junkai Ji, Zhangfan Yang, Dong Xu, Ruibin Bai, Jianqiang Li, Tingjun Hou, Zexuan Zhu2026-03-12🤖 cs.AI

Maximum Risk Minimization with Random Forests

This paper introduces computationally efficient and statistically consistent Random Forest variants that minimize the maximum risk across diverse environments to improve out-of-distribution generalization, offering novel guarantees for mean squared error, negative reward, and regret-based risks.

Francesco Freni, Anya Fries, Linus Kühne, Markus Reichstein, Jonas Peters2026-03-12📊 stat

GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

GTR-Turbo is a highly efficient training method for multi-modal agents that eliminates the need for costly external teacher models by using merged checkpoints from ongoing reinforcement learning as a "free" teacher, thereby improving accuracy by 10–30% while reducing training time and compute costs by 50% and 60%, respectively.

Tong Wei, Yijun Yang, Changhao Zhang, Junliang Xing, Yuanchun Shi, Zongqing Lu, Deheng Ye2026-03-12🤖 cs.AI

Pretrained battery transformer (PBT): A foundation model for universal battery life prediction

This paper introduces the Pretrained Battery Transformer (PBT), a foundation model that leverages battery-knowledge-encoded mixture-of-experts layers to overcome data scarcity and heterogeneity, achieving state-of-the-art universal battery life prediction across diverse chemistries and conditions.

Ruifeng Tan, Weixiang Hong, Jia Li, Jiaqiang Huang, Tong-Yi Zhang2026-03-12🤖 cs.LG

Enhancing Tree Species Classification: Insights from YOLOv8 and Explainable AI Applied to TLS Point Cloud Projections

This paper presents a framework using YOLOv8 and Finer-CAM to achieve 96% accuracy in classifying seven European tree species from TLS 3D point clouds while demonstrating that the model's decisions are interpretable and primarily rely on crown features for most species, with stems playing a more significant role for others.

Adrian Straker, Paul Magdon, Marco Zullich, Maximilian Freudenberg, Christoph Kleinn, Johannes Breidenbach, Stefano Puliti, Nils Noelke2026-03-12🤖 cs.AI

The Bayesian Geometry of Transformer Attention

This paper introduces "Bayesian wind tunnels" to rigorously demonstrate that small transformers perform exact Bayesian inference through a specific geometric mechanism involving residual streams as belief substrates and attention-based routing, a capability that capacity-matched MLPs fundamentally lack.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra2026-03-12📊 stat

Gradient Dynamics of Attention: How Cross-Entropy Sculpts Bayesian Manifolds

This paper provides a first-order analysis demonstrating that cross-entropy training in transformers induces a coupled specialization of attention routing and value updates—functioning as a two-timescale EM procedure—that sculpts low-dimensional Bayesian manifolds, thereby explaining how gradient-based optimization enables precise probabilistic reasoning.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra2026-03-12📊 stat

Geometric Scaling of Bayesian Inference in LLMs

This paper demonstrates that production-grade language models preserve a low-dimensional geometric substrate, specifically an entropy-aligned axis in their last-layer value representations, which encodes Bayesian posterior structures and serves as a privileged readout for uncertainty, even though it is not a singular computational bottleneck for inference.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra2026-03-12🤖 cs.LG

Over-Searching in Search-Augmented Large Language Models

This paper systematically evaluates the phenomenon of "over-searching" in search-augmented large language models, where unnecessary tool invocation harms efficiency and accuracy, and proposes the Tokens Per Correctness (TPC) metric along with mitigation strategies to address this issue.

Roy Xie, Deepak Gopinath, David Qiu, Dong Lin, Haitian Sun, Saloni Potdar, Bhuwan Dhingra2026-03-12🤖 cs.LG

Burn-After-Use for Preventing Data Leakage through a Secure Multi-Tenant Architecture in Enterprise LLM

This paper proposes a Secure Multi-Tenant Architecture (SMTA) combined with a novel Burn-After-Use (BAU) mechanism to effectively prevent data leakage in enterprise LLMs by enforcing strict instance isolation and ephemeral context destruction, achieving high defense success rates against both semantic leakage attacks and post-session persistence threats in experimental evaluations.

Qiang Zhang, Elena Emma Wang, Jiaming Li, Xichun Wang2026-03-12🤖 cs.AI

Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents

This paper introduces a stealthy, multi-turn denial-of-service attack on LLM agents via the Model Context Protocol (MCP) that exploits tool-calling chains to amplify computational costs and resource consumption by up to 658 times while evading standard detection mechanisms.

Kaiyu Zhou, Yongsen Zheng, Yicheng He, Meng Xue, Xueluan Gong, Yuji Wang, Xuanye Zhang, Kwok-Yan Lam2026-03-12🤖 cs.AI

Learning Transferable Skills in Action RPGs via Directed Skill Graphs and Selective Adaptation

This paper proposes a framework for lifelong learning in complex real-time environments like Dark Souls III by decomposing control into a directed skill graph of reusable components, which enhances sample efficiency and enables rapid adaptation to new challenges through selective fine-tuning of only the necessary skills.

Ali Najar2026-03-12🤖 cs.AI

MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

MemOCR is a multimodal memory agent that enhances long-horizon reasoning under tight context budgets by converting structured rich-text history into a visually compressed image, allowing the agent to prioritize crucial evidence through layout-aware information density while aggressively reducing low-value details.

Yaorui Shi, Shugui Liu, Yu Yang, Wenyu Mao, Yuxin Chen, Qi GU, Hui Su, Xunliang Cai, Xiang Wang, An Zhang2026-03-12🤖 cs.AI

MHDash: An Online Platform for Benchmarking Mental Health-Aware AI Assistants

This paper introduces MHDash, an open-source platform that enables fine-grained, multi-turn evaluation of AI assistants for mental health support, revealing that conventional aggregate metrics fail to capture critical safety risks and that models often struggle specifically with high-risk cases despite strong overall performance.

Yihe Zhang, Cheyenne N Mohawk, Kaiying Han + 3 more2026-03-12🤖 cs.AI

Hallucination is a Consequence of Space-Optimality: A Rate-Distortion Theorem for Membership Testing

This paper establishes a rate-distortion theorem demonstrating that hallucinations in large language models are an inevitable consequence of information-theoretic optimal memory compression when storing sparse facts, forcing the model to confidently assign high scores to non-facts rather than abstain.

Anxin Guo, Jingwei Li2026-03-12💬 cs.CL

Evaluating Long-Horizon Memory for Multi-Party Collaborative Dialogues

This paper introduces EverMemBench, the first benchmark designed to evaluate long-horizon memory in multi-party collaborative dialogues, revealing that current LLM systems struggle with multi-hop reasoning, temporal versioning, and implicit relevance retrieval in realistic, complex interaction scenarios.

Chuanrui Hu, Tong Li, Xingze Gao, Hongda Chen, Yi Bai, Dannong Xu, Tianwei Lin, Xiaohong Li, Yunyun Han, Jian Pei, Yafeng Deng2026-03-12💬 cs.CL

Moving On, Even When You're Broken: Fail-Active Trajectory Generation via Diffusion Policies Conditioned on Embodiment and Task

This paper introduces DEFT, a diffusion-based trajectory generator that enables robots to achieve fail-active operation by successfully completing tasks under arbitrary actuation failures, outperforming classical methods in both simulation and real-world scenarios while demonstrating robust zero-shot generalization.

Gilberto G. Briscoe-Martinez, Yaashia Gautam, Rahul Shetty, Anuj Pasricha, Marco M. Nicotra, Alessandro Roncone2026-03-12🤖 cs.AI

DMS2F-HAD: A Dual-branch Mamba-based Spatial-Spectral Fusion Network for Hyperspectral Anomaly Detection

The paper proposes DMS2F-HAD, a dual-branch Mamba-based network that efficiently fuses spatial and spectral features to achieve state-of-the-art accuracy and significantly faster inference speeds for hyperspectral anomaly detection across multiple benchmark datasets.

Aayushma Pant, Lakpa Tamang, Tsz-Kwan Lee + 1 more2026-03-12🤖 cs.AI

Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization

This paper introduces Fine-grained Group Policy Optimization (FGO), a reinforcement learning algorithm that effectively compresses verbose Chain-of-Thought reasoning in Large Language Models while simultaneously addressing the data inefficiency and entropy collapse limitations of Group Relative Policy Optimization (GRPO).

Xinchen Han, Hossam Afifi, Michel Marot, Xilu Wang, Lu Yin2026-03-12🤖 cs.LG

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

UniWeTok is a unified binary tokenizer featuring a massive $2^{128}$ codebook, a convolution-attention hybrid architecture with SigLu activation, and a novel three-stage training framework that achieves state-of-the-art performance in image generation and multimodal understanding with significantly lower computational costs than existing models.

Shaobin Zhuang, Yuang Ai, Jiaming Han, Weijia Mao, Xiaohui Li, Fangyikang Wang, Xiao Wang, Yan Li, Shanchuan Lin, Kun Xu, Zhenheng Yang, Huaibo Huang, Xiangyu Yue, Hao Chen, Yali Wang2026-03-12🤖 cs.AI

← Previous Next →

cs.AI