NavSpace: How Navigation Agents Follow Spatial Intelligence Instructions

This paper introduces the NavSpace benchmark to systematically evaluate the spatial intelligence of navigation agents through six task categories and 1,228 trajectory-instruction pairs, revealing limitations in current models and proposing SNav, a new spatially intelligent navigation model that outperforms existing agents on both the benchmark and real robot tests.

Haolin Yang, Yuxing Long, Zhuoyuan Yu, Zihan Yang, Minghan Wang, Jiapeng Xu, Yihan Wang, Ziyan Yu, Wenzhe Cai, Lei Kang, Hao Dong2026-03-11🤖 cs.AI

RECODE: Reasoning Through Code Generation for Visual Question Answering

The paper introduces RECODE, an agentic framework that enhances visual question answering by reverse-engineering structured visuals into executable code through iterative generation and selection, thereby transforming ambiguous perceptual tasks into verifiable symbolic reasoning problems that significantly outperform existing methods.

Junhong Shen, Mu Cai, Bo Hu, Ameet Talwalkar, David A Ross, Cordelia Schmid, Alireza Fathi2026-03-11🤖 cs.AI

RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning

RL-100 is a unified real-world reinforcement learning framework that combines diffusion visuomotor policies with a clipped PPO objective and consistency distillation to achieve 100% success across 1,000 diverse robotic manipulation trials, matching or surpassing human experts while demonstrating robust zero-shot generalization and continuous deployment in dynamic environments.

Kun Lei, Huanyu Li, Dongjie Yu, Zhenyu Wei, Lingxiao Guo, Zhennan Jiang, Ziyu Wang, Shiyu Liang, Huazhe Xu2026-03-11🤖 cs.AI

From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors

FALCON addresses the spatial reasoning limitations of existing 2D-based vision-language-action models by leveraging spatial foundation models to inject rich 3D geometric priors directly into the action head, achieving state-of-the-art performance across diverse simulation and real-world tasks without requiring architectural changes or specialized sensors.

Zhengshen Zhang, Hao Li, Yalun Dai, Zhengbang Zhu, Lei Zhou, Chenchen Liu, Dong Wang, Francis E. H. Tay, Sijin Chen, Ziwei Liu, Yuxiao Liu, Xinghang Li, Pan Zhou2026-03-11🤖 cs.AI

SynHLMA:Synthesizing Hand Language Manipulation for Articulated Object with Discrete Human Object Interaction Representation

This paper introduces SynHLMA, a novel framework that synthesizes hand manipulation sequences for articulated objects by aligning natural language instructions with a discrete human-object interaction representation, thereby enabling robust grasp generation, prediction, and interpolation for applications in embodied AI and robotics.

Wang zhi, Yuyan Liu, Liu Liu, Li Zhang, Ruixuan Lu, Dan Guo2026-03-11🤖 cs.AI

GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation

The paper proposes GraphKeeper, a novel framework for Graph Domain-Incremental Learning that addresses catastrophic forgetting through knowledge disentanglement and deviation-free preservation, achieving state-of-the-art performance across multiple graph domains while remaining compatible with various graph foundation models.

Zihao Guo, Qingyun Sun, Ziwei Zhang, Haonan Yuan, Huiping Zhuang, Xingcheng Fu, Jianxin Li2026-03-11🤖 cs.AI

Structured Matrix Scaling for Multi-Class Calibration

This paper proposes a structured matrix scaling approach for multi-class calibration that leverages theoretical insights from logistic regression, combined with structured regularization and robust optimization, to effectively manage the bias-variance tradeoff and achieve substantial performance gains over existing methods while providing an open-source implementation.

Eugène Berta, David Holzmüller, Michael I. Jordan, Francis Bach2026-03-11🤖 cs.AI

When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models

This paper introduces UPA-RFAS, a unified framework that generates universal and transferable physical adversarial patches to effectively attack diverse Vision-Language-Action (VLA) models across unknown architectures, finetuned variants, and sim-to-real shifts by leveraging robust feature alignment, a two-phase min-max optimization, and VLA-specific attention and semantic losses.

Hui Lu, Yi Yu, Yiming Yang, Chenyu Yi, Qixin Zhang, Bingquan Shen, Alex C. Kot, Xudong Jiang2026-03-11🤖 cs.AI

Multi-Agent Reinforcement Learning with Communication-Constrained Priors

This paper proposes a communication-constrained multi-agent reinforcement learning framework that utilizes a generalized model and dual mutual information estimator to distinguish between lossy and lossless messages, thereby quantifying their impact on global rewards to enhance cooperative policy learning in complex, dynamic environments.

Guang Yang, Tianpei Yang, Jingwen Qiao, Yanqing Wu, Jing Huo, Xingguo Chen, Yang Gao2026-03-11🤖 cs.AI

Enhancing Retrieval-Augmented Generation with Entity Linking for Educational Platforms

This paper introduces ELERAG, an enhanced Retrieval-Augmented Generation system that integrates Wikidata-based Entity Linking and a hybrid re-ranking strategy to significantly improve factual accuracy in Italian educational question-answering, particularly outperforming standard methods in domain-specific contexts while demonstrating the importance of domain-adapted strategies.

Francesco Granata, Francesco Poggi, Misael Mongiovì2026-03-11🤖 cs.AI

EMFusion: Conditional Diffusion Framework for Trustworthy Frequency Selective EMF Forecasting in Wireless Networks

This paper introduces EMFusion, a conditional multivariate diffusion-based framework that leverages a residual U-Net with cross-attention and imputation-based sampling to provide accurate, uncertainty-quantified, frequency-selective electromagnetic field forecasts for wireless network planning, significantly outperforming existing baseline models.

Zijiang Yan, Yixiang Huang, Jianhua Pei, Hina Tabassum, Luca Chiaraviglio2026-03-11🤖 cs.AI

Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning

This paper demonstrates that a single-epoch, domain-adapted fine-tuning of a 350M-parameter Small Language Model (OPT-350M) can significantly outperform larger models and existing baselines in tool-calling tasks, achieving a 77.55% pass rate on ToolBench and proving that targeted training can make generative AI more cost-effective and scalable for enterprise use.

Polaris Jhandi, Owais Kazi, Shreyas Subramanian, Neel Sendas2026-03-11🤖 cs.AI