Morphological-Symmetry-Equivariant Heterogeneous Graph Neural Network for Robotic Dynamics Learning

This paper introduces MS-HGNN, a morphological-symmetry-equivariant heterogeneous graph neural network that builds robotic kinematic structures and symmetries into the architecture as constraints. The design targets generalizable, efficient dynamics learning for a range of multi-body systems, and is validated through formal proofs and experiments on quadruped robots.

Fengze Xie, Sizhe Wei, Yue Song, Yisong Yue, Lu Gan | Wed, 11 Ma 🤖 cs.LG

SCDP: Learning Humanoid Locomotion from Partial Observations via Mixed-Observation Distillation

The paper introduces Sensor-Conditioned Diffusion Policies (SCDP), a novel framework that enables robust humanoid locomotion using only onboard sensors by distilling privileged full-body knowledge through mixed-observation training and specialized denoising techniques, successfully achieving near-perfect simulation performance and real-world deployment on a G1 robot without explicit state estimation.

Milo Carroll, Tianhu Peng, Lingfan Bao, Chengxu Zhou, Zhibin Li | Wed, 11 Ma 🤖 cs.LG

Quality over Quantity: Demonstration Curation via Influence Functions for Data-Centric Robot Learning

This paper introduces Quality over Quantity (QoQ), a systematic framework that leverages influence functions to automatically curate high-quality robot learning demonstrations by quantifying each sample's contribution to reducing validation loss, thereby significantly improving policy performance over manual or heuristic data selection methods.

Haeone Lee, Taywon Min, Junsu Kim, Sinjae Kang, Fangchen Liu, Lerrel Pinto, Kimin Lee | Wed, 11 Ma 🤖 cs.LG

Why Channel-Centric Models are not Enough to Predict End-to-End Performance in Private 5G: A Measurement Campaign and Case Study

This paper demonstrates that channel-centric models, including ray-tracing simulators, fail to accurately predict end-to-end throughput in private 5G networks due to systematic over-estimation of MIMO spatial layers, whereas data-driven Gaussian process models trained on direct measurements provide significantly more reliable predictions for communication-aware robot planning.

Nils Jörgensen | Wed, 11 Ma 🤖 cs.LG

SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

The paper introduces SPREAD, a geometry-preserving framework for lifelong imitation learning that utilizes singular value decomposition to align policy representations within low-rank subspaces and a confidence-guided distillation strategy to mitigate catastrophic forgetting while achieving state-of-the-art performance on the LIBERO benchmark.

Kaushik Roy, Giovanni D'urso, Nicholas Lawrance, Brendan Tidd, Peyman Moghadam | Wed, 11 Ma 🤖 cs.LG

Pri4R: Learning World Dynamics for Vision-Language-Action Models with Privileged 4D Representation

Pri4R is a simple yet effective method that enhances Vision-Language-Action models with an implicit understanding of world dynamics by training them to predict 3D point tracks using privileged 4D information, thereby significantly improving physical manipulation performance without adding inference overhead.

Jisoo Kim, Jungbin Cho, Sanghyeok Chu, Ananya Bal, Jinhyung Kim, Gunhee Lee, Sihaeng Lee, Seung Hwan Kim, Bohyung Han, Hyunmin Lee, Laszlo A. Jeni, Seungryong Kim | Wed, 11 Ma 🤖 cs.AI

SynHLMA: Synthesizing Hand Language Manipulation for Articulated Object with Discrete Human Object Interaction Representation

This paper introduces SynHLMA, a novel framework that synthesizes hand manipulation sequences for articulated objects by aligning natural language instructions with a discrete human-object interaction representation, thereby enabling robust grasp generation, prediction, and interpolation for applications in embodied AI and robotics.

Wang zhi, Yuyan Liu, Liu Liu, Li Zhang, Ruixuan Lu, Dan Guo | Wed, 11 Ma 🤖 cs.AI

From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors

FALCON addresses the spatial reasoning limitations of existing 2D-based vision-language-action models by leveraging spatial foundation models to inject rich 3D geometric priors directly into the action head, achieving state-of-the-art performance across diverse simulation and real-world tasks without requiring architectural changes or specialized sensors.

Zhengshen Zhang, Hao Li, Yalun Dai, Zhengbang Zhu, Lei Zhou, Chenchen Liu, Dong Wang, Francis E. H. Tay, Sijin Chen, Ziwei Liu, Yuxiao Liu, Xinghang Li, Pan Zhou | Wed, 11 Ma 🤖 cs.AI

RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning

RL-100 is a unified real-world reinforcement learning framework that combines diffusion visuomotor policies with a clipped PPO objective and consistency distillation to achieve 100% success across 1,000 diverse robotic manipulation trials, matching or surpassing human experts while demonstrating robust zero-shot generalization and continuous deployment in dynamic environments.

Kun Lei, Huanyu Li, Dongjie Yu, Zhenyu Wei, Lingxiao Guo, Zhennan Jiang, Ziyu Wang, Shiyu Liang, Huazhe Xu | Wed, 11 Ma 🤖 cs.AI

NavSpace: How Navigation Agents Follow Spatial Intelligence Instructions

This paper introduces the NavSpace benchmark to systematically evaluate the spatial intelligence of navigation agents through six task categories and 1,228 trajectory-instruction pairs, revealing limitations in current models and proposing SNav, a new spatially intelligent navigation model that outperforms existing agents on both the benchmark and real robot tests.

Haolin Yang, Yuxing Long, Zhuoyuan Yu, Zihan Yang, Minghan Wang, Jiapeng Xu, Yihan Wang, Ziyan Yu, Wenzhe Cai, Lei Kang, Hao Dong | Wed, 11 Ma 🤖 cs.AI

LLM-Advisor: An LLM Benchmark for Cost-efficient Path Planning across Multiple Terrains

The paper introduces LLM-Advisor, a prompt-based framework that leverages large language models as non-decisive post-processing advisors to significantly improve the cost efficiency of path planning across diverse terrains without modifying underlying planners, while addressing hallucination risks and demonstrating superior performance over zero-shot LLM approaches.

Ling Xiao, Toshihiko Yamasaki | Wed, 11 Ma 🤖 cs.AI

Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards

This paper proposes CoHet, a novel algorithm that leverages Graph Neural Network-driven intrinsic rewards to enable effective decentralized learning and cooperation among heterogeneous multi-agent systems despite challenges like partial observability and reward sparsity, demonstrating superior performance over state-of-the-art methods in standard benchmarks.

Jahir Sadik Monon, Deeparghya Dutta Barua, Md. Mosaddek Khan | Wed, 11 Ma 🤖 cs.AI

Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

This paper introduces the Dynamics-Aware Policy Learning (DAPL) framework, which leverages explicit world modeling to learn contact-induced dynamics, enabling robots to achieve robust extrinsic dexterity in cluttered environments without hand-crafted heuristics and significantly outperforming existing manipulation methods in both simulation and real-world deployments.

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He Wang | Wed, 11 Ma 🤖 cs.AI

Open-World Motion Forecasting

This paper introduces "Open-World Motion Forecasting," an end-to-end class-incremental framework that predicts future trajectories directly from camera images while mitigating catastrophic forgetting through pseudo-labeling with vision-language models and a novel query feature variance-based replay strategy, enabling continual adaptation to evolving object taxonomies in real-world autonomous driving.

Nicolas Schischka, Nikhil Gosala, B Ravi Kiran, Senthil Yogamani, Abhinav Valada | Wed, 11 Ma 🤖 cs.AI