R2F: Repurposing Ray Frontiers for LLM-free Object Navigation

The paper proposes R2F, an LLM-free framework for zero-shot open-vocabulary object navigation that repurposes ray frontiers as direction-conditioned semantic hypotheses to achieve competitive performance with real-time execution, eliminating the latency and computational overhead of iterative large-model queries.

Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza, Abdel Hakim Drid, Emanuele Musumeci, Daniele Nardi, Domenico D. Bloisi, Vincenzo Suriani · Tue, 10 Ma · cs
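The R2F summary describes scoring ray frontiers as direction-conditioned semantic hypotheses and steering toward the most promising one. A minimal sketch of that idea, where the frontier representation, the `goal_score` function, and the `distance_weight` trade-off are all illustrative assumptions rather than the paper's actual formulation:

```python
def pick_frontier(frontiers, goal_score, distance_weight=0.1):
    """Choose the frontier direction maximizing semantic score minus travel cost."""
    best, best_val = None, float("-inf")
    for direction, distance in frontiers:
        # Each ray frontier is treated as a hypothesis: "the goal lies this way."
        val = goal_score(direction) - distance_weight * distance
        if val > best_val:
            best, best_val = direction, val
    return best

# Toy example: the semantic hypothesis peaks toward the 90-degree ray.
scores = {0: 0.2, 90: 0.9, 180: 0.1, 270: 0.4}
chosen = pick_frontier(
    [(0, 1.0), (90, 3.0), (180, 0.5), (270, 1.0)],
    goal_score=lambda d: scores[d],
)
```

Because the scoring is a single pass over precomputed frontiers rather than an iterative large-model query, a loop like this runs in real time, which matches the latency argument in the summary.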

FoMo: A Multi-Season Dataset for Robot Navigation in Forêt Montmorency

FoMo is a comprehensive, multi-season dataset comprising over 64 km of diverse robot navigation data from a boreal forest, featuring significant environmental changes such as heavy snow and vegetation growth to challenge and evaluate the robustness of state-of-the-art odometry and SLAM systems.

Matej Boxan, Gabriel Jeanson, Alexander Krawciw, Effie Daum, Xinyuan Qiao, Sven Lilge, Timothy D. Barfoot, François Pomerleau

Tactile Recognition of Both Shapes and Materials with Automatic Feature Optimization-Enabled Meta Learning

This paper proposes the AFOP-ML framework, an automatic feature optimization-enabled prototypical network that achieves rapid few-shot tactile recognition of both shapes and materials with high accuracy and robustness against perturbations, effectively addressing the challenges of data scarcity and time-consuming training in robotic applications.

Hongliang Zhao, Wenhui Yang, Yang Chen, Zhuorui Wang, Baiheng Liu, Longhui Qin

Human-Aware Robot Behaviour in Self-Driving Labs

This paper proposes an AI-driven perception method with hierarchical human intention prediction to enable mobile robot chemists in self-driving laboratories to proactively distinguish between human preparatory actions and transient interactions, thereby overcoming the inefficiencies of passive obstruction detection and streamlining human-robot coordination in shared-access scenarios.

Satheeshkumar Veeramani, Anna Kisil, Abigail Bentley, Hatem Fakhruldeen, Gabriella Pizzuto, Andrew I. Cooper

MoMaStage: Skill-State Graph Guided Planning and Closed-Loop Execution for Long-Horizon Indoor Mobile Manipulation

MoMaStage is a structured vision-language framework that enables robust long-horizon indoor mobile manipulation by guiding task planning through a topology-aware Skill-State Graph and ensuring execution reliability via a closed-loop mechanism that triggers semantic replanning upon detecting physical deviations, all without requiring explicit scene mapping.

Chenxu Li, Zixuan Chen, Yetao Li, Jiapeng Xu, Hongyu Ding, Jieqi Shi, Jing Huo, Yang Gao
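The MoMaStage summary centers on a closed-loop mechanism that triggers semantic replanning when execution deviates physically. A minimal sketch of that control flow, assuming hypothetical `execute`, `deviated`, and `replan` callables; the skill names and the restart-on-replan policy are illustrative, not the paper's design:

```python
def execute_with_replanning(plan, execute, deviated, replan, max_replans=3):
    """Run a skill plan step by step; on a detected deviation, request a new plan."""
    replans = 0
    i = 0
    while i < len(plan):
        state = execute(plan[i])
        if deviated(state):
            if replans >= max_replans:
                return False, replans      # give up after too many replans
            plan = replan(state)           # semantic replanning from the observed state
            replans += 1
            i = 0                          # restart on the fresh plan
        else:
            i += 1
    return True, replans

# Toy run: the first skill "fails" once, triggering a single replan.
seen = []
ok, n = execute_with_replanning(
    ["grasp", "place"],
    execute=lambda skill: (seen.append(skill) or skill),
    deviated=lambda state: state == "grasp" and seen.count("grasp") == 1,
    replan=lambda state: ["regrasp", "place"],
)
```

In the paper's framing, `replan` would be driven by the topology-aware Skill-State Graph; here it is a stub to show where that call sits in the loop.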

PhaForce: Phase-Scheduled Visual-Force Policy Learning with Slow Planning and Fast Correction for Contact-Rich Manipulation

PhaForce is a phase-scheduled visuomotor policy that enhances contact-rich manipulation by coordinating a slow, vision-dominant diffusion planner with a fast, force-driven corrector to enable high-frequency, phase-aware residual corrections, achieving an 86% success rate and superior adaptability compared to existing baselines.

Mingxin Wang, Zhirun Yue, Renhao Lu, Yizhe Li, Zihan Wang, Guoping Pan, Kangkang Dong, Jun Cheng, Yi Cheng, Houde Liu
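The PhaForce summary pairs a slow, vision-dominant planner with a fast, force-driven corrector that adds residuals at a high rate. A minimal sketch of that two-rate scheduling, where the update rates, the proportional correction, and all signal values are illustrative assumptions:

```python
def run_phase_scheduled(force_readings, plan_step, correct, plan_every=10):
    """Blend a slowly updated nominal action with fast per-tick residual corrections."""
    actions = []
    nominal = 0.0
    for t, force in enumerate(force_readings):
        if t % plan_every == 0:      # slow tick: vision-dominant planner refreshes the nominal action
            nominal = plan_step(t)
        residual = correct(force)    # fast tick: force-driven residual correction
        actions.append(nominal + residual)
    return actions

# Toy example: a constant nominal plan with a sign-flipping force correction.
acts = run_phase_scheduled(
    [0.0, 0.5, -0.5, 0.0],
    plan_step=lambda t: 1.0,
    correct=lambda f: -f,
    plan_every=2,
)
```

The design point this illustrates is that the corrector runs every control tick regardless of when the planner last updated, so contact transients between planning ticks still get compensated.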

Hierarchical Multi-Modal Planning for Fixed-Altitude Sparse Target Search and Sampling

This paper introduces HIMoS, a hierarchical multi-modal planning framework that enables Autonomous Underwater Vehicles to efficiently search for and sample sparse benthic targets like coral colonies at a fixed altitude by integrating a global topological route optimizer with a local differentiable belief propagation planner, thereby outperforming traditional exhaustive and adaptive sampling strategies in high-fidelity simulations.

Lingpeng Chen, Yuchen Zheng, Apple Pui-Yi Chui, Junfeng Wu, Ziyang Hong

Seed2Scale: A Self-Evolving Data Engine for Embodied AI via Small to Large Model Synergy and Multimodal Evaluation

Seed2Scale is a self-evolving data engine that overcomes data bottlenecks in embodied AI by synergizing a lightweight "SuperTiny" model for robust data collection with a large Vision-Language Model for autonomous quality verification, enabling a target model to achieve a 131.2% performance improvement starting from just four seed demonstrations.

Cong Tai, Zhaoyu Zheng, Haixu Long, Hansheng Wu, Zhengbin Long, Haodong Xiang, Rong Shi, Zhuo Cui, Shizhuang Zhang, Gang Qiu, He Wang, Ruifeng Li, Biao Liu, Zhenzhe Sun, Tao Shen

Multifingered force-aware control for humanoid robots

This paper presents a model-based control framework for humanoid robots that utilizes trained tactile force estimators to dynamically redistribute forces across the torso, arm, wrist, and fingers, thereby maintaining stable contact with objects of varying mass or unstable configurations by minimizing the distance between the Center of Pressure and the contact polygon centroid.

Pasquale Marra, Gabriele M. Caddeo, Ugo Pattacini, Lorenzo Natale
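The objective in the force-aware control summary, minimizing the distance between the Center of Pressure and the contact polygon centroid, has a direct geometric reading. A minimal sketch of that quantity in 2D, assuming planar contact points and scalar normal forces; the function names are illustrative, not from the paper's code:

```python
def center_of_pressure(points, forces):
    """CoP as the normal-force-weighted average of 2D contact points."""
    total = sum(forces)
    x = sum(p[0] * f for p, f in zip(points, forces)) / total
    y = sum(p[1] * f for p, f in zip(points, forces)) / total
    return (x, y)

def polygon_centroid(points):
    """Arithmetic mean of the contact polygon vertices."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def cop_centroid_distance(points, forces):
    """The quantity a force redistribution scheme would drive toward zero."""
    cx, cy = center_of_pressure(points, forces)
    gx, gy = polygon_centroid(points)
    return ((cx - gx) ** 2 + (cy - gy) ** 2) ** 0.5

# Equal forces on a unit square put the CoP exactly at the centroid;
# loading one corner more shifts the CoP away from it.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
balanced = cop_centroid_distance(square, [1, 1, 1, 1])
skewed = cop_centroid_distance(square, [4, 1, 1, 1])
```

Redistributing forces across torso, arm, wrist, and fingers so this distance stays small is what keeps the contact stable under varying object mass.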

POIROT: Investigating Direct Tangible vs. Digitally Mediated Interaction and Attitude Moderation in Multi-party Murder Mystery Games

This study challenges the assumption that physical robot interaction universally enhances user experience by demonstrating that while tangible delivery does not inherently improve engagement, it significantly reduces narrative immersion for individuals with high negative attitudes toward robots, who instead benefit from digitally mediated interfaces as a social buffer.

Wen Chen, Rongxi Chen, Shankai Chen, Huiyang Gong, Minghui Guo, Yingri Xu, Xintong Wu, Xinyi Fu

UniGround: Universal 3D Visual Grounding via Training-Free Scene Parsing

UniGround introduces a novel, training-free framework for universal 3D visual grounding that leverages global candidate filtering and local precision reasoning to achieve state-of-the-art zero-shot performance in localizing arbitrary objects within complex 3D environments without relying on pre-trained models or 3D supervision.

Jiaxi Zhang, Yunheng Wang, Wei Lu, Taowen Wang, Weisheng Xu, Shuning Zhang, Yixiao Feng, Yuetong Fang, Renjing Xu
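The UniGround summary describes a coarse-to-fine pipeline: global candidate filtering followed by local precision reasoning. A minimal sketch of that two-stage structure, where the candidate strings, both scoring functions, and the shortlist size are toy assumptions standing in for the paper's 3D scene parsing:

```python
def ground(candidates, global_score, local_score, keep=3):
    """Two-stage grounding: cheap global filtering, then finer local reasoning."""
    # Stage 1: keep only the top-scoring candidates under the coarse scorer.
    shortlist = sorted(candidates, key=global_score, reverse=True)[:keep]
    # Stage 2: rank the survivors with the more precise (more expensive) scorer.
    return max(shortlist, key=local_score)

# Toy query "the red chair": the global stage keeps chair-like candidates,
# the local stage resolves the fine-grained attribute.
objects = ["chair", "red chair", "table", "lamp"]
result = ground(
    objects,
    global_score=lambda o: 1.0 if "chair" in o else 0.0,
    local_score=lambda o: 1.0 if "red" in o else 0.0,
    keep=2,
)
```

The design choice this illustrates is that the expensive local scorer only ever sees the shortlist, which is what makes a training-free pipeline tractable over complex scenes.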

Towards Human-Like Manipulation through RL-Augmented Teleoperation and Mixture-of-Dexterous-Experts VLA

This paper proposes an integrated framework combining RL-augmented teleoperation via the IMCopilot assistant and a Mixture-of-Dexterous-Experts VLA (MoDE-VLA) architecture to overcome data and learning bottlenecks, enabling robust human-like, contact-rich bimanual in-hand manipulation with significantly improved success rates.

Tutian Tang, Xingyu Ji, Wanli Xing, Ce Hao, Wenqiang Xu, Lin Shao, Cewu Lu, Qiaojun Yu, Jiangmiao Pang, Kaifeng Zhang