HarvestFlex: Strawberry Harvesting via Vision-Language-Action Policy Adaptation in the Wild

This paper introduces HarvestFlex, the first study demonstrating that vision-language-action policies can be successfully adapted to real-world greenhouse strawberry harvesting using a closed-loop system with three-view RGB sensing and minimal teleoperated data, achieving a 74.0% success rate without relying on depth sensors or explicit geometric calibration.

Ziyang Zhao, Shuheng Wang, Zhonghua Miao, Ya Xiong · 2026-03-09 · cs

MagRobot: An Open Simulator for Magnetically Navigated Robots

This paper introduces MagRobot, the first universal open-source simulation platform designed to overcome the cost and consistency challenges of experimental prototyping by enabling the efficient design, visualization, and benchmarking of magnetically navigated rigid and soft robots across diverse medical applications.

Heng Wang (South China University of Technology), Haoyu Song (South China University of Technology), Jiatao Zheng (South China University of Technology), Yuxiang Han (South China University of Technology), Kunli Wang (South China University of Technology) · 2026-03-09 · cs

Moving Through Clutter: Scaling Data Collection and Benchmarking for 3D Scene-Aware Humanoid Locomotion via Virtual Reality

This paper introduces Moving Through Clutter (MTC), an open-source Virtual Reality framework that addresses the lack of data for scene-aware humanoid locomotion by procedurally generating diverse 3D cluttered environments, capturing whole-body human motion, and providing a benchmarked dataset of 348 trajectories to advance robot navigation in complex, real-world settings.

Beichen Wang, Yuanjie Lu, Linji Wang, Liuchuan Yu, Xuesu Xiao · 2026-03-09 · cs

Devil is in Narrow Policy: Unleashing Exploration in Driving VLA Models

The paper introduces Curious-VLA, a two-stage framework that overcomes the exploration limitations of standard driving VLA models by employing Feasible Trajectory Expansion during imitation learning and Adaptive Diversity-Aware Sampling with a Spanning Driving Reward during reinforcement learning, achieving state-of-the-art performance on the Navsim benchmark.

Canyu Chen, Yuguang Yang, Zhewen Tan, Yizhi Wang, Ruiyi Zhan, Haiyan Liu, Xuanyao Mao, Jason Bao, Xinyue Tang, Linlin Yang, Bingchuan Sun, Yan Wang, Baochang Zhang · 2026-03-09 · cs

Transforming Omnidirectional RGB-LiDAR data into 3D Gaussian Splatting

This paper presents a novel pipeline that transforms archived omnidirectional RGB-LiDAR logs into robust 3D Gaussian Splatting initialization assets by addressing sensor distortion and data density challenges through ERP-to-cubemap conversion, color-stratified downsampling, and multi-modal registration, thereby enabling the creation of high-fidelity digital twins from standard, underutilized sensor data.

Semin Bae, Hansol Lim, Jongseong Brad Choi · 2026-03-09 · cs
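The ERP-to-cubemap conversion mentioned in the summary is a standard spherical-projection step. The paper's exact pipeline is not shown here; the sketch below only illustrates the generic math of mapping an equirectangular (ERP) pixel to a cube face, with the face labels and orientation conventions as assumptions:

```python
import numpy as np

def erp_pixel_to_cubeface(u, v, width, height):
    """Map an equirectangular (ERP) pixel to a cube face and in-face coords.

    Face labels and sign conventions here are illustrative assumptions,
    not the convention of any particular pipeline.
    """
    # Pixel -> spherical angles: longitude in [-pi, pi], latitude in [-pi/2, pi/2].
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    # Spherical angles -> unit direction vector.
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    # The dominant axis of the direction vector selects the cube face.
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        face = "+x" if x > 0 else "-x"
        a, b = (-z / ax if x > 0 else z / ax), -y / ax
    elif ay >= ax and ay >= az:
        face = "+y" if y > 0 else "-y"
        a, b = x / ay, (z / ay if y > 0 else -z / ay)
    else:
        face = "+z" if z > 0 else "-z"
        a, b = (x / az if z > 0 else -x / az), -y / az
    # (a, b) lie in [-1, 1] on the chosen face.
    return face, a, b
```

For example, the center pixel of the ERP image looks straight ahead, so it lands at the center of the forward face.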

Lifelong Embodied Navigation Learning

This paper introduces Uni-Walker, a lifelong embodied navigation framework that addresses catastrophic forgetting in large language model-based agents by decoupling navigation knowledge into shared and task-specific components using DE-LoRA, knowledge inheritance, and expert subspace orthogonality to enable continuous adaptation across diverse scenes and instruction styles.

Xudong Wang, Jiahua Dong, Baichen Liu, Qi Lyu, Lianqing Liu, Zhi Han · 2026-03-09 · cs.AI
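How DE-LoRA splits navigation knowledge into shared and task-specific components is not described in the summary, but the underlying LoRA mechanism it builds on is standard: freeze a pretrained weight and learn a low-rank additive update. A minimal sketch of that generic mechanism, with all dimensions and names illustrative:

```python
import numpy as np

class LoRALinear:
    """Frozen base linear layer plus a trainable low-rank update (generic LoRA).

    This only illustrates the standard adapter mechanism; how DE-LoRA
    decouples shared vs. task-specific adapters is the paper's contribution
    and is not reproduced here.
    """

    def __init__(self, in_dim, out_dim, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(out_dim, in_dim))      # frozen pretrained weight
        self.A = rng.normal(size=(rank, in_dim)) * 0.01  # trainable down-projection
        self.B = np.zeros((out_dim, rank))               # trainable up-projection, zero-init
        self.scale = alpha / rank

    def forward(self, x):
        # y = W x + (alpha/r) * B A x. With B zero-initialized, the adapter
        # starts as an exact no-op over the frozen base layer.
        return self.W @ x + self.scale * (self.B @ (self.A @ x))
```

The zero-initialized up-projection is what makes adapters safe to attach to a trained model: before any fine-tuning, the adapted layer reproduces the base layer exactly.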

Multimodal Behavior Tree Generation: A Small Vision-Language Model for Robot Task Planning

This paper proposes a method to fine-tune compact, open-source vision-language models (500M–4B parameters) to generate executable behavior trees for robotic task planning by constructing a novel dataset from existing robotic episodes, achieving an 87% success rate in household tasks that rivals state-of-the-art closed-source models while using significantly fewer computational resources.

Cristiano Battistini, Riccardo Andrea Izzo, Gianluca Bardaro, Matteo Matteucci · 2026-03-09 · cs
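The behavior trees this paper's models generate follow the usual composite semantics: a Sequence fails at the first failing child, a Fallback succeeds at the first succeeding child. A minimal interpreter for those two composites, with the example tree and node names purely illustrative (the paper's generated trees and tick semantics may differ):

```python
# Minimal behavior-tree interpreter: Sequence and Fallback composites over
# leaf actions that return "SUCCESS" or "FAILURE".

class Leaf:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self):
        return self.fn()

class Sequence:
    """Succeeds only if every child succeeds, ticked left to right."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() != "SUCCESS":
                return "FAILURE"
        return "SUCCESS"

class Fallback:
    """Returns success at the first child that succeeds."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == "SUCCESS":
                return "SUCCESS"
        return "FAILURE"

# Illustrative household task: try to grasp; if that fails, ask for help.
tree = Sequence(
    Leaf("locate_cup", lambda: "SUCCESS"),
    Fallback(
        Leaf("grasp_cup", lambda: "FAILURE"),
        Leaf("ask_for_help", lambda: "SUCCESS"),
    ),
)
```

The Fallback node is what gives behavior trees their graceful degradation: the failed grasp does not abort the task, it falls through to the recovery action.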

Sticky-Glance: Robust Intent Recognition for Human-Robot Collaboration via Single-Glance

The paper proposes "Sticky-Glance," an object-centric gaze grounding framework that achieves robust intent recognition for human-robot collaboration using minimal gaze samples by stabilizing attention through a sticky-glance algorithm, thereby significantly improving selection accuracy, tracking rates, and overall task efficiency compared to existing baselines.

Yuzhi Lai, Shenghai Yuan, Peizheng Li, Andreas Zell · 2026-03-09 · cs

Dual-Agent Multiple-Model Reinforcement Learning for Event-Triggered Human-Robot Co-Adaptation in Decoupled Task Spaces

This paper proposes a Dual-Agent Multiple-Model Reinforcement Learning (DAMMRL) framework for a shared-control 6-DoF rehabilitation robot that utilizes an event-triggered strategy to decouple human and robot tasks, dynamically optimizing co-adaptation by allowing the human to select speed-accuracy trade-offs while the robot adjusts its motion steps to suppress oscillations and improve task success rates.

Yaqi Li, Zhengqi Han, Huifang Liu, Steven W. Su · 2026-03-09 · cs

KISS-IMU: Self-supervised Inertial Odometry with Motion-balanced Learning and Uncertainty-aware Inference

KISS-IMU is a novel self-supervised inertial odometry framework that eliminates the need for ground truth data by leveraging LiDAR-based ICP registration and pose graph optimization as supervisory signals, while employing motion-balanced training and uncertainty-aware inference to ensure robustness across diverse robotic platforms and environments.

Jiwon Choi, Hogyun Kim, Geonmo Yang, Juhui Lee, Younggun Cho · 2026-03-09 · cs