cs.RO papers | Gist.Science

Caterpillar-Inspired Spring-Based Compressive Continuum Robot for Bristle-based Exploration

This paper presents a compact, spring-based, tendon-driven continuum robot inspired by caterpillar locomotion and equipped with artificial bristle sensors, which integrates with commercial robotic arms to enable effective, compliant exploration and surface perception in confined spaces with a mean position error of 4.32 mm.

Zhixian Hu, Yu She, Juan Wachs | Wed, 11 Ma | 💻 cs

Let's Reward Step-by-Step: Step-Aware Contrastive Alignment for Vision-Language Navigation in Continuous Environments

This paper introduces Step-Aware Contrastive Alignment (SACA), a novel framework that enhances Vision-Language Navigation in Continuous Environments by utilizing a perception-grounded auditor to extract dense, step-level supervision from imperfect trajectories, thereby overcoming the limitations of compounding errors in supervised fine-tuning and sparse rewards in reinforcement fine-tuning to achieve state-of-the-art performance.

Haoyuan Li, Rui Liu, Hehe Fan, Yi Yang | Wed, 11 Ma | 💻 cs

Robotic Scene Cloning: Advancing Zero-Shot Robotic Scene Adaptation in Manipulation via Visual Prompt Editing

This paper introduces Robotic Scene Cloning (RSC), a novel method that enhances zero-shot robotic manipulation by editing existing operation trajectories through visual prompting and condition injection to generate accurate, scene-consistent samples that significantly improve policy generalization in real-world environments.

Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Tiancai Wang, Chang Wen Chen, Haoqiang Fan, Zhenzhong Chen | Wed, 11 Ma | 💻 cs

DRIFT: Dual-Representation Inter-Fusion Transformer for Automated Driving Perception with 4D Radar Point Clouds

This paper introduces DRIFT, a dual-path Transformer model that effectively fuses fine-grained local and coarse-grained global features from sparse 4D radar point clouds to achieve state-of-the-art performance in automated driving perception tasks like object detection and free road estimation.

Siqi Pei, Andras Palffy, Dariu M. Gavrila | Wed, 11 Ma | 💻 cs

OTPL-VIO: Robust Visual-Inertial Odometry with Optimal Transport Line Association and Adaptive Uncertainty

This paper presents OTPL-VIO, a robust stereo visual-inertial odometry system that enhances performance in low-texture and illumination-challenging environments by employing a training-free deep descriptor with entropy-regularized optimal transport for line association and introducing adaptive uncertainty weighting to stabilize estimation.

Zikun Chen, Wentao Zhao, Yihe Niu, Tianchen Deng, Jingchuan Wang | Wed, 11 Ma | 💻 cs
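
For readers unfamiliar with the entropy-regularized optimal transport mentioned in the OTPL-VIO summary, the following is a minimal sketch of Sinkhorn iterations applied to a small matching problem. It is not the authors' implementation: the cost matrix (standing in for descriptor distances between line segments in two frames), the uniform marginals, and the names `sinkhorn` and `match_lines` are all assumptions made for illustration.

```python
import math


def sinkhorn(cost, epsilon=0.1, iterations=200):
    """Entropy-regularized OT plan for a cost matrix with uniform marginals."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low cost -> large kernel value.
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    a = [1.0 / n] * n  # assumed uniform row marginals
    b = [1.0 / m] * m  # assumed uniform column marginals
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def match_lines(cost, **kwargs):
    """Pick, for each row (line in frame A), the column with the most mass."""
    plan = sinkhorn(cost, **kwargs)
    return [max(range(len(row)), key=row.__getitem__) for row in plan]


# Hypothetical descriptor distances between 3 lines in each of two frames.
example_cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.9, 0.8, 0.1],
]
matches = match_lines(example_cost)
```

A smaller `epsilon` pushes the plan toward a hard one-to-one assignment, while a larger value yields softer, more ambiguous matches.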

A Generalized Voronoi Graph based Coverage Control Approach for Non-Convex Environment

This paper proposes a two-phase coverage control method based on the Generalized Voronoi Graph that achieves efficient multi-robot coverage in non-convex environments by first partitioning the region and balancing robot allocation according to sub-region quality, followed by a collaborative coverage phase with a new controller.

Zuyi Guo, Ronghao Zheng, Meiqin Liu, Senlin Zhang | Wed, 11 Ma | 💻 cs

Towards Terrain-Aware Safe Locomotion for Quadrupedal Robots Using Proprioceptive Sensing

This paper presents a proprioception-only framework for quadrupedal robots that combines a 2.5-D terrain estimation method with safety-critical control barrier functions to achieve robust state estimation and rigorous safety guarantees on uneven terrain.

Peiyu Yang, Jiatao Ding, Wei Pan, Claudio Semini, Cosimo Della Santina | Wed, 11 Ma | 💻 cs
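
As background for the control barrier functions named in the summary above, here is a toy one-dimensional safety filter. It is only a sketch under stated assumptions and says nothing about the paper's actual formulation: the system is a single integrator (x' = u), the barrier is h(x) = x_max - x, and the names `cbf_filter` and `simulate` are hypothetical. The CBF condition h' + alpha * h >= 0 reduces here to u <= alpha * (x_max - x), so the filter simply clips the nominal command.

```python
def cbf_filter(x, u_nominal, x_max, alpha):
    """Clip the nominal velocity so that h(x) = x_max - x stays non-negative.

    For x' = u, the CBF condition h' + alpha * h >= 0 becomes
    u <= alpha * (x_max - x).
    """
    u_bound = alpha * (x_max - x)
    return min(u_nominal, u_bound)


def simulate(x0, u_nominal, x_max, alpha=2.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of x' = u with the safety filter applied."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_nominal, x_max, alpha)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


# The nominal command keeps pushing toward the limit at x_max = 1.0.
trajectory = simulate(x0=0.0, u_nominal=1.0, x_max=1.0)
```

In this discrete rollout the state approaches the limit without crossing it as long as `alpha * dt` is at most 1.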

ReTac-ACT: A State-Gated Vision-Tactile Fusion Transformer for Precision Assembly

ReTac-ACT is a state-gated vision-tactile fusion transformer that achieves high-precision assembly in occluded, contact-rich environments by dynamically prioritizing tactile feedback through bidirectional cross-attention and proprioception-conditioned gating, outperforming vision-only baselines on the NIST Assembly Task Board M1 benchmark.

Minchi Ruan, LiangQing Zhou, Hongtong Li, Zongtao Wang, ZhaoMing Lu, Jianwei Zhang, Bin Fang | Wed, 11 Ma | 💻 cs

Trajectory Optimization for Self-Wrap-Aware Cable-Towed Planar Object Manipulation under Implicit Tension Constraints

This paper formulates cable-towed planar object manipulation as a routing-aware, tensioning-implicit trajectory optimization problem that leverages self-wrapping to dynamically redirect torque, proposing a relaxation hierarchy where the Implicit-Mode Relaxation (IMR) effectively exploits self-wrap for turning maneuvers without the conservatism of explicit routing decisions.

Yu Li, Amin Fakhari, Hamid Sadeghian | Wed, 11 Ma | 💻 cs

On the Cost of Evolving Task Specialization in Multi-Robot Systems

This study demonstrates that in multi-robot foraging scenarios with limited optimization budgets, evolving task-specialized controllers fails to improve efficiency and often underperforms compared to successfully optimized generalist behaviors.

Paolo Leopardi, Heiko Hamann, Jonas Kuckling, Tanja Katharina Kaiser | Wed, 11 Ma | 💻 cs

NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models

This paper introduces NS-VLA, a novel Neuro-Symbolic Vision-Language-Action framework that integrates symbolic encoding, solving, and online reinforcement learning to achieve superior data efficiency, zero-shot generalizability, and expanded exploration in robotic manipulation compared to existing methods.

Ziyue Zhu, Shangyang Wu, Shuai Zhao, Zhiqiu Zhao, Shengjie Li, Yi Wang, Fang Li, Haoran Luo | Wed, 11 Ma | 💻 cs

Beyond Short-Horizon: VQ-Memory for Robust Long-Horizon Manipulation in Non-Markovian Simulation Benchmarks

This paper introduces RuleSafe, a new long-horizon articulated manipulation benchmark featuring non-Markovian safe-unlocking tasks, and proposes VQ-Memory, a vector-quantized temporal representation that significantly enhances the planning, generalization, and efficiency of Vision-Language-Action models in complex robotic simulations.

Wang Honghui, Jing Zhi, Ao Jicong, Song Shiji, Li Xuelong, Huang Gao, Bai Chenjia | Wed, 11 Ma | 💻 cs
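
The summary above does not say how VQ-Memory builds its representation, so the snippet below only illustrates the generic vector-quantization step the name refers to: replacing each continuous feature vector with the index of its nearest codebook entry. The hand-written codebook, the feature values, and the names `quantize` and `encode_history` are assumptions for illustration; in practice a codebook would typically be learned.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(vector, codebook[k]))


def encode_history(features, codebook):
    """Map a sequence of per-step features to a sequence of discrete tokens."""
    return [quantize(f, codebook) for f in features]


# Hypothetical 2-D codebook and a short history of per-step features.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
history = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]
tokens = encode_history(history, codebook)
```

The resulting token sequence is a compact, discrete record of past steps, which is the kind of memory a non-Markovian task needs.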

Context-Nav: Context-Driven Exploration and Viewpoint-Aware 3D Spatial Reasoning for Instance Navigation

The paper presents Context-Nav, a training-free framework for text-goal instance navigation that combines caption-driven frontier ranking for global exploration with viewpoint-aware 3D spatial verification to accurately disambiguate target objects in cluttered environments, achieving state-of-the-art performance on InstanceNav and CoIN-Bench.

Won Shik Jang, Ue-Hwan Kim | Wed, 11 Ma | 💻 cs

StyleVLA: Driving Style-Aware Vision Language Action Model for Autonomous Driving

StyleVLA is a physics-informed Vision Language Action model built on Qwen3-VL-4B that generates diverse, kinematically feasible driving trajectories tailored to specific styles, significantly outperforming state-of-the-art proprietary models on domain-specific autonomous driving tasks.

Yuan Gao, Dengyuan Hua, Mattia Piccinini, Finn Rasmus Schäfer, Korbinian Moller, Lin Li, Johannes Betz | Wed, 11 Ma | 💻 cs

SEA-Nav: Efficient Policy Learning for Safe and Agile Quadruped Navigation in Cluttered Environments

The paper introduces SEA-Nav, a reinforcement learning framework that combines differentiable control barrier functions, adaptive collision replay, and kinematic constraints to enable quadruped robots to achieve safe, agile, and efficient navigation in densely cluttered environments with minute-level training time.

Shiyi Chen, Mingye Yang, Haiyan Mao, Jiaqi Zhang, Haiyi Liu, Shuheng He, Debing Zhang, Zihao Qiu, Chun Zhang | Wed, 11 Ma | 💻 cs

Stein Variational Ergodic Surface Coverage with SE(3) Constraints

This paper introduces a preconditioned SE(3) Stein Variational Gradient Descent framework that reformulates point-cloud surface coverage as a manifold-aware sampling problem, enabling robots to generate high-quality, SE(3)-constrained trajectories that outperform existing optimization-based and sampling-as-optimization methods in both simulation and real-world experiments.

Jiayun Li, Yufeng Jin, Sangli Teng, Dejian Gong, Georgia Chalvatzaki | Wed, 11 Ma | 💻 cs

Vision-Augmented On-Track System Identification for Autonomous Racing via Attention-Based Priors and Iterative Neural Correction

This paper proposes a vision-augmented, iterative system identification framework that combines a lightweight CNN for friction priors and an S4 model for temporal dynamics to overcome cold-start failures and improve real-time tire parameter estimation for autonomous racing.

Zhiping Wu, Cheng Hu, Yiqin Wang, Lei Xie, Hongye Su | Wed, 11 Ma | 💻 cs

NLiPsCalib: An Efficient Calibration Framework for High-Fidelity 3D Reconstruction of Curved Visuotactile Sensors

The paper presents NLiPsCalib, an efficient and physics-consistent calibration framework that utilizes Near-Light Photometric Stereo and controllable light sources to enable high-fidelity 3D reconstruction of curved visuotactile sensors through simple contacts with everyday objects, thereby overcoming the cost and complexity of existing methods.

Xuhao Qin, Feiyu Zhao, Yatao Leng, Runze Hu, Chenxi Xiao | Wed, 11 Ma | 💻 cs

CORAL: Scalable Multi-Task Robot Learning via LoRA Experts

CORAL is a scalable, embodiment-agnostic framework that mitigates multi-task interference and catastrophic forgetting in Vision-Language-Action models by freezing a shared backbone and dynamically routing language instructions to task-specific, lightweight LoRA experts with zero inference overhead.

Yuankai Luo, Woping Chen, Tong Liang, Zhenguo Li | Wed, 11 Ma | 💻 cs
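
The CORAL summary does not spell out how zero inference overhead is obtained. One common way LoRA adapters avoid extra cost is to fold the low-rank update into the frozen weight before running the model, and the sketch below shows only that general idea. The expert table, the exact-match `route` function, the matrix sizes, and the scaling `alpha / r` are assumptions for illustration, not the paper's design.

```python
def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def merge_lora(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def apply(W, x):
    """Matrix-vector product W @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


# Hypothetical experts keyed by instruction; each is an (A, B) pair of rank 1.
experts = {
    "pick up the cup": ([[1.0, 2.0]], [[0.5], [-1.0]]),
    "open the drawer": ([[0.0, 1.0]], [[1.0], [1.0]]),
}


def route(instruction):
    """Toy router: exact-match lookup of the expert for an instruction."""
    return experts[instruction]


W_frozen = [[1.0, 0.0], [0.0, 1.0]]  # shared backbone weight, never modified
A, B = route("pick up the cup")
W_task = merge_lora(W_frozen, A, B, alpha=2.0)
output = apply(W_task, [1.0, 1.0])
```

After merging, the layer is a single matrix product again, so running the task-specific model costs the same as running the backbone alone.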

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation

The paper introduces See, Plan, Rewind (SPR), a progress-aware vision-language-action framework that enhances robotic manipulation robustness by dynamically grounding instructions into spatial subgoals and enabling closed-loop error recovery through state rewinding, achieving state-of-the-art performance on challenging benchmarks without additional training.

Tingjun Dai, Mingfei Han, Tingwen Du, Zhiheng Liu, Zhihui Li, Salman Khan, Jun Yu, Xiaojun Chang | Wed, 11 Ma | 💻 cs
