cs.RO papers | Gist.Science

Inverse Resistive Force Theory (I-RFT): Learning granular properties through robot-terrain physical interactions

This paper introduces Inverse Resistive Force Theory (I-RFT), a physics-informed machine learning framework that enables robots to accurately estimate granular terrain properties from proprioceptive contact forces under arbitrary gait trajectories, thereby facilitating data-efficient environmental characterization and adaptive locomotion strategies.

Shipeng Liu, Feng Xue, Yifeng Zhang, Tarunika Ponnusamy, Feifei Qian | Tue, 10 Ma | 💻 cs

Toward Global Intent Inference for Human Motion by Inverse Reinforcement Learning

This paper demonstrates that a single, subject- and posture-agnostic time-varying cost function, efficiently estimated via the Minimal Observation Inverse Reinforcement Learning (MO-IRL) algorithm, can accurately predict human reaching movements by revealing a unified optimality principle dominated by joint-acceleration regulation.

Sarmad Mehrdad, Maxime Sabbah, Vincent Bonnet, Ludovic Righetti | Tue, 10 Ma | 🤖 cs.LG

MWM: Mobile World Models for Action-Conditioned Consistent Prediction

This paper introduces MWM, a mobile world model that enhances action-conditioned rollout consistency and inference efficiency for image-goal navigation through a novel two-stage training framework featuring Action-Conditioned Consistency post-training and Inference-Consistent State Distillation.

Han Yan, Zishang Xiang, Zeyu Zhang, Hao Tang | Tue, 10 Ma | 💻 cs

Preference-Conditioned Reinforcement Learning for Space-Time Efficient Online 3D Bin Packing

The paper introduces STEP, a preference-conditioned reinforcement learning framework that optimizes robotic 3D bin packing by explicitly balancing spatial efficiency against operational time, achieving a 44% reduction in execution time without compromising packing density.

Nikita Sarawgi, Omey M. Manyar, Fan Wang, Thinh H. Nguyen, Daniel Seita, Satyandra K. Gupta | Tue, 10 Ma | 💻 cs

Uncertainty Mitigation and Intent Inference: A Dual-Mode Human-Machine Joint Planning System

This paper proposes a dual-mode human-robot joint planning system that combines an LLM-assisted active elicitation mechanism, which closes task-relevant knowledge gaps, with real-time inference of latent human intent, significantly reducing interaction costs and execution time in open-world environments.

Zeyu Fang, Yuxin Lin, Cheng Liu, Beomyeol Yu, Zeyuan Yang, Rongqian Chen, Taeyoung Lee, Mahdi Imani, Tian Lan | Tue, 10 Ma | 💻 cs

Reasoning Knowledge-Gap in Drone Planning via LLM-based Active Elicitation

This paper introduces MINT, a novel framework that enhances human-AI drone collaboration by using large language models to actively elicit minimal, targeted information from operators to resolve environmental uncertainties, thereby significantly improving task success rates while reducing the need for frequent human intervention.

Zeyu Fang, Beomyeol Yu, Cheng Liu, Zeyuan Yang, Rongqian Chen, Yuxin Lin, Mahdi Imani, Tian Lan | Tue, 10 Ma | 💻 cs

Physics-infused Learning for Aerial Manipulator in Winds and Near-Wall Environments

This paper presents a unified control framework for aerial manipulators that integrates a physics-based blade-element model with a learning-based residual force estimator and online rotor-speed adaptation to achieve robust trajectory tracking and wall-contact operations in complex wind and near-wall environments.

Yiming Zhang, Junyi Geng | Tue, 10 Ma | 💻 cs

Relating Reinforcement Learning to Dynamic Programming-Based Planning

This paper bridges the gap between dynamic programming-based planning and reinforcement learning by developing a derandomized RL variant, mathematically analyzing the conditions under which their differing formulations (such as cost minimization versus reward maximization and goal termination versus infinite-horizon discounting) are equivalent, and advocating for the optimization of true cost over arbitrary parameters.

Filip V. Georgiev, Kalle G. Timperi, Basak Sakçak, Steven M. LaValle | Tue, 10 Ma | 💻 cs

Viewpoint-Agnostic Grasp Pipeline using VLM and Partial Observations

This paper presents an end-to-end, viewpoint-agnostic grasping pipeline for mobile legged manipulators that leverages vision-language models and partial observation compensation to achieve robust, language-guided object selection and safe execution in cluttered environments, outperforming view-dependent baselines with a 90% success rate.

Dilermando Almeida, Juliano Negri, Guilherme Lazzarini, Thiago H. Segreto, Ranulfo Bezerra, Ricardo V. Godoy, Marcelo Becker | Tue, 10 Ma | 🤖 cs.LG

Choose What to Observe: Task-Aware Semantic-Geometric Representations for Visuomotor Policy

This paper proposes a task-aware observation interface that canonicalizes raw RGB inputs into unified semantic-geometric representations using segmentation and depth injection, thereby significantly enhancing the robustness of visuomotor policies to out-of-distribution appearance shifts without requiring policy retraining.

Haoran Ding, Liang Ma, Yaxun Yang, Wen Yang, Tianyu Liu, Anqing Duan, Xiaodan Liang, Dezhen Song, Ivan Laptev, Yoshihiko Nakamura | Tue, 10 Ma | 💻 cs

Identifying Influential Actions in Human-Robot Interactions

This paper introduces a method using transfer entropy to identify influential robot actions during human-robot conversations, demonstrating its effectiveness in analyzing nonlinear interactions to improve robotic system design and adaptability.

Haoyang Jiang, Chenfei Xu, Yuya Okadome, Yukata Nakamura | Tue, 10 Ma | 💻 cs

RoboRouter: Training-Free Policy Routing for Robotic Manipulation

RoboRouter is a training-free framework that enhances robotic manipulation performance by intelligently routing diverse, off-the-shelf policies to the most suitable one for each task based on semantic representations and historical execution data, achieving significant success rate improvements in both simulation and real-world settings without requiring additional model training.

Yiteng Chen, Zhe Cao, Hongjia Ren, Chenjie Yang, Wenbo Li, Shiyi Wang, Yemin Wang, Li Zhang, Yanming Shao, Zhenjun Zhao, Huiping Zhuang, Qingyao Wu | Tue, 10 Ma | 💻 cs

NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving

NaviDriveVLM proposes a decoupled framework that separates high-level reasoning and motion planning using a large-scale Navigator and a lightweight Driver, achieving superior end-to-end performance on the nuScenes benchmark while reducing training costs and enhancing interpretability.

Ximeng Tao, Pardis Taghavi, Dimitar Filev, Reza Langari, Gaurav Pandey | Tue, 10 Ma | 🤖 cs.LG

DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models

DyQ-VLA is a dynamic quantization framework for Embodied Vision-Language-Action models that leverages real-time kinematic proxies to adaptively switch and allocate bit-widths, significantly reducing memory footprint and improving inference speed while maintaining near-original performance.

Zihao Zheng, Hangyu Cao, Sicheng Tian, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Zhaobo Zhang, Xuanzhe Liu, Donggang Cao, Hong Mei, Xiang Chen | Tue, 10 Ma | 🤖 cs.LG

Long-Short Term Agents for Pure-Vision Bronchoscopy Robotic Autonomy

This paper presents a vision-only autonomous bronchoscopy framework utilizing hierarchical long-short agents and a world-model critic to achieve accurate, sensor-free intraoperative navigation in preclinical models, demonstrating performance comparable to expert human operators.

Junyang Wu, Mingyi Luo, Fangfang Xie, Minghui Zhang, Hanxiao Zhang, Chunxi Zhang, Junhao Wang, Jiayuan Sun, Yun Gu, Guang-Zhong Yang | Tue, 10 Ma | 💻 cs

Omnidirectional Humanoid Locomotion on Stairs via Unsafe Stepping Penalty and Sparse LiDAR Elevation Mapping

This paper presents a robust framework for safe omnidirectional humanoid stair locomotion that combines a single-stage training strategy with dense unsafe stepping penalties and a refined sparse LiDAR elevation mapping system to achieve high success rates in both simulation and real-world deployments.

Yuzhi Jiang, Yujun Liang, Junhao Li, Han Ding, Lijun Zhu | Tue, 10 Ma | 💻 cs

Unified Structural-Hydrodynamic Modeling of Underwater Underactuated Mechanisms and Soft Robots

This paper proposes a trajectory-driven global optimization framework, inspired by CMA-ES, that enables unified, high-fidelity structural-hydrodynamic modeling of underwater underactuated and soft robotic systems by simultaneously identifying coupled internal and external parameters, achieving accurate real-to-sim consistency across diverse mechanisms without manual retuning.

Chenrui Zhang, Yiyuan Zhang, Yunfei Ye, Junkai Chen, Haozhe Wang, Cecilia Laschi | Tue, 10 Ma | 🔬 physics

RAPID: Redundancy-Aware and Compatibility-Optimal Edge-Cloud Partitioned Inference for Diverse VLA models

The paper introduces RAPID, a novel Edge-Cloud Collaborative inference framework designed to optimize the deployment of Vision Language Action models by addressing visual noise interference and step-wise task redundancy, thereby achieving up to a 1.73x speedup with minimal overhead.

Zihao Zheng, Sicheng Tian, Hangyu Cao, Chenyue Li, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Guojie Luo, Xiang Chen | Tue, 10 Ma | 💻 cs

VORL-EXPLORE: A Hybrid Learning Planning Approach to Multi-Robot Exploration in Dynamic Environments

VORL-EXPLORE is a hybrid learning and planning framework for multi-robot exploration in dynamic environments that couples task allocation with motion execution via a shared navigability fidelity signal, enabling adaptive arbitration between global and reactive policies to prevent bottlenecks and ensure robust, collision-free coverage.

Ning Liu, Sen Shen, Zheng Li, Sheng Liu, Dongkun Han, Shangke Lyu, Thomas Braunl | Tue, 10 Ma | 💻 cs

TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

TeamHOI is a decentralized framework that leverages a Transformer-based policy and a masked Adversarial Motion Prior strategy to enable a single unified policy to control scalable, physically realistic cooperative human-object interactions among any number of humanoid agents.

Stefan Lionar, Gim Hee Lee | Tue, 10 Ma | 💻 cs
