cs.RO papers | Gist.Science

Contact Coverage-Guided Exploration for General-Purpose Dexterous Manipulation

This paper proposes Contact Coverage-Guided Exploration (CCGE), a general-purpose exploration method that leverages contact state counters and energy-based rewards to guide dexterous hands in discovering diverse contact patterns, thereby significantly improving training efficiency and real-world transferability across complex manipulation tasks.

Zixuan Liu, Ruoyi Qiao, Chenrui Tie, Xuanwei Liu, Yunfan Lou, Chongkai Gao, Zhixuan Xu, Lin Shao | Thu, 12 Ma | 🤖 cs.AI
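
The summary above mentions contact state counters as an exploration signal. A minimal sketch of a generic count-based novelty bonus over binary contact patterns follows. It is an illustration, not the CCGE method: the class name `ContactCounter`, the `scale` parameter, the one-boolean-per-sensor encoding, and the 1/sqrt(count) bonus are all assumptions, and the energy-based reward is not modelled.

```python
from collections import Counter
from math import sqrt


class ContactCounter:
    """Hypothetical helper: counts visits to discretized contact states."""

    def __init__(self, scale=1.0):
        self.counts = Counter()
        self.scale = scale

    def bonus(self, contacts):
        # contacts: one truthy/falsy flag per (assumed) contact sensor.
        key = tuple(bool(c) for c in contacts)
        self.counts[key] += 1
        # Rarely seen contact patterns earn a larger bonus.
        return self.scale / sqrt(self.counts[key])


counter = ContactCounter()
first = counter.bonus([1, 0, 0])   # new pattern
second = counter.bonus([1, 0, 0])  # same pattern, seen twice
other = counter.bonus([0, 1, 1])   # a different, new pattern
```

Under these assumptions a repeated contact pattern earns a shrinking bonus while an unseen one earns the full `scale`, which is the usual way a count-based term pushes a policy toward states it has not visited.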

Learning Adaptive Force Control for Contact-Rich Sample Scraping with Heterogeneous Materials

This paper presents an adaptive force control framework combining a low-level Cartesian impedance controller with a high-level reinforcement learning agent to autonomously scrape heterogeneous materials from vial walls, successfully transferring from simulation to a real Franka robot and outperforming fixed-wrench baselines by 10.9%.

Cenk Cetin, Shreyas Pouli, Gabriella Pizzuto | Thu, 12 Ma | 💻 cs

PPGuide: Steering Diffusion Policies with Performance Predictive Guidance

This paper introduces PPGuide, a lightweight, classifier-based framework that steers pre-trained diffusion policies away from failure modes at inference time by using a self-supervised performance predictor to provide real-time guidance, thereby improving robustness across diverse robotic manipulation tasks.

Zixing Wang, Devesh K. Jha, Ahmed H. Qureshi, Diego Romeres | Thu, 12 Ma | 💻 cs

DynVLA: Learning World Dynamics for Action Reasoning in Autonomous Driving

DynVLA is a novel autonomous driving model that enhances decision-making by introducing "Dynamics CoT," a paradigm that employs a Dynamics Tokenizer to forecast compact, decoupled world dynamics before action generation, thereby outperforming existing textual and visual reasoning methods in accuracy and efficiency.

Shuyao Shang, Bing Zhan, Yunfei Yan, Yuqi Wang, Yingyan Li, Yasong An, Xiaoman Wang, Jierui Liu, Lu Hou, Lue Fan, Zhaoxiang Zhang, Tieniu Tan | Thu, 12 Ma | 💻 cs

OA-Bug: An Olfactory-Auditory Augmented Bug Algorithm for Swarm Robots in a Denied Environment

This paper proposes the Olfactory-Auditory augmented Bug algorithm (OA-Bug) for swarm robots to effectively explore denied environments without GNSS or central processing, demonstrating through simulations and real-world experiments that it achieves significantly higher search coverage (96.93%) compared to existing methods like SGBA.

Siqi Tan, Xiaoya Zhang, Jingyao Li, Ruitao Jing, Mufan Zhao, Yang Liu, Quan Quan | Mon, 09 Ma | 💻 cs

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

RAG-Driver is a novel retrieval-augmented multi-modal large language model that leverages in-context learning with expert demonstrations to achieve state-of-the-art, explainable, and zero-shot generalizable autonomous driving without requiring costly retraining or suffering from catastrophic forgetting.

Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, Matthew Gadd | Mon, 09 Ma | 🤖 cs.AI

FALCON: Future-Aware Learning with Contextual Object-Centric Pretraining for UAV Action Recognition

FALCON is a unified self-supervised pretraining framework for UAV action recognition that overcomes spatial imbalance in aerial footage by combining object-aware masked autoencoding with object-centric dual-horizon future reconstruction, achieving superior accuracy and faster inference without requiring additional preprocessing at test time.

Ruiqi Xian, Xiyang Wu, Tianrui Guan, Xijun Wang, Boqing Gong, Dinesh Manocha | Mon, 09 Ma | 🤖 cs.AI

Integrated Hierarchical Decision-Making in Inverse Kinematic Planning and Control

This paper introduces an efficient and accurate non-linear programming framework that integrates hierarchical decision-making with inverse kinematic planning and control by leveraging sparse structures and the $\ell_0$-norm to solve complex problems like simultaneous end-effector and grasp selection without relying on heavy mixed-integer computations.

Kai Pfeiffer, Quan Zhang, Yuqing Chen, Gordon Boateng, Yuquan Wang, Vincent Bonnet, Abderrahmane Kheddar | Mon, 09 Ma | 💻 cs

Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks

This paper introduces Generative Predictive Control, a supervised learning framework that leverages flow matching and sampling-based predictive control to enable high-frequency, dynamic robotic tasks by eliminating the need for difficult-to-obtain expert demonstrations.

Vince Kurtz, Joel W. Burdick | Mon, 09 Ma | 🤖 cs.AI

CAPS: Context-Aware Priority Sampling for Enhanced Imitation Learning in Autonomous Driving

This paper introduces Context-Aware Priority Sampling (CAPS), a novel imitation learning method that leverages VQ-VAEs to cluster and re-balance training data, thereby improving the generalization, driving score, and success rate of autonomous driving systems in CARLA simulations.

Hamidreza Mirkhani, Behzad Khamidehi, Ehsan Ahmadi, Mohammed Elmahgiubi, Weize Zhang, Fazel Arasteh, Umar Rajguru, Kasra Rezaee, Dongfeng Bai | Mon, 09 Ma | 🤖 cs.LG

Whole-Body Model-Predictive Control of Legged Robots with MuJoCo

This paper demonstrates that a simple iterative LQR algorithm using MuJoCo dynamics and finite-difference derivatives can achieve effective, real-time whole-body model-predictive control for quadruped and humanoid robots in the real world with minimal sim-to-real tuning, thereby lowering the barrier for future research.

John Z. Zhang, Taylor A. Howell, Zeji Yi, Chaoyi Pan, Guanya Shi, Guannan Qu, Tom Erez, Yuval Tassa, Zachary Manchester | Mon, 09 Ma | 💻 cs
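
The summary above refers to finite-difference derivatives of the simulated dynamics, which iterative LQR needs as the local linearization x_next ≈ A x + B u. A minimal sketch of central-difference Jacobians follows. It assumes a toy double integrator (`step`) in place of MuJoCo, and the function names and the `eps` step size are illustrative choices, not taken from the paper.

```python
def finite_difference_jacobians(f, x, u, eps=1e-6):
    """Central-difference A = df/dx and B = df/du for x_next = f(x, u)."""

    def jacobian(g, z):
        n_out = len(g(z))
        cols = []
        for i in range(len(z)):
            z_plus, z_minus = list(z), list(z)
            z_plus[i] += eps
            z_minus[i] -= eps
            g_plus, g_minus = g(z_plus), g(z_minus)
            cols.append([(g_plus[k] - g_minus[k]) / (2 * eps)
                         for k in range(n_out)])
        # cols[i][k] is d g_k / d z_i; transpose into row-major form.
        return [[cols[i][k] for i in range(len(z))] for k in range(n_out)]

    A = jacobian(lambda z: f(z, u), x)
    B = jacobian(lambda z: f(x, z), u)
    return A, B


def step(x, u, dt=0.01):
    # Toy double integrator standing in for a simulator step (assumption).
    pos, vel = x
    return [pos + dt * vel, vel + dt * u[0]]


A, B = finite_difference_jacobians(step, [0.0, 0.0], [0.0])
```

For this linear toy system the estimates should match the analytic matrices A = [[1, dt], [0, 1]] and B = [[0], [dt]] up to floating-point error. With a real simulator each perturbation costs one extra dynamics evaluation per state and control dimension.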

Graph-based Online Lidar Odometry with Retrospective Map Refinement

This paper presents a graph-based online Lidar odometry method that enhances trajectory estimation and map accuracy by registering scans against multiple overlapping submaps and performing retrospective refinement of their anchor points, achieving superior performance on automotive datasets while maintaining real-time operation.

Aaron Kurda, Simon Steuernagel, Marcus Baum | Mon, 09 Ma | 💻 cs

FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

FindAnything is an efficient, open-world mapping framework that integrates vision-language features into object-centric volumetric submaps to enable real-time, open-vocabulary semantic understanding of large-scale environments on resource-constrained robots.

Sebastián Barbas Laina, Simon Boche, Sotiris Papatheodorou, Simon Schaefer, Jaehyung Jung, Helen Oleynikova, Stefan Leutenegger | Mon, 09 Ma | 🤖 cs.AI

Robustness-Aware Tool Selection and Manipulation Planning with Learned Energy-Informed Guidance

This paper introduces a robustness-aware framework that jointly selects tools and plans contact-rich manipulation trajectories by leveraging an energy-based metric to optimize for disturbance resilience in robotic tool-use tasks.

Yifei Dong, Yan Zhang, Sylvain Calinon, Florian T. Pokorny | Mon, 09 Ma | 💻 cs

ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel

This paper proposes a structured methodology that integrates the Robot Operating System (ROS) with Model-Based Systems Engineering (MBSE) through a specialized SysML metamodel called MeROS and an adapted V-model, aiming to enhance the semantic coherence, structural traceability, and reliable coordination of complex heterogeneous robotic systems.

Tomasz Winiarski, Jan Kaniuka, Daniel Giełdowski, Jakub Ostrysz, Krystian Radlak, Dmytro Kushnir | Mon, 09 Ma | 💻 cs

Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL

This paper proposes a novel student-teacher framework for autonomous driving that utilizes a graph-based multi-agent RL teacher to automatically generate diverse, adaptive traffic curricula, enabling a student agent to achieve superior robustness and balanced driving performance compared to traditional rule-based approaches.

Ahmed Abouelazm, Johannes Ratz, Philip Schörner, J. Marius Zöllner | Mon, 09 Ma | 🤖 cs.LG

Bridging Simulation and Usability: A User-Friendly Framework for Scenario Generation in CARLA

This paper introduces an interactive, no-code framework with a graphical interface and graph-based representation to democratize scenario generation for autonomous driving validation in CARLA, enabling non-technical users to efficiently create, manage, and execute diverse test scenarios without programming expertise.

Ahmed Abouelazm, Mohammad Mahmoud, Conrad Walter, Oleksandr Shchetsura, Erne Hussong, Helen Gremmelmaier, J. Marius Zöllner | Mon, 09 Ma | 💻 cs

VEGA: Electric Vehicle Navigation Agent via Physics-Informed Neural Operator and Proximal Policy Optimization

VEGA is an electric vehicle navigation system that combines a physics-informed neural operator for real-time vehicle parameter estimation with a Proximal Policy Optimization agent for efficient, charge-aware route and charging stop planning, demonstrating superior inference speed and generalization across international road networks compared to traditional energy-aware baselines.

Hansol Lim, Minhyeok Im, Jonathan Boyack, Jee Won Lee, Jongseong Brad Choi | Mon, 09 Ma | 🤖 cs.LG

Language Conditioning Improves Accuracy of Aircraft Goal Prediction in Non-Towered Airspace

This paper presents a multimodal framework that integrates natural language understanding of pilot radio calls with trajectory data to significantly improve the accuracy of aircraft goal prediction in non-towered airspace compared to motion-only baselines.

Sundhar Vinodh Sangeetha, Chih-Yuan Chiu, Sarah H. Q. Li, Shreyas Kousik | Mon, 09 Ma | 💻 cs

GLIDE: A Coordinated Aerial-Ground Framework for Search and Rescue in Unknown Environments

The paper presents GLIDE, a cooperative search-and-rescue framework that pairs two specialized UAVs with a UGV to enable rapid, safe navigation in unknown environments through real-time victim detection, terrain scouting, and guided long-horizon planning.

Seth Farrell, Chenghao Li, Hesam Mojtahedi, Henrik I. Christensen | Mon, 09 Ma | 💻 cs
