cs.RO papers | Gist.Science

The Neural Compass: Probabilistic Relative Feature Fields for Robotic Search

This paper introduces ProReFF, a feature field model that learns relative object co-occurrence distributions from unlabeled observations to guide robotic search agents, achieving 20% higher efficiency than strong baselines and up to 80% of human performance in the Matterport3D simulator.

Gabriele Somaschini, Adrian Röfer, Abhinav Valada | 2026-03-10 | cs.LG

Interactive World Simulator for Robot Policy Training and Evaluation

This paper presents the Interactive World Simulator, a fast and physically consistent framework leveraging consistency models to generate high-fidelity long-horizon video predictions that enable scalable robot policy training and reliable real-world evaluation using solely simulated data.

Yixuan Wang, Rhythm Syed, Fangyu Wu, Mengchao Zhang, Aykut Onol, Jose Barreiros, Hooshang Nayyeri, Tony Dear, Huan Zhang, Yunzhu Li | 2026-03-10 | cs.LG

OA-Bug: An Olfactory-Auditory Augmented Bug Algorithm for Swarm Robots in a Denied Environment

This paper proposes the Olfactory-Auditory augmented Bug algorithm (OA-Bug), which lets swarm robots explore denied environments without GNSS or central processing, and demonstrates through simulations and real-world experiments that it achieves significantly higher search coverage (96.93%) than existing methods such as SGBA.

Siqi Tan, Xiaoya Zhang, Jingyao Li, Ruitao Jing, Mufan Zhao, Yang Liu, Quan Quan | 2026-03-09 | cs

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

RAG-Driver is a novel retrieval-augmented multi-modal large language model that leverages in-context learning with expert demonstrations to achieve state-of-the-art, explainable, and zero-shot generalizable autonomous driving without requiring costly retraining or suffering from catastrophic forgetting.

Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, Matthew Gadd | 2026-03-09 | cs.AI

FALCON: Future-Aware Learning with Contextual Object-Centric Pretraining for UAV Action Recognition

FALCON is a unified self-supervised pretraining framework for UAV action recognition that overcomes spatial imbalance in aerial footage by combining object-aware masked autoencoding with object-centric dual-horizon future reconstruction, achieving superior accuracy and faster inference without requiring additional preprocessing at test time.

Ruiqi Xian, Xiyang Wu, Tianrui Guan, Xijun Wang, Boqing Gong, Dinesh Manocha | 2026-03-09 | cs.AI

Integrated Hierarchical Decision-Making in Inverse Kinematic Planning and Control

This paper introduces an efficient and accurate non-linear programming framework that integrates hierarchical decision-making with inverse kinematic planning and control by leveraging sparse structures and the $\ell_0$-norm to solve complex problems like simultaneous end-effector and grasp selection without relying on heavy mixed-integer computations.

Kai Pfeiffer, Quan Zhang, Yuqing Chen, Gordon Boateng, Yuquan Wang, Vincent Bonnet, Abderrahmane Kheddar | 2026-03-09 | cs

Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks

This paper introduces Generative Predictive Control, a supervised learning framework that leverages flow matching and sampling-based predictive control to enable high-frequency, dynamic robotic tasks by eliminating the need for difficult-to-obtain expert demonstrations.

Vince Kurtz, Joel W. Burdick | 2026-03-09 | cs.AI

CAPS: Context-Aware Priority Sampling for Enhanced Imitation Learning in Autonomous Driving

This paper introduces Context-Aware Priority Sampling (CAPS), a novel imitation learning method that leverages VQ-VAEs to cluster and re-balance training data, thereby improving the generalization, driving score, and success rate of autonomous driving systems in CARLA simulations.

Hamidreza Mirkhani, Behzad Khamidehi, Ehsan Ahmadi, Mohammed Elmahgiubi, Weize Zhang, Fazel Arasteh, Umar Rajguru, Kasra Rezaee, Dongfeng Bai | 2026-03-09 | cs.LG

Whole-Body Model-Predictive Control of Legged Robots with MuJoCo

This paper demonstrates that a simple iterative LQR algorithm using MuJoCo dynamics and finite-difference derivatives can achieve effective, real-time whole-body model-predictive control for quadruped and humanoid robots in the real world with minimal sim-to-real tuning, thereby lowering the barrier for future research.

John Z. Zhang, Taylor A. Howell, Zeji Yi, Chaoyi Pan, Guanya Shi, Guannan Qu, Tom Erez, Yuval Tassa, Zachary Manchester | 2026-03-09 | cs

Graph-based Online Lidar Odometry with Retrospective Map Refinement

This paper presents a graph-based online Lidar odometry method that enhances trajectory estimation and map accuracy by registering scans against multiple overlapping submaps and performing retrospective refinement of their anchor points, achieving superior performance on automotive datasets while maintaining real-time operation.

Aaron Kurda, Simon Steuernagel, Marcus Baum | 2026-03-09 | cs

FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

FindAnything is an efficient, open-world mapping framework that integrates vision-language features into object-centric volumetric submaps to enable real-time, open-vocabulary semantic understanding of large-scale environments on resource-constrained robots.

Sebastián Barbas Laina, Simon Boche, Sotiris Papatheodorou, Simon Schaefer, Jaehyung Jung, Helen Oleynikova, Stefan Leutenegger | 2026-03-09 | cs.AI

Robustness-Aware Tool Selection and Manipulation Planning with Learned Energy-Informed Guidance

This paper introduces a robustness-aware framework that jointly selects tools and plans contact-rich manipulation trajectories by leveraging an energy-based metric to optimize for disturbance resilience in robotic tool-use tasks.

Yifei Dong, Yan Zhang, Sylvain Calinon, Florian T. Pokorny | 2026-03-09 | cs

ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel

This paper proposes a structured methodology that integrates the Robot Operating System (ROS) with Model-Based Systems Engineering (MBSE) through a specialized SysML metamodel called MeROS and an adapted V-model, aiming to enhance the semantic coherence, structural traceability, and reliable coordination of complex heterogeneous robotic systems.

Tomasz Winiarski, Jan Kaniuka, Daniel Giełdowski, Jakub Ostrysz, Krystian Radlak, Dmytro Kushnir | 2026-03-09 | cs

Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL

This paper proposes a novel student-teacher framework for autonomous driving that utilizes a graph-based multi-agent RL teacher to automatically generate diverse, adaptive traffic curricula, enabling a student agent to achieve superior robustness and balanced driving performance compared to traditional rule-based approaches.

Ahmed Abouelazm, Johannes Ratz, Philip Schörner, J. Marius Zöllner | 2026-03-09 | cs.LG

Bridging Simulation and Usability: A User-Friendly Framework for Scenario Generation in CARLA

This paper introduces an interactive, no-code framework with a graphical interface and graph-based representation to democratize scenario generation for autonomous driving validation in CARLA, enabling non-technical users to efficiently create, manage, and execute diverse test scenarios without programming expertise.

Ahmed Abouelazm, Mohammad Mahmoud, Conrad Walter, Oleksandr Shchetsura, Erne Hussong, Helen Gremmelmaier, J. Marius Zöllner | 2026-03-09 | cs

VEGA: Electric Vehicle Navigation Agent via Physics-Informed Neural Operator and Proximal Policy Optimization

VEGA is an electric vehicle navigation system that combines a physics-informed neural operator for real-time vehicle parameter estimation with a Proximal Policy Optimization agent for efficient, charge-aware route and charging stop planning, demonstrating superior inference speed and generalization across international road networks compared to traditional energy-aware baselines.

Hansol Lim, Minhyeok Im, Jonathan Boyack, Jee Won Lee, Jongseong Brad Choi | 2026-03-09 | cs.LG

Language Conditioning Improves Accuracy of Aircraft Goal Prediction in Non-Towered Airspace

This paper presents a multimodal framework that integrates natural language understanding of pilot radio calls with trajectory data to significantly improve the accuracy of aircraft goal prediction in non-towered airspace compared to motion-only baselines.

Sundhar Vinodh Sangeetha, Chih-Yuan Chiu, Sarah H. Q. Li, Shreyas Kousik | 2026-03-09 | cs

GLIDE: A Coordinated Aerial-Ground Framework for Search and Rescue in Unknown Environments

The paper presents GLIDE, a cooperative search-and-rescue framework that pairs two specialized UAVs with a UGV to enable rapid, safe navigation in unknown environments through real-time victim detection, terrain scouting, and guided long-horizon planning.

Seth Farrell, Chenghao Li, Hesam Mojtahedi, Henrik I. Christensen | 2026-03-09 | cs

Decision-Driven Semantic Object Exploration for Legged Robots via Confidence-Calibrated Perception and Topological Subgoal Selection

This paper presents a vision-based framework for legged robots that enables robust decision-driven semantic exploration by integrating confidence-calibrated perception, controlled-growth topological memory, and utility-driven subgoal selection to overcome the limitations of conventional geometry-centric navigation in open-world environments.

Guoyang Zhao, Yudong Li, Weiqing Qi, Kai Zhang, Bonan Liu, Kai Chen, Haoang Li, Jun Ma | 2026-03-09 | cs

Taxonomy-aware Dynamic Motion Generation on Hyperbolic Manifolds

This paper introduces GPHDM, a novel framework that extends Gaussian Process Dynamical Models to hyperbolic manifolds to generate physically consistent, human-like robot motions by preserving the hierarchical taxonomy and temporal dynamics of movement.

Luis Augenstein, Noémie Jaquier, Tamim Asfour, Leonel Rozo | 2026-03-09 | cs.LG
