cs.RO papers | Gist.Science

Real-time tightly coupled GNSS and IMU integration via Factor Graph Optimization

This paper presents a real-time, tightly coupled GNSS-IMU integration method based on factor graph optimization. It uses incremental optimization with fixed-lag marginalization to achieve robust, high-accuracy positioning in dense urban environments, as validated by experiments on the UrbanNav dataset.

Radu-Andrei Cioaca, Paul Irofti, Cristian Rusu + 3 more | 2026-03-05 | 🤖 cs.LG

MEM: Multi-Scale Embodied Memory for Vision Language Action Models

The paper introduces Multi-Scale Embodied Memory (MEM), a mixed-modal architecture that integrates video-based short-term and text-based long-term memory to enable vision-language-action models to effectively perform complex, long-horizon robotic tasks spanning up to fifteen minutes.

Marcel Torne, Karl Pertsch, Homer Walke + 14 more | 2026-03-05 | 🤖 cs.LG

UrbanHuRo: A Two-Layer Human-Robot Collaboration Framework for the Joint Optimization of Heterogeneous Urban Services

This paper proposes UrbanHuRo, a two-layer human-robot collaboration framework that jointly optimizes heterogeneous urban services like crowdsourced delivery and sensing through scalable order dispatch and deep reinforcement learning, achieving significant improvements in sensing coverage, courier income, and order timeliness.

Tonmoy Dey, Lin Jiang, Zheng Dong + 1 more | 2026-03-05 | 🤖 cs.AI

Large-Language-Model-Guided State Estimation for Partially Observable Task and Motion Planning

This paper presents CoCo-TAMP, a hierarchical state estimation framework that leverages large language models to incorporate common-sense knowledge about object locations and co-occurrence, significantly reducing planning and execution time for robots operating in partially observable environments.

Yoonwoo Kim, Raghav Arora, Roberto Martín-Martín + 3 more | 2026-03-05 | 🤖 cs.AI

HALyPO: Heterogeneous-Agent Lyapunov Policy Optimization for Human-Robot Collaboration

This paper proposes HALyPO, a novel multi-agent reinforcement learning framework that ensures stable and generalizable human-robot collaboration by enforcing Lyapunov-based stability conditions on policy parameters to bridge the rationality gap between heterogeneous agents.

Hao Zhang, Yaru Niu, Yikai Wang + 2 more | 2026-03-05 | 🤖 cs.AI

RAGNav: A Retrieval-Augmented Topological Reasoning Framework for Multi-Goal Visual-Language Navigation

RAGNav is a framework for Multi-Goal Visual-Language Navigation built on a Dual-Basis Memory system that combines topological maps with semantic forests. Using anchor-guided retrieval and neighbor score propagation, it overcomes spatial hallucinations and improves sequential planning efficiency, achieving state-of-the-art performance.

Ling Luo, Qianqian Bai | 2026-03-05 | 🤖 cs.AI

Interaction-Aware Whole-Body Control for Compliant Object Transport

This paper presents a bio-inspired, interaction-oriented whole-body control framework that combines a trajectory-optimized reference generator with a reinforcement learning policy trained via asymmetric teacher-student distillation. The framework enables assistive humanoids to maintain stable balance and compliant object transport in unstructured environments despite strong, time-varying interaction forces.

Hao Zhang, Yves Tseng, Ding Zhao + 1 more | 2026-03-05 | 🤖 cs.AI

Cognition to Control - Multi-Agent Learning for Human-Humanoid Collaborative Transport

This paper introduces Cognition-to-Control (C2C), a three-layer hierarchical framework that bridges high-level deliberation and low-level execution for human-humanoid collaborative transport. It integrates a VLM-based grounding layer, a decentralized multi-agent reinforcement learning coordination layer, and a whole-body control layer to achieve robust, stable, and adaptive joint manipulation.

Hao Zhang, Ding Zhao, H. Eric Tseng | 2026-03-05 | 🤖 cs.AI

Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning

This paper demonstrates that large-scale pretrained Vision-Language-Action models strongly resist catastrophic forgetting during continual learning. They often achieve zero forgetting with simple experience replay and recover skills rapidly through fine-tuning, behavior that differs fundamentally from that of smaller models trained from scratch.

Huihan Liu, Changyeon Kim, Bo Liu + 2 more | 2026-03-05 | 🤖 cs.AI

IROSA: Interactive Robot Skill Adaptation using Natural Language

This paper presents IROSA, a novel framework that leverages pre-trained large language models to enable open-vocabulary, safe, and interpretable robot skill adaptation for industrial tasks through a tool-based architecture that avoids direct model-to-robot interaction or fine-tuning.

Markus Knauer, Samuel Bustamante, Thomas Eiband + 3 more | 2026-03-05 | 🤖 cs.AI

RVN-Bench: A Benchmark for Reactive Visual Navigation

The paper introduces RVN-Bench, a new collision-aware benchmark built on Habitat 2.0 and HM3D scenes that enables the training and evaluation of safe, robust indoor visual navigation policies for mobile robots in unseen, cluttered environments.

Jaewon Lee, Jaeseok Heo, Gunmin Lee + 3 more | 2026-03-05 | 🤖 cs.AI

Right in Time: Reactive Reasoning in Regulated Traffic Spaces

This paper introduces a reactive mission design framework that combines Probabilistic Mission Design with Reactive Circuits to enable efficient, exact online probabilistic inference for autonomous agents in regulated traffic spaces. By dynamically re-evaluating only the components affected by changing sensor data, it achieves significant speedups over prior methods.

Simon Kohaut, Benedict Flade, Julian Eggert + 2 more | 2026-03-05 | 🤖 cs.AI

Self-adapting Robotic Agents through Online Continual Reinforcement Learning with World Model Feedback

This paper proposes a biologically inspired framework for online Continual Reinforcement Learning that leverages world model prediction residuals to automatically detect environmental changes and trigger self-adapting finetuning in robotic agents, enabling them to improve their performance during deployment without external supervision.

Fabian Domberg, Georg Schildbach | 2026-03-05 | 🤖 cs.AI

Sim2Sea: Sim-to-Real Policy Transfer for Maritime Vessel Navigation in Congested Waters

The paper proposes Sim2Sea, a framework comprising a GPU-accelerated simulator, a dual-stream spatiotemporal policy with safety-guided action masking, and targeted domain randomization. It enables zero-shot sim-to-real transfer of autonomous navigation for a 17-ton unmanned vessel in congested maritime waters.

Xinyu Cui, Xuanfa Jin, Xue Yan + 7 more | 2026-03-05 | 🤖 cs.AI

SaFeR: Safety-Critical Scenario Generation for Autonomous Driving Test via Feasibility-Constrained Token Resampling

SaFeR is a framework for generating safety-critical autonomous driving test scenarios that balances adversarial criticality, physical feasibility, and behavioral realism. It employs a Transformer-based realism prior with a differential attention mechanism and a feasibility-constrained token resampling strategy derived from offline reinforcement learning.

Jinlong Cui, Fenghua Liang, Guo Yang + 2 more | 2026-03-05 | 🤖 cs.AI

GarmentPile++: Affordance-Driven Cluttered Garments Retrieval with Vision-Language Reasoning

GarmentPile++ is a novel pipeline that integrates vision-language reasoning with visual affordance perception and dual-arm cooperation to enable safe, precise retrieval of single garments from cluttered piles, bridging the gap between single-garment manipulation research and real-world scenarios.

Mingleyang Li, Yuran Wang, Yue Chen + 6 more | 2026-03-05 | 🤖 cs.AI

Learning Hip Exoskeleton Control Policy via Predictive Neuromusculoskeletal Simulation

This paper presents a physics-based neuromusculoskeletal learning framework that trains a hip-exoskeleton control policy entirely in simulation using reinforcement learning and muscle-synergy priors. The policy transfers to hardware without motion-capture data or additional tuning and achieves significant reductions in muscle activation and joint power across diverse walking conditions.

Ilseung Park, Changseob Song, Inseung Kang | 2026-03-05 | 🤖 cs.LG

PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving

This paper introduces PRAM-R, a unified framework that uses an LLM-guided router and hierarchical memory within an asynchronous dual-loop architecture to dynamically optimize sensor modality usage. It significantly reduces computational cost and routing instability while maintaining high trajectory accuracy in autonomous driving.

Yi Zhang, Xian Zhang, Saisi Zhao + 4 more | 2026-03-05 | 🤖 cs.AI

VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

This paper introduces VANGUARD, a lightweight geometric perception tool for LLM-based UAV agents operating in GPS-denied environments. By using detected vehicles as environmental anchors, it accurately estimates Ground Sample Distance and recovers metric scale, significantly reducing spatial hallucinations and catastrophic failures compared to state-of-the-art vision-language models.

Yifei Chen, Xupeng Chen, Feng Wang + 2 more | 2026-03-05 | 🤖 cs.AI

RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots

This paper introduces RoboCasa365, a large-scale simulation framework featuring 365 everyday tasks across 2,500 diverse kitchen environments, together with extensive human and synthetic demonstration data. It is designed as a reproducible benchmark for evaluating and advancing generalist robot policies through systematic analysis of task diversity, dataset scale, and environment variation.

Soroush Nasiriany, Sepehr Nasiriany, Abhiram Maddukuri + 1 more | 2026-03-05 | 🤖 cs.AI
