Contact-Grounded Policy: Dexterous Visuotactile Policy with Generative Contact Grounding

The paper introduces Contact-Grounded Policy (CGP), a dexterous visuotactile manipulation framework that leverages generative contact grounding to predict robot states and tactile feedback, converting them into executable targets for compliance control across both simulated and real-world tasks.

Zhengtong Xu, Yeping Wang, Ben Abbatematteo, Jom Preechayasomboon, Sonny Chan, Nick Colonnese, Amirhossein H. Memar · 2026-03-09 · cs

Environment-Aware Path Generation for Robotic Additive Manufacturing of Structures

This paper proposes and evaluates an environment-aware path generation framework for robotic additive manufacturing that utilizes four distinct path planning algorithms to enable online structure design in dynamic, obstacle-rich environments, while establishing structural and computational metrics to identify the most effective planners for challenging construction scenarios.

Mahsa Rabiei, Reza Moini · 2026-03-09 · cs

EmboAlign: Aligning Video Generation with Compositional Constraints for Zero-Shot Manipulation

EmboAlign is a data-free framework that enhances zero-shot robotic manipulation by leveraging vision-language models to extract compositional constraints, which are then used to select physically plausible video generation rollouts and refine robot trajectories, thereby significantly improving task success rates without requiring task-specific training data.

Gehao Zhang, Zhenyang Ni, Payal Mohapatra, Han Liu, Ruohan Zhang, Qi Zhu · 2026-03-09 · cs

Multi-Robot Trajectory Planning via Constrained Bayesian Optimization and Local Cost Map Learning with STL-Based Conflict Resolution

This paper proposes a two-stage framework combining constrained Bayesian Optimization-based Tree search (cBOT) for efficient single-robot trajectory generation and an STL-enhanced Kinodynamic Conflict-Based Search (STL-KCBS) for scalable multi-robot coordination, effectively addressing motion planning under Signal Temporal Logic specifications and kinodynamic constraints with demonstrated improvements in efficiency, safety, and real-world applicability.

Sourav Raxit, Abdullah Al Redwan Newaz, Jose Fuentes, Paulo Padrao, Ana Cavalcanti, Leonardo Bobadilla · 2026-03-09 · cs

Task-Level Decisions to Gait-Level Control: A Hierarchical Policy Approach for Quadruped Navigation

This paper introduces TDGC, a hierarchical policy framework that bridges the gap between high-level navigation and low-level gait control for quadrupeds by using a task-level decision module to generate adaptable low-level targets, thereby improving robustness, sim-to-real transfer, and performance on mixed and out-of-distribution terrains.

Sijia Li, Haoyu Wang, Shenghai Yuan, Yizhuo Yang, Thien-Minh Nguyen · 2026-03-09 · cs

OpenHEART: Opening Heterogeneous Articulated Objects with a Legged Manipulator

This paper presents OpenHEART, a robust and sample-efficient framework that enables legged manipulators to open diverse heterogeneous articulated objects, combining Sampling-based Abstracted Feature Extraction (SAFE) for compact geometric encoding with an Articulation Information Estimator (ArtIEst) for adaptive state estimation.

Seonghyeon Lim, Hyeonwoo Lee, Seunghyun Lee, I Made Aswin Nahrendra, Hyun Myung · 2026-03-09 · cs

Expert Knowledge-driven Reinforcement Learning for Autonomous Racing via Trajectory Guidance and Dynamics Constraints

This paper proposes TraD-RL, a reinforcement learning framework for autonomous racing that integrates expert trajectory guidance, control barrier function-based safety constraints, and a multi-stage curriculum learning strategy, achieving stable training, safe operation, and performance surpassing expert level in highly dynamic environments.

Bo Leng, Weiqi Zhang, Zhuoren Li, Lu Xiong, Guizhe Jin, Ran Yu, Chen Lv · 2026-03-09 · cs

AnyCamVLA: Zero-Shot Camera Adaptation for Viewpoint-Robust Vision-Language-Action Models

The paper proposes AnyCamVLA, a zero-shot framework that enhances the viewpoint robustness of pre-trained Vision-Language-Action models by virtually synthesizing test-time camera observations to match training configurations, thereby eliminating the need for fine-tuning, additional data, or architectural changes.

Hyeongjun Heo, Seungyeon Woo, Sang Min Kim, Junho Kim, Junho Lee, Yonghyeon Lee, Young Min Kim · 2026-03-09 · cs