ULTRA: Unified Multimodal Control for Autonomous Humanoid Whole-Body Loco-Manipulation
The paper introduces ULTRA, a unified framework that combines physics-driven neural retargeting with a multimodal reinforcement learning controller. The framework enables autonomous, goal-conditioned whole-body loco-manipulation on humanoids from sparse high-level task specifications and noisy visual inputs, eliminating the need for predefined motion references.
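
The summary does not describe the controller's input interface, so the following is only an illustrative sketch of how a goal-conditioned policy might accept several optional modalities at once. It assumes a fixed-size observation vector in which missing inputs are zero-filled and flagged by a presence mask. Every name here (`SparseGoal`, `build_observation`, `add_sensor_noise`, `max_points`) is hypothetical and not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SparseGoal:
    """Hypothetical sparse task specification: a 3D target in the robot frame."""
    target_xyz: Sequence[float]


def add_sensor_noise(points: Sequence[Sequence[float]], sigma: float,
                     rng: random.Random) -> List[List[float]]:
    """Perturb each coordinate with zero-mean Gaussian noise (assumed noise model)."""
    return [[c + rng.gauss(0.0, sigma) for c in p] for p in points]


def build_observation(proprio: Sequence[float],
                      goal: Optional[SparseGoal] = None,
                      visual_points: Optional[Sequence[Sequence[float]]] = None,
                      max_points: int = 4) -> List[float]:
    """Pack available modalities into one fixed-size vector plus a presence mask.

    Layout: proprioception | goal (3) | visual points (3 * max_points) | mask (2).
    Absent modalities are zero-filled so the policy input size never changes.
    """
    goal_vec = list(goal.target_xyz) if goal is not None else [0.0] * 3

    # Truncate to max_points, flatten, then zero-pad to a constant length.
    pts = list(visual_points or [])[:max_points]
    flat = [c for p in pts for c in p]
    flat += [0.0] * (3 * max_points - len(flat))

    # The mask tells the policy which modalities are actually present.
    mask = [1.0 if goal is not None else 0.0,
            1.0 if visual_points else 0.0]
    return list(proprio) + goal_vec + flat + mask


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = add_sensor_noise([[0.5, 0.0, 0.8], [0.6, 0.1, 0.8]], sigma=0.01, rng=rng)
    obs = build_observation([0.0, 0.1, -0.2, 0.3], SparseGoal([0.5, 0.0, 0.8]), cloud)
    print(len(obs))
```

Zero-filling with an explicit mask is one common way to let a single policy handle any subset of inputs; whether ULTRA uses this or a different fusion scheme is not stated in the summary.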