Multimodal Adversarial Quality Policy for Safe Grasping

This paper proposes the Multimodal Adversarial Quality Policy (MAQP), a framework that enhances safe robot grasping in human-robot interaction by introducing a Heterogeneous Dual-Patch Optimization Scheme and a Gradient-Level Modality Balancing Strategy to effectively generate multimodal adversarial patches that address distribution discrepancies and optimization imbalances between RGB and depth modalities.

Kunlin Xie, Chenghao Li, Haolan Zhang, Nak Young Chong | Wed, 11 Ma | cs

A 26-Gram Butterfly-Inspired Robot Achieving Autonomous Tailless Flight

This paper introduces AirPulse, a 26-gram butterfly-inspired robot that achieves the first autonomous, closed-loop tailless flight at this scale by replicating low-frequency, high-amplitude biomechanical traits through a hierarchical control architecture featuring Stroke Timing Asymmetry Rhythm (STAR).

Weibin Gu, Chenrui Feng, Lian Liu, Chen Yang, Xingchi Jiao, Yuhe Ding, Xiaofei Shi, Chao Gao, Alessandro Rizzo, Guyue Zhou | Wed, 11 Ma | cs

UniBYD: A Unified Framework for Learning Robotic Manipulation Across Embodiments Beyond Imitation of Human Demonstrations

UniBYD is a unified framework that leverages a unified morphological representation and a dynamic reinforcement learning algorithm with a hybrid shadow engine to bridge the embodiment gap, enabling robotic hands to transcend human imitation and discover manipulation policies optimally adapted to their own physical morphologies.

Tingyu Yuan, Biaoliang Guan, Wen Ye, Ziyan Tian, Yi Yang, Weijie Zhou, Zhaowen Li, Yan Huang, Peng Wang, Chaoyang Zhao, Jinqiao Wang | Wed, 11 Ma | cs

Bootstrap Dynamic-Aware 3D Visual Representation for Scalable Robot Learning

The paper introduces AFRO, a self-supervised framework that learns dynamics-aware 3D visual representations by modeling state-action-state transitions via a generative diffusion process, thereby significantly improving robotic manipulation performance across diverse simulated and real-world tasks without requiring explicit action or reconstruction supervision.

Qiwei Liang, Boyang Cai, Minghao Lai, Sitong Zhuang, Tao Lin, Yan Qin, Yixuan Ye, Jiaming Liang, Renjing Xu | Wed, 11 Ma | cs

Revisiting Replanning from Scratch: Real-Time Incremental Planning with Fast Almost-Surely Asymptotically Optimal Planners

This paper challenges the conventional assumption that reactive replanning requires updating existing plans by demonstrating that using fast almost-surely asymptotically optimal (ASAO) algorithms to solve a series of independent planning problems offers a more efficient and effective approach for navigating changing environments.

Mitchell E. C. Sabbadini, Andrew H. Liu, Joseph Ruan, Tyler S. Wilson, Zachary Kingston, Jonathan D. Gammell | Wed, 11 Ma | cs

Automated Coral Spawn Monitoring for Reef Restoration: The Coral Spawn and Larvae Imaging Camera System (CSLICS)

This paper introduces the Coral Spawn and Larvae Imaging Camera System (CSLICS), an automated, low-cost computer vision solution that significantly reduces labor-intensive manual counting while accurately monitoring coral spawn and larvae to enhance reef restoration efforts.

Dorian Tsai, Christopher A. Brunner, Riki Lamont, F. Mikaela Nordborg, Andrea Severati, Java Terry, Karen Jackel, Matthew Dunbabin, Tobias Fischer, Scarlett Raine | Wed, 11 Ma | cs

Unveiling the Potential of iMarkers: Invisible Fiducial Markers for Advanced Robotics

This paper introduces iMarkers, a novel class of invisible fiducial markers detectable only by robots and AR devices, which overcome the visual aesthetic limitations of traditional markers while offering customizable production, robust detection algorithms, and proven effectiveness across diverse robotics scenarios.

Ali Tourani, Deniz Isinsu Avsar, Hriday Bavle, Jose Luis Sanchez-Lopez, Jan Lagerwall, Holger Voos | Wed, 11 Ma | cs

Open-World Task and Motion Planning via Vision-Language Model Generated Constraints

The paper introduces OWL-TAMP, a novel framework that integrates Vision-Language Models into Task and Motion Planning systems to generate language-parameterized discrete and continuous constraints, enabling robots to solve complex, long-horizon manipulation tasks specified in natural language within open-world environments.

Nishanth Kumar, William Shen, Fabio Ramos, Dieter Fox, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Caelan Reed Garrett | Wed, 11 Ma | cs

TiPToP: A Modular Open-Vocabulary Planning System for Robotic Manipulation

TiPToP is a modular, open-vocabulary robotic planning system that integrates pretrained vision foundation models with a Task and Motion Planner to solve multi-step manipulation tasks from RGB images and natural language instructions without requiring any robot-specific training data, achieving performance comparable to or better than fine-tuned vision-language-action models while enabling detailed failure mode analysis.

William Shen, Nishanth Kumar, Sahit Chintalapudi, Jie Wang, Christopher Watson, Edward Hu, Jing Cao, Dinesh Jayaraman, Leslie Pack Kaelbling, Tomás Lozano-Pérez | Wed, 11 Ma | cs

Kinodynamic Motion Retargeting for Humanoid Locomotion via Multi-Contact Whole-Body Trajectory Optimization

This paper introduces KDMR, a novel framework that formulates humanoid motion retargeting as a multi-contact whole-body trajectory optimization problem incorporating rigid-body dynamics and ground reaction forces to generate physically consistent, dynamically feasible locomotion trajectories that significantly outperform purely kinematic methods in both motion quality and downstream control policy performance.

Xiaoyu Zhang, Steven Haener, Varun Madabushi, Maegan Tucker | Wed, 11 Ma | cs

Robust Cooperative Localization in Featureless Environments: A Comparative Study of DCL, StCL, CCL, CI, and Standard-CL

This paper presents a comparative study of five cooperative localization algorithms in featureless, GPS-denied environments, revealing that while Sequential and Standard methods offer high accuracy at the cost of filter inconsistency, Covariance Intersection provides the most balanced trade-off between accuracy and robustness for safety-critical applications.

Nivand Khosravi, Meysam Basiri, Rodrigo Ventura | Wed, 11 Ma | cs
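
For readers unfamiliar with Covariance Intersection (CI), the fusion rule mentioned in the summary above, here is a minimal textbook sketch. It is not taken from the paper: the function names, the 2D state, and the grid search over the weight `w` (minimizing the trace of the fused covariance) are all assumptions chosen for illustration. CI fuses two estimates whose cross-correlation is unknown by taking a convex combination of their information matrices.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists (hypothetical helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def covariance_intersection(xa, Pa, xb, Pb, steps=100):
    """Fuse two 2D estimates (xa, Pa) and (xb, Pb) with unknown correlation.

    Assumed illustrative setup: w is chosen by a simple grid search
    that minimizes the trace of the fused covariance.
    """
    Ia, Ib = inv2(Pa), inv2(Pb)
    best = None
    for k in range(steps + 1):
        w = k / steps
        # Fused information matrix: w * Pa^-1 + (1 - w) * Pb^-1
        info = [[w * Ia[i][j] + (1 - w) * Ib[i][j] for j in range(2)]
                for i in range(2)]
        P = inv2(info)
        trace = P[0][0] + P[1][1]
        if best is None or trace < best[0]:
            best = (trace, w, P)
    _, w, P = best
    # Fused information vector: w * Pa^-1 xa + (1 - w) * Pb^-1 xb
    y = [w * sum(Ia[i][j] * xa[j] for j in range(2))
         + (1 - w) * sum(Ib[i][j] * xb[j] for j in range(2))
         for i in range(2)]
    # Fused mean: P * y
    x = [sum(P[i][j] * y[j] for j in range(2)) for i in range(2)]
    return x, P, w


if __name__ == "__main__":
    # Two estimates, each confident along a different axis.
    x, P, w = covariance_intersection(
        [0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]],
        [1.0, 1.0], [[4.0, 0.0], [0.0, 1.0]],
    )
    print(x, P, w)
```

Because the weight stays in [0, 1], the fused covariance remains a conservative bound regardless of the true correlation between the two inputs, which is the property usually cited for CI in safety-critical settings.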

TIMID: Time-Dependent Mistake Detection in Videos of Robot Executions

This paper introduces TIMID, a weakly supervised video anomaly detection framework that leverages task and mistake prompts to detect complex, time-dependent errors in robot executions, addressing the limitations of existing models and out-of-the-box VLMs through a novel multi-robot simulation dataset for zero-shot evaluation.

Nerea Gallego (University of Zaragoza), Fernando Salanova (University of Zaragoza), Claudio Mannarano (University of Zaragoza, University of Torino), Cristian Mahulea (University of Zaragoza), Eduardo Montijano (University of Zaragoza) | Wed, 11 Ma | cs

MuxGel: Simultaneous Dual-Modal Visuo-Tactile Sensing via Spatial Multiplexing and Deep Reconstruction

MuxGel is a spatially multiplexed visuo-tactile sensor that overcomes the opacity trade-off in existing GelSight-style devices by using a checkerboard coating to simultaneously capture pre-contact vision and post-contact tactile signals through a single camera, with high-fidelity reconstruction achieved via a deep learning framework.

Zhixian Hu, Zhengtong Xu, Sheeraz Athar, Juan Wachs, Yu She | Wed, 11 Ma | cs