Overcoming Visual Clutter in Vision Language Action Models via Concept-Gated Visual Distillation
This paper introduces Concept-Gated Visual Distillation (CGVD), a training-free, model-agnostic inference framework that addresses the "Precision-Reasoning Gap" in Vision-Language-Action (VLA) models. CGVD parses the language instruction to identify task-irrelevant distractors, then applies Fourier-based inpainting to remove them and produce a clean observation for the policy. This significantly improves robotic manipulation success rates in highly cluttered environments.
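To make the inpainting step concrete, the sketch below shows one simple way a "Fourier-based inpainting" operation could work: masked distractor pixels are iteratively replaced with a low-frequency reconstruction of the rest of the scene. This is an illustrative assumption, not the paper's actual algorithm; the function name `fourier_inpaint`, the mask source, and all parameters are hypothetical, and the distractor mask is assumed to come from the instruction-parsing stage.

```python
import numpy as np

def fourier_inpaint(image, mask, keep_frac=0.05, iters=20):
    """Fill masked (distractor) pixels with a low-frequency Fourier
    reconstruction of the unmasked scene. Illustrative sketch only."""
    filled = image.astype(float).copy()
    filled[mask] = filled[~mask].mean()  # crude initialization
    h, w = image.shape
    # Low-pass filter: keep only the lowest spatial frequencies.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lowpass = (np.abs(fy) < keep_frac) & (np.abs(fx) < keep_frac)
    for _ in range(iters):
        spectrum = np.fft.fft2(filled)
        smooth = np.fft.ifft2(spectrum * lowpass).real
        filled[mask] = smooth[mask]  # only masked pixels are replaced
    return filled

# Toy usage: a smooth gradient "scene" with a bright square distractor.
scene = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
cluttered = scene.copy()
cluttered[20:30, 20:30] = 5.0           # distractor patch
mask = np.zeros_like(scene, dtype=bool)
mask[20:30, 20:30] = True               # assumed output of instruction parsing
clean = fourier_inpaint(cluttered, mask)
```

After inpainting, the masked region no longer contains the bright distractor but a smooth fill consistent with the surrounding scene, while all unmasked pixels are left untouched, which is the behavior a "clean observation" stage would need before the image is passed to the VLA policy.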