Learning Bimanual Cloth Manipulation with Vision-based Tactile Sensing via Single Robotic Arm

This paper introduces Touch G.O.G., a cost-effective single-arm framework utilizing a novel vision-based tactile gripper and deep learning models to achieve high-precision bimanual cloth manipulation, including reliable unfolding of crumpled fabrics, by overcoming the challenges of deformable object handling and occlusion.

Dongmyoung Lee, Wei Chen, Xiaoshuai Chen, Rui Zong, Petar Kormushev · Thu, 12 Ma · cs

AdaClearGrasp: Learning Adaptive Clearing for Zero-Shot Robust Dexterous Grasping in Densely Cluttered Environments

AdaClearGrasp is a closed-loop framework that combines a vision-language model for adaptive decision-making between direct grasping and object clearing with a reinforcement learning policy for zero-shot dexterous manipulation, significantly improving success rates in densely cluttered environments.

Zixuan Chen, Wenquan Zhang, Jing Fang, Ruiming Zeng, Zhixuan Xu, Yiwen Hou, Xinke Wang, Jieqi Shi, Jing Huo, Yang Gao · Thu, 12 Ma · cs

Interleaving Scheduling and Motion Planning with Incremental Learning of Symbolic Space-Time Motion Abstractions

This paper proposes a novel framework that interleaves task scheduling and motion planning through an incremental learning loop, where symbolic feedback from motion feasibility checks guides the scheduler to generate efficient, collision-free plans for multi-object navigation in shared workspaces.

Elisa Tosello, Arthur Bit-Monnot, Davide Lusuardi, Alessandro Valentini, Andrea Micheli · Thu, 12 Ma · cs.AI

Dynamic Modeling and Attitude Control of a Reaction-Wheel-Based Low-Gravity Bipedal Hopper

This paper presents a dynamic model and control strategy for an underactuated bipedal hopping robot that utilizes an internal reaction wheel to stabilize body posture during ballistic flight under low-gravity conditions, successfully reducing mid-air angular deviation by over 65% and ensuring precise upright landings in lunar gravity simulations.

Shriram Hari, M Venkata Sai Nikhil, R Prasanth Kumar · Thu, 12 Ma · eess
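The core mechanism in the hopper paper — stabilizing body attitude during ballistic flight by torquing an internal reaction wheel, so the reaction torque rotates the body — can be illustrated with a planar toy model. Everything below (inertias, gains, the PD law, the simple Euler integration) is an illustrative assumption, not the paper's actual dynamics model or controller:

```python
# Toy planar model: a body of inertia I_b carries a reaction wheel of
# inertia I_w. In ballistic flight the motor torque tau acts on the
# wheel and -tau reacts on the body, so body attitude can be regulated
# without any external torque. All values are hypothetical.
I_b, I_w = 0.5, 0.02          # body / wheel inertia [kg*m^2]
dt, T = 0.001, 2.0            # integration step and duration [s]
kp, kd = 8.0, 2.5             # PD gains on body attitude

theta, omega_b, omega_w = 0.3, 0.0, 0.0   # start with a 0.3 rad tilt

for _ in range(int(T / dt)):
    tau = kp * theta + kd * omega_b       # motor torque on the wheel
    omega_b += (-tau / I_b) * dt          # reaction torque on the body
    omega_w += (tau / I_w) * dt           # wheel spins up in exchange
    theta += omega_b * dt

# With no external torque, total angular momentum is conserved:
L_total = I_b * omega_b + I_w * omega_w
print(theta, L_total)
```

The tilt decays toward zero while `L_total` stays at its initial value (here zero): the wheel simply trades angular momentum with the body, which is exactly why this actuation works in free flight.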

OnFly: Onboard Zero-Shot Aerial Vision-Language Navigation toward Safety and Efficiency

OnFly is a fully onboard, real-time framework for zero-shot aerial vision-language navigation that employs a shared-perception dual-agent architecture, hybrid memory, and semantic-geometric verification to overcome existing limitations in decision stability and safety-efficiency trade-offs, achieving a task success rate of 67.8% in simulation.

Guiyong Zheng, Yueting Ban, Mingjie Zhang, Juepeng Zheng, Boyu Zhou · Thu, 12 Ma · cs

Parallel-in-Time Nonlinear Optimal Control via GPU-native Sequential Convex Programming

This paper presents a fully GPU-native trajectory optimization framework that leverages sequential convex programming and consensus-based ADMM with temporal splitting to achieve real-time, high-throughput nonlinear optimal control for autonomous systems, demonstrating significant speedups and energy efficiency over CPU baselines while enabling scalable multi-trajectory and robust Model Predictive Control.

Yilin Zou, Zhong Zhang, Fanghua Jiang · Thu, 12 Ma · eess
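The consensus-ADMM-with-temporal-splitting idea above can be sketched in miniature: a shared trajectory knot is duplicated into two window-local copies, each window solves its own subproblem (in the real solver, in parallel on the GPU), and a consensus variable with scaled duals stitches the windows back together. The scalar costs and penalty below are illustrative assumptions, not the paper's solver:

```python
# Toy consensus ADMM: two trajectory windows each hold a copy (x1, x2)
# of a shared boundary state; z is the consensus value, u1/u2 are
# scaled dual variables. Each window's cost is a simple quadratic.
a, b = 1.0, 3.0      # window-local targets for the shared knot
rho = 1.0            # ADMM penalty parameter
x1 = x2 = z = 0.0
u1 = u2 = 0.0

for _ in range(50):
    # window subproblems: argmin_x (x - a)^2 + (rho/2)(x - z + u)^2,
    # which has the closed form below for a quadratic local cost
    x1 = (2 * a + rho * (z - u1)) / (2 + rho)
    x2 = (2 * b + rho * (z - u2)) / (2 + rho)
    z = 0.5 * ((x1 + u1) + (x2 + u2))   # consensus (averaging) step
    u1 += x1 - z                        # dual updates
    u2 += x2 - z

print(z, x1, x2)
```

Both window copies converge to the same knot value (here the minimizer of the summed cost, 2.0), which is what lets independently solved trajectory segments agree on a single continuous trajectory.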

FutureVLA: Joint Visuomotor Prediction for Vision-Language-Action Model

FutureVLA is a novel framework that enhances Vision-Language-Action models by introducing a Joint Visuomotor Predictive Architecture with a gating mechanism to decouple visual state preservation from temporal action modeling, thereby enabling robots to effectively anticipate future states through temporally continuous and visually-conditioned joint embeddings.

Xiaoxu Xu, Hao Li, Jinhui Ye, Yilun Chen, Jia Zeng, Xinyi Chen, Linning Xu, Dahua Lin, Weixin Li, Jiangmiao Pang · Thu, 12 Ma · cs

MAVEN: A Meta-Reinforcement Learning Framework for Varying-Dynamics Expertise in Agile Quadrotor Maneuvers

The paper introduces MAVEN, a meta-reinforcement learning framework that enables a single quadrotor policy to achieve robust, zero-shot sim-to-real agile navigation by using a predictive context encoder to instantly adapt to extreme dynamic variations, including up to 66.7% mass changes and 70% thrust loss, within less than an hour of training.

Jin Zhou, Dongcheng Cao, Xian Wang, Shuo Li · Thu, 12 Ma · cs
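The predictive-context-encoder idea — inferring latent dynamics parameters from a short window of recent transitions, then conditioning the controller on that estimate — can be shown on a deliberately tiny scalar system. The least-squares "encoder", the scalar dynamics, and the cancelling controller below are all illustrative stand-ins for MAVEN's learned encoder, quadrotor dynamics, and RL policy:

```python
import random

random.seed(0)
k_true = 1.8                  # unknown dynamics gain (e.g. thrust change)

def step(x, u):               # true transitions: x' = k*x + u
    return k_true * x + u

# collect a short context window of exploratory transitions
history = []
x = 1.0
for _ in range(10):
    u = random.uniform(-1.0, 1.0)
    x_next = step(x, u)
    history.append((x, u, x_next))
    x = x_next

# "context encoder": least-squares estimate of k from (x, u, x') pairs
num = sum(xi * (xn - ui) for xi, ui, xn in history)
den = sum(xi * xi for xi, ui, xn in history)
k_hat = num / den

# adapted controller cancels the estimated dynamics: u = -k_hat * x
x = 1.0
for _ in range(5):
    x = step(x, -k_hat * x)

print(k_hat, x)
```

The estimate recovers the true gain from the context alone, and the adapted controller then drives the state to zero — the same adapt-from-recent-history principle, stripped to one parameter.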

Sublinear-Time Reconfiguration of Programmable Matter with Joint Movements

This paper resolves an open problem by demonstrating that centralized reconfiguration of geometric amoebot structures using joint movements can be achieved in sublinear time, specifically O(√n log n) rounds for transforming any structure into a canonical line segment and constant time for spiral-to-line conversions, without relying on auxiliary assumptions like metamodules.

Manish Kumar, Othon Michail, Andreas Padalkin, Christian Scheideler · Thu, 12 Ma · cs

Semantic Landmark Particle Filter for Robot Localisation in Vineyards

This paper introduces a Semantic Landmark Particle Filter (SLPF) that enhances robot localisation in vineyards by integrating trunk and pole detections with LiDAR and GNSS to overcome perceptual aliasing caused by parallel crop rows, achieving significantly lower pose errors and improved row correctness compared to existing geometry-only, vision-based, and GNSS-only baselines.

Rajitha de Silva, Jonathan Cox, James R. Heselden, Marija Popovic, Cesar Cadena, Riccardo Polvara · Thu, 12 Ma · cs.AI
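The landmark-based particle filter at the heart of the SLPF can be reduced to a 1-D predict/weight/resample cycle: a robot advances along a row and ranges a known landmark (e.g. a pole). This sketch only shows the filter cycle — the noise levels, 1-D state, and single landmark are assumptions, and none of the SLPF's semantic association, LiDAR, or GNSS fusion appears here:

```python
import math
import random

random.seed(0)
LANDMARK = 10.0               # known pole position along the row [m]
N = 500
particles = [random.uniform(0.0, 20.0) for _ in range(N)]

true_x = 2.0
for _ in range(6):            # six motion + measurement cycles
    true_x += 1.0             # robot advances 1 m per step
    z = abs(LANDMARK - true_x) + random.gauss(0.0, 0.1)

    # predict: apply the motion model with noise to every particle
    particles = [p + 1.0 + random.gauss(0.0, 0.2) for p in particles]

    # weight: Gaussian likelihood of the observed range
    w = [math.exp(-0.5 * ((abs(LANDMARK - p) - z) / 0.1) ** 2)
         for p in particles]

    # resample with replacement in proportion to weight
    if sum(w) > 0.0:
        particles = random.choices(particles, weights=w, k=N)

estimate = sum(particles) / N
print(true_x, estimate)
```

Note the aliasing the abstract refers to: a pure range measurement is symmetric about the landmark, so a mirror particle cluster survives the first update; it is the consistent motion over successive steps (and, in the SLPF, semantics) that kills the wrong hypothesis.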

GRACE: A Unified 2D Multi-Robot Path Planning Simulator & Benchmark for Grid, Roadmap, And Continuous Environments

This paper introduces GRACE, a unified 2D simulator and benchmark that enables transparent, reproducible comparisons of multi-robot path planning algorithms across grid, roadmap, and continuous environments by standardizing task instantiation, execution, and evaluation protocols.

Chuanlong Zang, Anna Mannucci, Isabelle Barz, Philipp Schillinger, Florian Lier, Wolfgang Hönig · Thu, 12 Ma · cs.AI

FG-CLTP: Fine-Grained Contrastive Language Tactile Pretraining for Robotic Manipulation

This paper introduces FG-CLTP, a fine-grained contrastive pretraining framework that leverages a novel 100k-scale dataset of 3D tactile point clouds and quantitative tokenization to bridge the gap between tactile sensing and language, enabling a 3D tactile-language-action architecture that significantly outperforms existing methods in contact-rich robotic manipulation tasks.

Wenxuan Ma, Chaofan Zhang, Yinghao Cai, Guocai Yao, Shaowei Cui, Shuo Wang · Thu, 12 Ma · cs
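Contrastive language-X pretraining of the kind FG-CLTP builds on typically optimizes a CLIP-style symmetric InfoNCE loss: paired tactile and text embeddings on the batch diagonal are pulled together, everything off-diagonal is pushed apart. The tiny hand-made 2-D embeddings and temperature below are illustrative; FG-CLTP's actual encoders, 100k-scale dataset, and quantitative tokenization are far more elaborate:

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def info_nce(tactile, text, temp=0.1):
    """Symmetric InfoNCE: row i of each modality is a positive pair."""
    t = [normalize(v) for v in tactile]
    s = [normalize(v) for v in text]
    # cosine-similarity logits, scaled by the temperature
    logits = [[sum(a * b for a, b in zip(ti, sj)) / temp for sj in s]
              for ti in t]

    def ce(rows):  # mean cross-entropy with the diagonal as target
        loss = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            logz = m + math.log(sum(math.exp(x - m) for x in row))
            loss += logz - row[i]
        return loss / len(rows)

    cols = [list(c) for c in zip(*logits)]     # text-to-tactile view
    return 0.5 * (ce(logits) + ce(cols))       # symmetric in both directions

aligned = info_nce([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = info_nce([[1, 0], [0, 1]], [[0, 1], [1, 0]])
print(aligned, shuffled)
```

Correctly paired batches score a near-zero loss while shuffled pairings score a large one, which is the gradient signal that aligns the two embedding spaces during pretraining.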

RL-Augmented MPC for Non-Gaited Legged and Hybrid Locomotion

This paper proposes a contact-explicit hierarchical architecture that combines Reinforcement Learning for high-level gait and navigation planning with low-level Model Predictive Control, successfully achieving robust zero-shot sim-to-sim and sim-to-real transfer across diverse legged and hybrid robotic platforms without domain randomization.

Andrea Patrizi, Carlo Rizzardo, Arturo Laurenzi, Francesco Ruscelli, Luca Rossini, Nikos G. Tsagarakis · Thu, 12 Ma · cs