Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

This paper introduces the Dynamics-Aware Policy Learning (DAPL) framework, which uses explicit world modeling to learn contact-induced dynamics. This enables robots to achieve robust extrinsic dexterity in cluttered environments without hand-crafted heuristics, and DAPL significantly outperforms existing manipulation methods in both simulation and real-world deployments.

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He Wang | 2026-03-11 | 🤖 cs.AI

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

MedMASLab is a unified framework and benchmarking platform that addresses architectural fragmentation in medical multi-agent systems by introducing a standardized communication protocol, an automated zero-shot clinical reasoning evaluator, and an extensive multimodal benchmark spanning 473 diseases to reveal critical performance gaps in cross-specialty transitions.

Yunhang Qian, Xiaobin Hu, Jiaquan Yu, Siyang Xin, Xiaokun Chen, Jiangning Zhang, Peng-Tao Jiang, Jiawei Liu, Hongwei Bran Li | 2026-03-11 | 🤖 cs.AI

Adaptive Clinical-Aware Latent Diffusion for Multimodal Brain Image Generation and Missing Modality Imputation

The paper introduces ACADiff, an adaptive clinical-aware latent diffusion framework that synthesizes missing multimodal brain imaging data (sMRI, FDG-PET, and AV45-PET) by integrating imaging observations with GPT-4o-encoded clinical metadata, achieving superior generation quality and robust diagnostic performance even when up to 80% of modalities are missing.

Rong Zhou, Houliang Zhou, Yao Su, Brian Y. Chen, Yu Zhang, Lifang He, Alzheimer's Disease Neuroimaging Initiative | 2026-03-11 | 🤖 cs.AI

PathMem: Toward Cognition-Aligned Memory Transformation for Pathology MLLMs

PathMem is a memory-centric multimodal framework that enhances pathology large language models by organizing structured domain knowledge into long-term memory and utilizing a Memory Transformer to dynamically activate and ground this knowledge for improved diagnostic reasoning and report generation.

Jinyue Li, Yuci Liang, Qiankun Li, Xinheng Lyu, Jiayu Qian, Huabao Chen, Kun Wang, Zhigang Zeng, Anil Anthony Bharath, Yang Liu | 2026-03-11 | 🤖 cs.AI

No Image, No Problem: End-to-End Multi-Task Cardiac Analysis from Undersampled k-Space

The paper proposes k-MTR, a novel framework that bypasses the traditional image reconstruction step by directly learning multi-task cardiac diagnostic features from undersampled k-space data through a shared semantic manifold, thereby eliminating reconstruction artifacts and achieving competitive performance across regression, classification, and segmentation tasks.

Yundi Zhang, Sevgi Gokce Kafali, Niklas Bubeck, Daniel Rueckert, Jiazhen Pan | 2026-03-11 | 🤖 cs.AI

When Learning Rates Go Wrong: Early Structural Signals in PPO Actor-Critic

This paper introduces the Overfitting-Underfitting Indicator (OUI), an efficient early-stage metric based on hidden neuron activation patterns, to distinguish optimal learning rates in PPO actor-critic training. OUI reveals distinct structural signatures in the actor and critic networks and prunes unpromising runs more effectively than traditional criteria.

Alberto Fernández-Hernández, Cristian Pérez-Corral, Jose I. Mestre, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-Ortí | 2026-03-11 | 🤖 cs.AI

Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People

This paper presents a study of a large language model-powered "sighted guide" for blind and low vision users in social virtual reality. Participants treated the guide as a tool when alone but related to it as a companion in the presence of others, and the paper draws on these findings to offer design recommendations for future accessible VR guides.

Jazmin Collins, Sharon Y Lin, Tianqi Liu, Andrea Stevenson Won, Shiri Azenkot | 2026-03-11 | 🤖 cs.AI

From Data Statistics to Feature Geometry: How Correlations Shape Superposition

This paper challenges the standard view of superposition in neural networks by demonstrating that, unlike in idealized uncorrelated settings where interference is merely noise, realistic feature correlations allow models to arrange features so that interference becomes constructive, thereby naturally forming the semantic clusters and cyclical structures observed in real language models.

Lucas Prieto, Edward Stevinson, Melih Barsbey, Tolga Birdal, Pedro A. M. Mediano | 2026-03-11 | 🤖 cs.AI