World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models

World2Mind is a training-free toolkit that enhances foundation models' allocentric spatial reasoning by constructing structured cognitive maps and an Allocentric-Spatial Tree, yielding significant performance gains and even enabling text-only models to perform complex 3D spatial reasoning on par with advanced multimodal systems.

Shouwei Ruan, Bin Wang, Zhenyu Wu, Qihui Zhu, Yuxiang Zhang, Hang Su, Yubin Wang · 2026-03-11 · cs.AI

First Estimation of Model Parameters for Neutrino-Induced Nucleon Knockout Using Simulation-Based Inference

This paper demonstrates that simulation-based inference (SBI) is a viable and potentially superior alternative to traditional empirical tuning for determining neutrino interaction model parameters, as it successfully reproduces and slightly improves upon the MicroBooNE collaboration's tuned GENIE configuration while also approximating the NuWro simulation.

Karla Tame-Narvaez, Steven Gardiner, Aleksandra Ciprijanovic, Giuseppe Cerati · 2026-03-11 · hep-ph

MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

This paper introduces MA-EgoQA, a novel benchmark and dataset featuring 1,700 questions across five categories designed to evaluate the ability of AI models to understand and reason over multiple long-horizon egocentric videos from embodied agents, alongside a proposed baseline model named EgoMAS that highlights current limitations in system-level multi-agent understanding.

Kangsan Kim, Yanlai Yang, Suji Kim, Woongyeong Yeo, Youngwan Lee, Mengye Ren, Sung Ju Hwang · 2026-03-11 · cs.AI

Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

This paper introduces the Dynamics-Aware Policy Learning (DAPL) framework, which leverages explicit world modeling to learn contact-induced dynamics, enabling robots to achieve robust extrinsic dexterity in cluttered environments without hand-crafted heuristics and to significantly outperform existing manipulation methods in both simulation and real-world deployment.

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He Wang · 2026-03-11 · cs.AI

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

MedMASLab is a unified framework and benchmarking platform that addresses architectural fragmentation in medical multi-agent systems by introducing a standardized communication protocol, an automated zero-shot clinical reasoning evaluator, and an extensive multimodal benchmark spanning 473 diseases to reveal critical performance gaps in cross-specialty transitions.

Yunhang Qian, Xiaobin Hu, Jiaquan Yu, Siyang Xin, Xiaokun Chen, Jiangning Zhang, Peng-Tao Jiang, Jiawei Liu, Hongwei Bran Li · 2026-03-11 · cs.AI

Adaptive Clinical-Aware Latent Diffusion for Multimodal Brain Image Generation and Missing Modality Imputation

The paper introduces ACADiff, an adaptive clinical-aware latent diffusion framework that synthesizes missing multimodal brain imaging data (sMRI, FDG-PET, and AV45-PET) by integrating imaging observations with GPT-4o-encoded clinical metadata, achieving superior generation quality and robust diagnostic performance even when up to 80% of modalities are missing.

Rong Zhou, Houliang Zhou, Yao Su, Brian Y. Chen, Yu Zhang, Lifang He, Alzheimer's Disease Neuroimaging Initiative · 2026-03-11 · cs.AI

PathMem: Toward Cognition-Aligned Memory Transformation for Pathology MLLMs

PathMem is a memory-centric multimodal framework that enhances pathology multimodal large language models (MLLMs) by organizing structured domain knowledge into long-term memory and using a Memory Transformer to dynamically activate and ground this knowledge for improved diagnostic reasoning and report generation.

Jinyue Li, Yuci Liang, Qiankun Li, Xinheng Lyu, Jiayu Qian, Huabao Chen, Kun Wang, Zhigang Zeng, Anil Anthony Bharath, Yang Liu · 2026-03-11 · cs.AI

No Image, No Problem: End-to-End Multi-Task Cardiac Analysis from Undersampled k-Space

The paper proposes k-MTR, a novel framework that bypasses the traditional image reconstruction step by directly learning multi-task cardiac diagnostic features from undersampled k-space data through a shared semantic manifold, thereby eliminating reconstruction artifacts and achieving competitive performance across regression, classification, and segmentation tasks.

Yundi Zhang, Sevgi Gokce Kafali, Niklas Bubeck, Daniel Rueckert, Jiazhen Pan · 2026-03-11 · cs.AI