Unsupervised Deep Generative Models for Anomaly Detection in Neuroimaging: A Systematic Scoping Review

This systematic scoping review synthesizes thirty-three studies on unsupervised deep generative models for neuroimaging anomaly detection, highlighting their potential for pathology-agnostic localization in data-scarce settings while identifying key challenges such as methodological heterogeneity and limited external validation.

Youwan Mahé, Elise Bannier, Stéphanie Leplaideur, Elisa Fromont, Francesca Galassi | 2026-03-10 | cs

Taming Modality Entanglement in Continual Audio-Visual Segmentation

This paper introduces the Continual Audio-Visual Segmentation (CAVS) task and proposes a Collision-based Multi-modal Rehearsal (CMR) framework that effectively addresses multi-modal semantic drift and co-occurrence confusion through novel sample selection and frequency adjustment strategies, significantly outperforming existing single-modal continual learning methods.

Yuyang Hong, Qi Yang, Tao Zhang, Zili Wang, Zhaojin Fu, Kun Ding, Bin Fan, Shiming Xiang | 2026-03-10 | cs

PolyJailbreak: Cross-Modal Jailbreaking Attacks on Black-Box Multimodal LLMs

This paper introduces PolyJailbreak, a novel black-box framework that exploits multimodal safety asymmetries through a structured library of atomic strategies and reinforcement learning-based multi-agent optimization to achieve significantly higher jailbreak success rates on state-of-the-art multimodal large language models compared to existing methods.

Xinkai Wang, Beibei Li, Zerui Shao, Ao Liu, Guangquan Xu, Shouling Ji | 2026-03-10 | cs

Khelte Khelte Shikhi: A Proposed HCI Framework for Gamified Interactive Learning with Minecraft in Bangladeshi Education Systems

This paper proposes a practical, three-tiered HCI framework for deploying localized, gamified Minecraft learning in Bangladesh's resource-constrained schools, addressing critical infrastructure gaps through adaptive offline and low-power solutions while outlining specific curriculum-aligned content and evaluation benchmarks for future pilot testing.

Mohd Ruhul Ameen, Akif Islam, Momen Khandokar Ope | 2026-03-10 | cs

Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks

This paper introduces Dream4Drive, a synthetic data generation framework that uses 3D-aware guidance and a fine-tuned driving world model to create diverse, multi-view corner cases. The generated data improves downstream perception tasks in autonomous driving, and the gains are not negated by increased training epochs.

Kai Zeng, Zhanqian Wu, Kaixin Xiong, Xiaobao Wei, Xiangyu Guo, Zhenxin Zhu, Kalok Ho, Lijun Zhou, Bohan Zeng, Ming Lu, Haiyang Sun, Bing Wang, Guang Chen, Hangjun Ye, Wentao Zhang | 2026-03-10 | cs

LagMemo: Language 3D Gaussian Splatting Memory for Multi-modal Open-vocabulary Multi-goal Visual Navigation

This paper introduces LagMemo, a navigation system that uses a language-enhanced 3D Gaussian Splatting memory to enable efficient multi-modal, open-vocabulary, and multi-goal visual navigation, demonstrating superior performance over state-of-the-art methods on the newly curated GOAT-Core benchmark.

Haotian Zhou, Xiaole Wang, He Li, Zhuo Qi, Jinrun Yin, Haiyu Kong, Jianghuan Xu, Huijing Zhao | 2026-03-10 | cs

MobiDock: Design and Control of A Modular Self Reconfigurable Bimanual Mobile Manipulator via Robotic Docking

This paper presents MobiDock, a modular self-reconfigurable bimanual mobile manipulator that utilizes an autonomous computer vision-based docking mechanism to physically unite two independent robots, thereby transforming complex multi-robot coordination into a simpler single-system control problem that significantly improves dynamic stability, precision, and operational efficiency.

Xuan-Thuan Nguyen, Khac Nam Nguyen, Ngoc Duy Tran, Thi Thoa Mac, Anh Nguyen, Hoang Hiep Ly, Tung D. Ta | 2026-03-10 | cs

Detecting AI-Generated Images via Diffusion Snap-Back Reconstruction: A Forensic Approach

This paper proposes a forensic method called "diffusion snap-back reconstruction," which detects AI-generated images by analyzing how perceptual similarity metrics change when an image is perturbed and reconstructed by a diffusion model, achieving high accuracy (AUROC of 0.993) and robustness against common distortions without relying on traditional pixel-level artifacts.

Mohd Ruhul Ameen, Akif Islam | 2026-03-10 | cs