GRAD-Former: Gated Robust Attention-based Differential Transformer for Change Detection

GRAD-Former is a parameter-efficient framework for remote-sensing change detection that uses a gated robust attention mechanism with Adaptive Feature Relevance and Refinement to address the limitations of existing models on high-resolution imagery and limited training data, achieving state-of-the-art performance across multiple datasets.

Durgesh Ameta, Ujjwal Mishra, Praful Hambarde + 1 more · 2026-03-03 · cs.AI

AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models

This paper presents AgilePruner, an adaptive visual token pruning framework for Large Vision-Language Models that leverages empirical insights into the complementary strengths of attention-based and diversity-based methods to reduce computational overhead while mitigating hallucinations across varying image complexities.

Changwoo Baek, Jouwon Song, Sohyeon Kim + 1 more · 2026-03-03 · cs.LG

The MAMA-MIA Challenge: Advancing Generalizability and Fairness in Breast MRI Tumor Segmentation and Treatment Response Prediction

The MAMA-MIA Challenge establishes a large-scale, multi-institutional benchmark using US training and European test data to evaluate and improve the generalizability and fairness of AI models for breast MRI tumor segmentation and treatment response prediction across diverse demographic subgroups.

Lidia Garrucho, Smriti Joshi, Kaisar Kushibar + 43 more · 2026-03-03 · cs.AI

Certifiable Estimation with Factor Graphs

This paper presents a unified framework that combines modular factor graph modeling with certifiable convex relaxation techniques, showing that the key mathematical transformations preserve factor graph structure and thereby enabling existing high-performance robotics libraries to perform globally optimal estimation without specialized solver expertise.

Zhexin Xu, Nikolas R. Sanderson, Hanna Jiamei Zhang + 1 more · 2026-03-03 · cs

When Does RL Help Medical VLMs? Disentangling Vision, SFT, and RL Gains

This paper presents a controlled study demonstrating that reinforcement learning primarily sharpens output distributions and improves sampling efficiency in medical Vision-Language Models only after supervised fine-tuning has established non-trivial reasoning support, leading to a boundary-aware training recipe that achieves strong performance across medical benchmarks.

Ahmadreza Jeddi, Kimia Shaban, Negin Baghbanzadeh + 4 more · 2026-03-03 · cs

AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models

This paper presents AG-VAS, a zero-shot visual anomaly segmentation framework that augments Large Multimodal Models with learnable semantic anchor tokens, a Semantic-Pixel Alignment Module, and a specialized instruction dataset to overcome limitations in abstract concept representation, achieving state-of-the-art localization performance across industrial and medical benchmarks.

Zhen Qu, Xian Tao, Xiaoyi Bao + 4 more · 2026-03-03 · cs.AI