Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models

This paper introduces Spatial Credit Redistribution (SCR), a training-free, inference-time method that mitigates hallucinations in Vision-Language Models by redistributing suppressed visual attention from dominant patches to their spatial neighbors. The authors report significantly lower hallucination rates across multiple benchmarks, with generation quality preserved and negligible added latency.

Niamul Hassan Samin, Md Arifur Rahman, Abdullah Ibne Hanif Arean + 2 more · 2026-03-05 · 🤖 cs.AI

MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection

MoECLIP addresses the limitations of patch-agnostic designs in Zero-Shot Anomaly Detection with a Mixture-of-Experts architecture that dynamically routes image patches to specialized LoRA experts. Frozen Orthogonal Feature Separation and an ETF loss keep the representations distinct and maximally equiangular, and the authors report state-of-the-art performance across diverse industrial and medical benchmarks.

Jun Yeong Park, JunYoung Seo, Minji Kang + 1 more · 2026-03-05 · 🤖 cs.AI

Beyond Accuracy: Evaluating Visual Grounding In Multimodal Medical Reasoning

This paper introduces a counterfactual evaluation framework showing that reinforcement learning with verifiable rewards improves accuracy on medical VQA benchmarks but often degrades genuine visual grounding: models learn to rely on text shortcuts and hallucinate visual reasoning. The authors argue that this calls for new evaluation metrics and training objectives that explicitly enforce visual dependence.

Anas Zafar, Leema Krishna Murali, Ashish Vashist · 2026-03-05 · 💻 cs