VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

This paper introduces VLM-RobustBench, a comprehensive benchmark evaluating the robustness of four vision-language model families across 133 corruption settings, revealing that current models are semantically strong but spatially fragile, with low-severity geometric distortions causing significantly larger performance drops than visually severe photometric corruptions.

Rohit Saxena, Alessandro Suglia, Pasquale Minervini · 2026-03-09 · cs.AI

Ensemble Graph Neural Networks for Probabilistic Sea Surface Temperature Forecasting via Input Perturbations

This paper demonstrates that an ensemble of Graph Neural Networks for regional sea surface temperature forecasting, which introduces diversity through spatially coherent input perturbations like Perlin noise rather than model retraining, achieves well-calibrated probabilistic forecasts with improved uncertainty representation at no additional training cost.

Alejandro J. González-Santana, Giovanny A. Cuervo-Londoño, Javier Sánchez · 2026-03-09 · cs.AI
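The input-perturbation idea in the summary above can be illustrated with a small sketch: one fixed deterministic model is run on several smoothly perturbed copies of the same input, and the member forecasts are summarized by their mean and spread. Everything here is an assumption for illustration. The forecaster is a hypothetical 3-point smoother rather than a Graph Neural Network, the field is 1-D, and the noise is linearly interpolated value noise swapped in for Perlin noise.

```python
import random
import statistics

def coherent_noise(n_points, n_knots, amplitude, rng):
    """Smooth 1-D noise: random values at coarse knots, linearly interpolated.
    A simple stand-in for Perlin noise (assumption, not the paper's generator)."""
    knots = [rng.uniform(-amplitude, amplitude) for _ in range(n_knots)]
    noise = []
    for i in range(n_points):
        pos = i * (n_knots - 1) / (n_points - 1)  # position in knot coordinates
        left = min(int(pos), n_knots - 2)
        frac = pos - left
        noise.append((1 - frac) * knots[left] + frac * knots[left + 1])
    return noise

def toy_forecaster(field):
    """Hypothetical stand-in for a trained deterministic model: 3-point smoothing."""
    n = len(field)
    return [
        (field[max(i - 1, 0)] + field[i] + field[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]

def ensemble_forecast(field, n_members=20, amplitude=0.3, seed=0):
    """Run the same model on perturbed inputs; no retraining is involved."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        noise = coherent_noise(len(field), n_knots=4, amplitude=amplitude, rng=rng)
        perturbed = [x + e for x, e in zip(field, noise)]
        members.append(toy_forecaster(perturbed))
    # Per-grid-point ensemble mean and standard deviation (spread).
    mean = [statistics.fmean(col) for col in zip(*members)]
    spread = [statistics.stdev(col) for col in zip(*members)]
    return mean, spread

# Made-up temperature values in degrees Celsius, for illustration only.
sst = [18.0, 18.5, 19.2, 20.1, 20.8, 21.0, 20.6, 19.9]
mean, spread = ensemble_forecast(sst)
```

The spread gives a per-point uncertainty estimate at inference time only. Whether it is well calibrated depends on the noise amplitude and length scale, which this sketch does not tune.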

Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

This paper proposes a two-stage framework that first trains a contrastive encoder on labeled invented alphabets and then uses teacher-student distillation to learn unsupervised, deformation-invariant embeddings for historically attested scripts, effectively bridging supervised discriminative learning with unsupervised discovery of latent cross-script similarities without requiring ground-truth evolutionary relationships.

Claire Roman, Philippe Meyer · 2026-03-09 · cs.AI

CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation

This paper introduces CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that leverages patient context, guideline-based severity weighting, and a comprehensive error taxonomy to achieve superior alignment with radiologist judgments compared to existing metrics.

Mohammed Baharoon, Thibault Heintz, Siavash Raissi, Mahmoud Alabbad, Mona Alhammad, Hassan AlOmaish, Sung Eun Kim, Oishi Banerjee, Pranav Rajpurkar · 2026-03-09 · cs.AI

Whisper-CD: Accurate Long-Form Speech Recognition using Multi-Negative Contrastive Decoding

Whisper-CD is a training-free, inference-time contrastive decoding framework that mitigates hallucinations and repetition in long-form speech recognition by contrasting clean audio logits against a unified objective derived from multiple acoustically motivated negative perturbations, thereby significantly reducing word error rates and improving generation throughput without requiring model retraining.

Hoseong Ahn, Jeongyun Chae, Yoonji Park, Kyuhong Shim · 2026-03-09 · cs.AI
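As a rough illustration of the contrastive decoding idea summarized for Whisper-CD above, the sketch below scores next-token candidates by contrasting log-probabilities from clean audio against log-probabilities from several perturbed (negative) inputs. The averaging of negatives, the weight `alpha`, the toy logits, and all function names are assumptions for illustration; the summary does not specify the paper's actual unified objective.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def contrastive_scores(clean_logits, negative_logits_list, alpha=0.5):
    """Score = (1 + alpha) * clean log-prob - alpha * mean negative log-prob.
    Averaging the negatives is an assumed way to combine them."""
    clean_lp = log_softmax(clean_logits)
    neg_lps = [log_softmax(neg) for neg in negative_logits_list]
    neg_mean = [
        sum(lp[i] for lp in neg_lps) / len(neg_lps)
        for i in range(len(clean_lp))
    ]
    return [(1 + alpha) * c - alpha * n for c, n in zip(clean_lp, neg_mean)]

def pick_token(scores):
    """Greedy choice: index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)

# Toy vocabulary of 3 tokens. Clean audio slightly prefers token 1, but the
# perturbed inputs prefer token 1 even more strongly, which suggests that
# token 1 is driven by the decoder's prior rather than by the audio.
clean = [1.9, 2.0, 0.1]
negatives = [[0.5, 2.5, 0.1], [0.4, 2.2, 0.3]]

greedy = pick_token(clean)
scores = contrastive_scores(clean, negatives, alpha=1.0)
best = pick_token(scores)
```

In this toy case plain greedy decoding selects token 1, while the contrastive score selects token 0, because token 1 remains likely even when the audio is degraded. No model weights change, which is what makes this kind of approach training-free.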

FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

FlashPrefill is a novel framework that achieves ultra-fast long-context prefilling by combining instantaneous block-searching for dynamic sparse patterns with a thresholding mechanism to eliminate long-tail attention scores, delivering up to a 27.78x speedup on 256K sequences while maintaining efficiency on shorter contexts.

Qihang Fan, Huaibo Huang, Zhiying Wu, Juqiu Wang, Bingning Wang, Ran He · 2026-03-09 · cs.AI

Conversational Demand Response: Bidirectional Aggregator-Prosumer Coordination through Agentic AI

This paper introduces Conversational Demand Response (CDR), a bidirectional coordination framework leveraging agentic AI to enable natural language interactions between aggregators and prosumers, thereby combining automated scalability with enhanced user transparency and agency to sustain residential demand response participation.

Reda El Makroum, Sebastian Zwickl-Bernhard, Lukas Kranzl, Hans Auer · 2026-03-09 · cs.AI

TaPD: Temporal-adaptive Progressive Distillation for Observation-Adaptive Trajectory Forecasting in Autonomous Driving

TaPD is a unified, plug-and-play framework that employs temporal-adaptive progressive distillation and a temporal backfilling module to enable robust trajectory forecasting under variable and extremely short observation histories by transferring knowledge from long-horizon teachers and reconstructing missing past context.

Mingyu Fan, Yi Liu, Hao Zhou, Deheng Qian, Mohammad Haziq Khan, Matthias Raetsch · 2026-03-09 · cs.AI

GazeMoE: Perception of Gaze Target with Mixture-of-Experts

GazeMoE is a novel end-to-end framework that leverages Mixture-of-Experts modules to adaptively fuse multi-modal cues from a frozen vision foundation model, achieving state-of-the-art performance in human gaze target estimation by addressing class imbalance and enhancing robustness through specialized loss functions and data augmentation.

Zhuangzhuang Dai, Zhongxi Lu, Vincent G. Zakka, Luis J. Manso, Jose M Alcaraz Calero, Chen Li · 2026-03-09 · cs.AI

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

This study demonstrates that agentic retrieval-augmented reasoning pipelines significantly enhance the collective reliability, consensus strength, and cross-model robustness of large language models in radiology question answering compared to zero-shot inference, while highlighting that accuracy and agreement alone are insufficient metrics for evaluating clinical safety under model variability.

Mina Farajiamiri, Jeta Sopa, Saba Afza, Lisa Adams, Felix Barajas Ordonez, Tri-Thien Nguyen, Mahshad Lotfinia, Sebastian Wind, Keno Bressem, Sven Nebelung, Daniel Truhn, Soroosh Tayebi Arasteh · 2026-03-09 · cs.AI

Stem: Rethinking Causal Information Flow in Sparse Attention

This paper introduces Stem, a novel plug-and-play sparse attention module that overcomes the quadratic complexity bottleneck in long-context LLMs by aligning sparsity with causal information flow through position-dependent token retention and an output-aware metric, thereby achieving superior accuracy with reduced computational cost and latency.

Lin Niu, Xin Luo, Linchuan Xie, Yifu Sun, Guanghua Yu, Jianchen Zhu, S Kevin Zhou · 2026-03-09 · cs.AI

Artificial Intelligence for Climate Adaptation: Reinforcement Learning for Climate Change-Resilient Transport

This paper proposes a novel reinforcement learning-based decision-support framework that outperforms traditional optimization methods by discovering coordinated, long-term adaptation pathways for urban transport systems to effectively balance investment costs against climate-induced flood risks under deep uncertainty, as demonstrated in a case study of Copenhagen.

Miguel Costa, Arthur Vandervoort, Carolin Schmidt, João Miranda, Morten W. Petersen, Martin Drews, Karyn Morrisey, Francisco C. Pereira · 2026-03-09 · cs.AI