cs.AI papers | Gist.Science

Predictive Coding Graphs are a Superset of Feedforward Neural Networks

This paper demonstrates that predictive coding graphs constitute a mathematical superset of feedforward neural networks, thereby strengthening their theoretical foundation in machine learning and highlighting the importance of network topology.

Björn van Zwol | 2026-03-09 | 🤖 cs.AI

VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

This paper introduces VLM-RobustBench, a comprehensive benchmark evaluating the robustness of four vision-language model families across 133 corruption settings, revealing that current models are semantically strong but spatially fragile, with low-severity geometric distortions causing significantly larger performance drops than visually severe photometric corruptions.

Rohit Saxena, Alessandro Suglia, Pasquale Minervini | 2026-03-09 | 🤖 cs.AI

Ensemble Graph Neural Networks for Probabilistic Sea Surface Temperature Forecasting via Input Perturbations

This paper demonstrates that an ensemble of Graph Neural Networks for regional sea surface temperature forecasting, which introduces diversity through spatially coherent input perturbations like Perlin noise rather than model retraining, achieves well-calibrated probabilistic forecasts with improved uncertainty representation at no additional training cost.

Alejandro J. González-Santana, Giovanny A. Cuervo-Londoño, Javier Sánchez | 2026-03-09 | 🤖 cs.AI

Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR

This paper introduces RAPTOR, a controlled study demonstrating that multilingual HuBERT pre-training, rather than model scale, is the primary driver of cross-domain robustness and reliable calibration in compact audio deepfake detection systems.

Ajinkya Kulkarni, Sandipana Dowerah, Atharva Kulkarni, Tanel Alumäe, Mathew Magimai Doss | 2026-03-09 | 🤖 cs.AI

Reflective Flow Sampling Enhancement

This paper introduces Reflective Flow Sampling (RF-Sampling), a training-free, theoretically grounded inference framework that significantly enhances text-prompt alignment and generation quality for flow-based models like FLUX by implicitly performing gradient ascent on alignment scores through flow inversion and textual representation integration.

Zikai Zhou, Muyao Wang, Shitong Shao, Lichen Bai, Haoyi Xiong, Bo Han, Zeke Xie | 2026-03-09 | 🤖 cs.AI

Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

This paper proposes a two-stage framework that first trains a contrastive encoder on labeled invented alphabets and then uses teacher-student distillation to learn unsupervised, deformation-invariant embeddings for historically attested scripts, effectively bridging supervised discriminative learning with unsupervised discovery of latent cross-script similarities without requiring ground-truth evolutionary relationships.

Claire Roman, Philippe Meyer | 2026-03-09 | 🤖 cs.AI

CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation

This paper introduces CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that leverages patient context, guideline-based severity weighting, and a comprehensive error taxonomy to achieve superior alignment with radiologist judgments compared to existing metrics.

Mohammed Baharoon, Thibault Heintz, Siavash Raissi, Mahmoud Alabbad, Mona Alhammad, Hassan AlOmaish, Sung Eun Kim, Oishi Banerjee, Pranav Rajpurkar | 2026-03-09 | 🤖 cs.AI

Whisper-CD: Accurate Long-Form Speech Recognition using Multi-Negative Contrastive Decoding

Whisper-CD is a training-free, inference-time contrastive decoding framework that mitigates hallucinations and repetition in long-form speech recognition by contrasting clean audio logits against a unified objective derived from multiple acoustically motivated negative perturbations, thereby significantly reducing word error rates and improving generation throughput without requiring model retraining.

Hoseong Ahn, Jeongyun Chae, Yoonji Park, Kyuhong Shim | 2026-03-09 | 🤖 cs.AI

MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue

The paper proposes MAPO, a critic-free reinforcement learning algorithm that combines dense process feedback from a judge model with a mixed advantage estimator to enable stable, scalable, and high-performing long-horizon multi-turn dialogue optimization in subjective tasks like emotional support.

Naifan Zhang, Ruihan Sun, Jinwei Su, Hengjie Yang, Zhengyuan Pan, Zhaohan Chen, Xiaofan Zhang | 2026-03-09 | 🤖 cs.AI

FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

FlashPrefill is a novel framework that achieves ultra-fast long-context prefilling by combining instantaneous block-searching for dynamic sparse patterns with a thresholding mechanism to eliminate long-tail attention scores, delivering up to a 27.78x speedup on 256K sequences while maintaining efficiency on shorter contexts.

Qihang Fan, Huaibo Huang, Zhiying Wu, Juqiu Wang, Bingning Wang, Ran He | 2026-03-09 | 🤖 cs.AI

Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events

The paper introduces CoE, a training-free multimodal summarization framework that leverages a Hierarchical Event Graph to guide a Chain-of-Events reasoning process, effectively addressing limitations in cross-modal grounding and temporal modeling while achieving state-of-the-art performance across diverse datasets.

Xiaoxing You, Qiang Huang, Lingyu Li, Xiaojun Chang, Jun Yu | 2026-03-09 | 🤖 cs.AI

Conversational Demand Response: Bidirectional Aggregator-Prosumer Coordination through Agentic AI

This paper introduces Conversational Demand Response (CDR), a bidirectional coordination framework leveraging agentic AI to enable natural language interactions between aggregators and prosumers, thereby combining automated scalability with enhanced user transparency and agency to sustain residential demand response participation.

Reda El Makroum, Sebastian Zwickl-Bernhard, Lukas Kranzl, Hans Auer | 2026-03-09 | 🤖 cs.AI

TaPD: Temporal-adaptive Progressive Distillation for Observation-Adaptive Trajectory Forecasting in Autonomous Driving

TaPD is a unified, plug-and-play framework that employs temporal-adaptive progressive distillation and a temporal backfilling module to enable robust trajectory forecasting under variable and extremely short observation histories by transferring knowledge from long-horizon teachers and reconstructing missing past context.

Mingyu Fan, Yi Liu, Hao Zhou, Deheng Qian, Mohammad Haziq Khan, Matthias Raetsch | 2026-03-09 | 🤖 cs.AI

GazeMoE: Perception of Gaze Target with Mixture-of-Experts

GazeMoE is a novel end-to-end framework that leverages Mixture-of-Experts modules to adaptively fuse multi-modal cues from a frozen vision foundation model, achieving state-of-the-art performance in human gaze target estimation by addressing class imbalance and enhancing robustness through specialized loss functions and data augmentation.

Zhuangzhuang Dai, Zhongxi Lu, Vincent G. Zakka, Luis J. Manso, Jose M Alcaraz Calero, Chen Li | 2026-03-09 | 🤖 cs.AI

Learning to Solve Orienteering Problem with Time Windows and Variable Profits

This paper proposes DeCoST, a learning-based two-stage framework that effectively decouples discrete and continuous variables to solve the Orienteering Problem with Time Windows and Variable Profits, achieving superior solution quality and significant inference speedups compared to state-of-the-art methods.

Songqun Gao, Zanxi Ruan, Patrick Floor, Marco Roveri, Luigi Palopoli, Daniele Fontanelli | 2026-03-09 | 🤖 cs.AI

HiPP-Prune: Hierarchical Preference-Conditioned Structured Pruning for Vision-Language Models

HiPP-Prune is a hierarchical preference-conditioned structured pruning framework for vision-language models that leverages visual sensitivity signals and multi-objective Group Relative Policy Optimization to generate controllable pruning plans, effectively balancing task utility, compression, and hallucination robustness.

Lincen Bai, Hedi Tabia, Raul Santos-Rodriguez | 2026-03-09 | 🤖 cs.AI

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

This study demonstrates that agentic retrieval-augmented reasoning pipelines significantly enhance the collective reliability, consensus strength, and cross-model robustness of large language models in radiology question answering compared to zero-shot inference, while highlighting that accuracy and agreement alone are insufficient metrics for evaluating clinical safety under model variability.

Mina Farajiamiri, Jeta Sopa, Saba Afza, Lisa Adams, Felix Barajas Ordonez, Tri-Thien Nguyen, Mahshad Lotfinia, Sebastian Wind, Keno Bressem, Sven Nebelung, Daniel Truhn, Soroosh Tayebi Arasteh | 2026-03-09 | 🤖 cs.AI

Looking Through Glass Box

This paper presents a neural network implementation of Fuzzy Cognitive Maps (FHM) that uses Langevin differential dynamics to learn causality patterns and derive inverse solutions for output modification, and it evaluates the implementation's performance across multiple datasets.

Alexis Kafantaris | 2026-03-09 | 🤖 cs.AI

Stem: Rethinking Causal Information Flow in Sparse Attention

This paper introduces Stem, a novel plug-and-play sparse attention module that overcomes the quadratic complexity bottleneck in long-context LLMs by aligning sparsity with causal information flow through position-dependent token retention and an output-aware metric, thereby achieving superior accuracy with reduced computational cost and latency.

Lin Niu, Xin Luo, Linchuan Xie, Yifu Sun, Guanghua Yu, Jianchen Zhu, S Kevin Zhou | 2026-03-09 | 🤖 cs.AI

Artificial Intelligence for Climate Adaptation: Reinforcement Learning for Climate Change-Resilient Transport

This paper proposes a novel reinforcement learning-based decision-support framework that outperforms traditional optimization methods by discovering coordinated, long-term adaptation pathways for urban transport systems to effectively balance investment costs against climate-induced flood risks under deep uncertainty, as demonstrated in a case study of Copenhagen.

Miguel Costa, Arthur Vandervoort, Carolin Schmidt, João Miranda, Morten W. Petersen, Martin Drews, Karyn Morrisey, Francisco C. Pereira | 2026-03-09 | 🤖 cs.AI
