World Model for Battery Degradation Prediction Under Non-Stationary Aging

This paper proposes a world model framework for lithium-ion battery degradation prognosis that encodes cycle data into latent states and propagates them forward with learned dynamics. It reports that iterative rollout significantly reduces trajectory forecast error relative to direct regression, and that a Single Particle Model constraint improves prediction accuracy specifically at the degradation knee.

Kai Chin Lim, Khay Wai See · 2026-03-12 · ⚡ eess

UAV-MARL: Multi-Agent Reinforcement Learning for Time-Critical and Dynamic Medical Supply Delivery

This paper presents a Multi-Agent Reinforcement Learning framework that uses Proximal Policy Optimization to coordinate UAV fleets for time-critical medical supply delivery. In experiments built on real-world geographic data, classical PPO outperforms asynchronous and sequential strategies at dynamically prioritizing tasks and reallocating resources under uncertain conditions.

Islam Guven, Mehmet Parlak · 2026-03-12 · 🤖 cs.LG

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

This paper introduces Group Relative Reward Rescaling (GR³), a novel reinforcement learning method that effectively mitigates length inflation in large language models by reframing length control as a multiplicative rescaling paradigm, thereby achieving lossless optimization and superior performance compared to existing baselines without compromising downstream capabilities.

Zichao Li, Jie Lou, Fangchen Dong, Zhiyuan Fan, Mengjie Ren, Hongyu Lin, Xianpei Han, Debing Zhang, Le Sun, Yaojie Lu, Xing Yu · 2026-03-12 · 🤖 cs.LG

Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context

This paper demonstrates that Transformers performing in-context learning on binary hypothesis testing tasks effectively approximate Bayes-optimal statistical estimators by dynamically adapting their internal decision mechanisms—ranging from voting-style ensembles for linear tasks to deeper sequential computations for nonlinear ones—rather than relying on simple similarity matching or fixed heuristics.

Faris Chaudhry, Siddhant Gadkari · 2026-03-12 · 🤖 cs.LG
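
For readers unfamiliar with the reference point in the summary above, here is a minimal sketch of a Bayes-optimal likelihood-ratio test for binary hypothesis testing. It is not taken from the paper: the two-Gaussian setup with a shared, known standard deviation and the function names `log_likelihood_ratio` and `bayes_decision` are assumptions made purely for illustration.

```python
import math


def log_likelihood_ratio(samples, mu0, mu1, sigma=1.0):
    """Log-likelihood ratio log p(samples | H1) - log p(samples | H0).

    Assumed setting (illustrative only): i.i.d. samples, H0: N(mu0, sigma^2),
    H1: N(mu1, sigma^2). The Gaussian normalizing constants cancel.
    """
    return sum(
        ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2) for x in samples
    )


def bayes_decision(samples, mu0, mu1, sigma=1.0, prior1=0.5):
    """Return 1 if H1 has the higher posterior probability, else 0.

    Comparing the log-likelihood ratio with log(P(H0) / P(H1)) is equivalent
    to comparing the two posteriors, which minimizes the error probability.
    """
    threshold = math.log((1.0 - prior1) / prior1)
    llr = log_likelihood_ratio(samples, mu0, mu1, sigma)
    return 1 if llr > threshold else 0


if __name__ == "__main__":
    context = [0.9, 1.2, 0.8]  # hypothetical in-context observations
    print(bayes_decision(context, mu0=0.0, mu1=1.0))
```

With equal priors the threshold is zero, so the decision reduces to the sign of the summed per-sample log-likelihood ratios. That sum over context examples is the kind of quantity a voting-style ensemble could approximate in the linear case.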

Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning

This paper empirically demonstrates that, contrary to the hypothesis that moral reasoning alignment requires diversity-seeking algorithms, standard reward-maximizing RLVR methods are equally or more effective, because high-reward moral responses exhibit a concentrated distribution in semantic space similar to that of logical reasoning tasks.

Zhaowei Zhang, Xiaohan Liu, Xuekai Zhu, Junchao Huang, Ceyao Zhang, Zhiyuan Feng, Yaodong Yang, Xiaoyuan Yi, Xing Xie · 2026-03-12 · 🤖 cs.AI

Gradient Flow Drifting: Generative Modeling via Wasserstein Gradient Flows of KDE-Approximated Divergences

This paper establishes a mathematical framework called Gradient Flow Drifting that proves the equivalence between the recently proposed Drifting Model and the Wasserstein gradient flow of the forward KL divergence under KDE approximation, while extending the approach to a mixed-divergence strategy on Riemannian manifolds to simultaneously mitigate mode collapse and blurring.

Jiarui Cao, Zixuan Wei, Yuxin Liu · 2026-03-12 · 🤖 cs.LG

Geo-ATBench: A Benchmark for Geospatial Audio Tagging with Geospatial Semantic Context

This paper introduces Geo-ATBench, a new benchmark and the Geo-AT task that leverage geospatial semantic context to resolve acoustic ambiguities in multi-label audio tagging, demonstrating through the GeoFusion-AT framework that incorporating location-based priors significantly improves recognition performance and aligns with human judgment.

Yuanbo Hou, Yanru Wu, Qiaoqiao Ren, Shengchen Li, Stephen Roberts, Dick Botteldooren · 2026-03-12 · ⚡ eess

Surrogate models for nuclear fusion with parametric Shallow Recurrent Decoder Networks: applications to magnetohydrodynamics

This paper demonstrates that a data-driven framework combining Singular Value Decomposition with Shallow Recurrent Decoder (SHRED) networks can accurately and efficiently reconstruct full spatio-temporal magnetohydrodynamic states from sparse temperature sensor measurements, offering a robust surrogate model for real-time monitoring and control in nuclear fusion applications.

M. Lo Verso, C. Introini, E. Cervi, L. Savoldi, J. N. Kutz, A. Cammi · 2026-03-12 · 🤖 cs.LG