How Far Can Unsupervised RLVR Scale LLM Training?
This paper provides a comprehensive theoretical and empirical analysis of unsupervised reinforcement learning with verifiable rewards (RLVR). It shows that intrinsic-reward methods are fundamentally limited by a confidence-correctness alignment ceiling that leads to model collapse, and it argues that external rewards grounded in computational asymmetries may offer a scalable alternative.