cs.LG papers | Gist.Science

HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

This paper introduces Poly2Graph, an automated pipeline for generating HSG-12M, a pioneering 16.7-million-scale dataset of spatial multigraphs derived from non-Hermitian crystal energy spectra, which bridges condensed matter physics and geometry-aware graph learning by preserving vital geometric information often discarded in existing benchmarks.

Xianquan Yan, Hakan Akgün, Kenji Kawaguchi + 2 more | 2026-03-06 | 🔬 cond-mat.mes-hall

EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements

This paper introduces EDINET-Bench, a challenging open-source benchmark derived from ten years of Japanese financial reports to evaluate LLMs on complex tasks like fraud detection and earnings forecasting, revealing that current models struggle significantly without specialized scaffolding and highlighting the need for more realistic evaluation frameworks.

Issa Sugiura, Takashi Ishida, Taro Makino + 4 more | 2026-03-06 | 💻 cs

SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning

The paper introduces SPEED-RL, an adaptive online curriculum learning method that selectively samples intermediate-difficulty prompts, accelerating reinforcement learning training for reasoning models by 2x to 6x without compromising accuracy; the speedup is supported both theoretically and empirically.

Ruiqi Zhang, Daman Arora, Song Mei + 1 more | 2026-03-06 | 💻 cs

Bures-Wasserstein Flow Matching for Graph Generation

This paper introduces BWFlow, a graph generation framework that overcomes the limitations of independent node-edge modeling by utilizing Bures-Wasserstein optimal transport on Markov random fields to construct a smooth, theoretically grounded probability path for the joint evolution of graph components, resulting in improved training convergence and sampling efficiency.

Keyue Jiang, Jiahao Cui, Xiaowen Dong + 1 more | 2026-03-06 | 💻 cs

From Bandit Regret to FDR Control: Online Selective Generation with Adversarial Feedback Unlocking

This paper proposes ExSUL, a novel online learning framework that enables selective generation for large language models to robustly control the False Discovery Rate (FDR) and achieve optimal regret bounds in non-stationary and adversarial environments by converting bandit regret into FDR guarantees and unlocking additional learning signals from partial user feedback.

Minjae Lee, Yoonjae Jung, Sangdon Park | 2026-03-06 | 💻 cs

Structured Kolmogorov-Arnold Neural ODEs for Interpretable Learning and Symbolic Discovery of Nonlinear Dynamics

This paper introduces Structured Kolmogorov-Arnold Neural ODEs (SKANODEs), a framework that combines structured state-space modeling with Kolmogorov-Arnold Networks to accurately recover interpretable physical latent states and discover compact symbolic governing equations for nonlinear dynamical systems, outperforming black-box neural ODEs and classical identification methods across synthetic and real-world datasets.

Wei Liu, Kiran Bacsa, Loon Ching Tang + 1 more | 2026-03-06 | 🔬 physics

Learning Physical Systems: Symplectification via Gauge Fixing in Dirac Structures

This paper introduces Presymplectification Networks (PSNs), a novel framework that restores non-degenerate symplectic geometry for constrained and dissipative mechanical systems by learning a symplectification lift via Dirac structures, thereby enabling accurate, structure-preserving long-term prediction of complex multibody dynamics like those of the ANYmal quadruped robot.

Aristotelis Papatheodorou, Pranav Vaidhyanathan, Natalia Ares + 1 more | 2026-03-06 | 💻 cs

Parameter Stress Analysis in Reinforcement Learning: Applying Synaptic Filtering to Policy Networks

This paper introduces a dual-stress framework combining synaptic filtering and adversarial attacks to classify and quantify the fragility, robustness, and antifragility of parameters in PPO-trained RL agents, revealing that targeted filtering can enhance policy adaptability in continuous control environments.

Zain ul Abdeen, Ming Jin | 2026-03-06 | 💻 cs

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

MuRating is a scalable framework that transfers English data-quality signals to a unified multilingual evaluator via pairwise comparisons and translation, enabling the selection of balanced, high-quality datasets that significantly improve the performance of multilingual large language models on both English and non-English benchmarks.

Zhixun Chen, Ping Guo, Wenhan Han + 10 more | 2026-03-06 | 💻 cs

Overtone: Cyclic Patch Modulation for Clean, Efficient, and Flexible Physics Emulators

Overtone is a unified framework for transformer-based PDE surrogates that employs cyclic patch size modulation via architecture-agnostic modules to dynamically distribute harmonic errors and adapt computational costs, achieving significantly lower long-term rollout errors and flexible efficiency compared to static-patch baselines.

Payel Mukhopadhyay, Michael McCabe, Ruben Ohana + 1 more | 2026-03-06 | 💻 cs

Some Super-approximation Rates of ReLU Neural Networks for Korobov Functions

This paper establishes nearly optimal super-approximation error bounds of order $2m$ and $2m-2$ in $L_p$ and $W^1_p$ norms, respectively, for ReLU neural networks approximating Korobov functions by leveraging sparse grid finite elements and bit extraction, thereby demonstrating that neural network expressivity effectively overcomes the curse of dimensionality.

Yuwen Li, Guozhi Zhang | 2026-03-06 | 💻 cs

Kernel Based Maximum Entropy Inverse Reinforcement Learning for Mean-Field Games

This paper proposes a kernel-based maximum causal entropy inverse reinforcement learning framework for infinite-horizon stationary mean-field games. It models unknown rewards in a reproducing kernel Hilbert space to capture nonlinear structures, proves the algorithm's theoretical consistency via Fréchet differentiability, and demonstrates better policy recovery than linear baselines in traffic routing scenarios. The approach is also extended to finite-horizon non-stationary settings.

Berkay Anahtarci, Can Deha Kariksiz, Naci Saldi | 2026-03-06 | 🔢 math

Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models

This paper proposes EDA, a unified theoretical framework that extends diffusion models to handle arbitrary noise patterns without increasing computational overhead, thereby significantly improving performance and generalization in diverse image restoration tasks such as medical imaging and natural scene recovery.

Xingyu Qiu, Mengying Yang, Xinghua Ma + 6 more | 2026-03-06 | 💻 cs

Structured quantum learning via em algorithm for Boltzmann machines

This paper proposes a quantum version of the EM algorithm for training semi-quantum restricted Boltzmann machines, demonstrating that this information-geometric approach effectively circumvents the barren plateau problem and outperforms gradient-based methods in quantum generative modeling.

Takeshi Kimura, Kohtaro Kato, Masahito Hayashi | 2026-03-06 | ⚛️ quant-ph

TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback

This paper introduces TIC-GRPO, a provably convergent and more efficient variant of the critic-free GRPO algorithm that replaces token-level importance sampling with trajectory-level correction to better estimate current policy gradients, demonstrating superior performance on math and coding tasks.

Lei Pang, Jun Luo, Ruinan Jin | 2026-03-06 | 💻 cs

Honest and Reliable Evaluation and Expert Equivalence Testing of Automated Neonatal Seizure Detection

This study proposes a rigorous evaluation framework for automated neonatal seizure detection that addresses current metric inconsistencies by recommending balanced metrics, comprehensive sensitivity/specificity reporting, and multi-rater Turing tests to ensure reliable, expert-level validation for clinical adoption.

Jovana Kljajic, John M. O'Toole, Robert Hogan + 1 more | 2026-03-06 | 💻 cs

In-Training Defenses against Emergent Misalignment in Language Models

This paper presents the first systematic study of in-training safeguards against emergent misalignment in fine-tuned language models, demonstrating that interleaving training examples selected by the perplexity gap between aligned and misaligned models effectively prevents broad misalignment while preserving task performance and coherence.

David Kaczér, Magnus Jørgenvåg, Clemens Vetter + 4 more | 2026-03-06 | 💻 cs

Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

This paper introduces a fast method to evaluate the robustness of LLM rankings, revealing that top model positions in crowdsourced platforms like Chatbot Arena are surprisingly sensitive to the removal of a tiny fraction of preference data, whereas rankings from expert-annotated benchmarks like MT-bench remain more stable.

Jenny Y. Huang, Yunyi Shen, Dennis Wei + 1 more | 2026-03-06 | 💻 cs

How Quantization Shapes Bias in Large Language Models

This study comprehensively evaluates how weight and activation quantization influences various forms of bias in large language models, revealing that while it can reduce toxicity and preserve sentiment, it often exacerbates stereotypes and unfairness in generative tasks, particularly under aggressive compression.

Federico Marcuzzi, Xuefei Ning, Roy Schwartz + 1 more | 2026-03-06 | 💻 cs

Multi-Agent Reinforcement Learning in Intelligent Transportation Systems: A Comprehensive Survey

This paper presents a comprehensive survey of Multi-Agent Reinforcement Learning applications in Intelligent Transportation Systems, offering a structured taxonomy of algorithms and domains, reviewing key simulation platforms, and identifying critical challenges hindering real-world deployment.

Rexcharles Donatus, Kumater Ter, Daniel Udekwe | 2026-03-06 | 💻 cs
