cs.LG papers | Gist.Science

Quality over Quantity: Demonstration Curation via Influence Functions for Data-Centric Robot Learning

This paper introduces Quality over Quantity (QoQ), a systematic framework that leverages influence functions to automatically curate high-quality robot learning demonstrations by quantifying each sample's contribution to reducing validation loss, thereby significantly improving policy performance over manual or heuristic data selection methods.

Haeone Lee, Taywon Min, Junsu Kim, Sinjae Kang, Fangchen Liu, Lerrel Pinto, Kimin Lee2026-03-11🤖 cs.LG

Adaptive Active Learning for Online Reliability Prediction of Satellite Electronics

This paper proposes a novel integrated online reliability prediction framework for satellite electronics that combines a Wiener process-based degradation model with a two-stage adaptive active learning strategy to significantly improve prediction accuracy while reducing data requirements under limited and variable operational conditions.

Shixiang Li, Yubin Tian, Dianpeng Wang, Piao Chen, Mengying Ren2026-03-11🤖 cs.LG

Dynamic Multi-period Experts for Online Time Series Forecasting

This paper introduces DynaME, a novel hybrid framework for online time series forecasting that redefines concept drift into recurring and emergent types, utilizing specialized historical experts for the former and a stable general expert for the latter to significantly outperform existing baselines.

Seungha Hong, Sukang Chae, Suyeon Kim, Sanghwan Jang, Hwanjo Yu2026-03-11🤖 cs.LG

Learning Adaptive LLM Decoding

This paper proposes learning lightweight, reinforcement-trained decoding adapters that dynamically select sampling strategies at both the sequence and token levels based on prompt features and compute budgets, significantly improving the accuracy-efficiency tradeoff on math and coding benchmarks compared to fixed hyperparameter baselines.

Chloe H. Su, Zhe Ye, Samuel Tenka, Aidan Yang, Soonho Kong, Udaya Ghai2026-03-11🤖 cs.LG

Verifying Good Regulator Conditions for Hypergraph Observers: Natural Gradient Learning from Causal Invariance via Established Theorems

This paper verifies that persistent observers in causally invariant hypergraph substrates satisfy the Conant-Ashby Good Regulator Theorem, thereby necessitating internal models that lead to natural gradient descent as the unique learning rule and yielding a model-dependent closed-form formula for Vanchurin's regime parameter $\alpha$ with a quantum-classical threshold at $\kappa(F)=2$ .

Max Zhuravlev2026-03-11🤖 cs.LG

Exclusive Self Attention

The paper introduces Exclusive Self Attention (XSA), a modification that constrains attention to information orthogonal to a token's own value vector, thereby improving Transformer performance in language modeling tasks, particularly as sequence length increases.

Shuangfei Zhai2026-03-11🤖 cs.LG

PPO-Based Hybrid Optimization for RIS-Assisted Semantic Vehicular Edge Computing

This paper proposes a Reconfigurable Intelligent Surface (RIS)-aided semantic-aware Vehicle Edge Computing framework that utilizes a Proximal Policy Optimization (PPO) and Linear Programming (LP) hybrid scheme to jointly optimize offloading ratios, semantic symbols, and RIS phase shifts, achieving a 40–50% reduction in end-to-end latency compared to existing methods.

Wei Feng, Jingbo Zhang, Qiong Wu, Pingyi Fan, Qiang Fan2026-03-11🤖 cs.LG

Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting

This study demonstrates that integrating sentiment scores derived from a finetuned Qwen3 model analyzing English and Chinese news significantly enhances aluminum price forecasting accuracy and economic utility, particularly during periods of high market volatility, compared to traditional tabular data models.

Alvaro Paredes Amorin, Andre Python, Christoph Weisser2026-03-11🤖 cs.AI

Latent World Models for Automated Driving: A Unified Taxonomy, Evaluation Framework, and Open Challenges

This paper proposes a unified taxonomy and evaluation framework for latent world models in automated driving, organizing design choices by latent representations and structural priors while identifying key internal mechanics and research directions to enhance robustness, generalization, and deployability.

Rongxiang Zeng, Yongqi Dong2026-03-11🤖 cs.AI

Overcoming Valid Action Suppression in Unmasked Policy Gradient Algorithms

This paper identifies and theoretically proves that unmasked policy gradient algorithms systematically suppress valid actions at unvisited states due to parameter sharing and gradient propagation, a failure mode that action masking avoids and that can be mitigated in unmasked settings through feasibility classification.

Renos Zabounidis, Roy Siegelmann, Mohamad Qadri, Woojun Kim, Simon Stepputtis, Katia P. Sycara2026-03-11🤖 cs.LG

Probabilistic Hysteresis Factor Prediction for Electric Vehicle Batteries with Graphite Anodes Containing Silicon

This paper proposes a data-driven framework that harmonizes heterogeneous driving cycle data and employs statistical and deep learning models to enable efficient, probabilistic prediction of voltage hysteresis factors in silicon-graphite anode batteries, thereby improving state-of-charge estimation and generalizability across different vehicle models.

Runyao Yu, Viviana Kleine, Philipp Gromotka, Thomas Rudolf, Adrian Eisenmann, Gautham Ram Chandra Mouli, Peter Palensky, Jochen L. Cremer2026-03-11🤖 cs.LG

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

This paper introduces DCPO, a framework that resolves the inherent gradient conflict between accuracy and calibration in Reinforcement Learning from Verifiable Rewards by decoupling reasoning and confidence objectives, thereby achieving state-of-the-art calibration performance without compromising model accuracy.

Zhengzhao Ma, Xueru Wen, Boxi Cao, Yaojie Lu, Hongyu Lin, Jinglin Yang, Min He, Xianpei Han, Le Sun2026-03-11🤖 cs.LG

Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

This paper proposes a Probability of Necessity and Sufficiency (PNS)-based regularization method for Class-Incremental Learning that utilizes a dual-scope counterfactual generator to mitigate feature collisions caused by intra-task shortcut reliance and inter-task semantic confusion, thereby ensuring both the causal completeness and separability of task-specific representations.

Zhen Zhang, Jielei Chu, Tianrui Li2026-03-11🤖 cs.AI

RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning

RubiCap introduces a novel reinforcement learning framework that leverages LLM-generated rubrics to create structured, multi-faceted reward signals for dense image captioning, thereby overcoming the limitations of supervised distillation and deterministic checkers to achieve state-of-the-art performance and superior word efficiency across various benchmarks.

Tzu-Heng Huang, Sirajul Salekin, Javier Movellan, Frederic Sala, Manjot Bilkhu2026-03-11🤖 cs.AI

Wrong Code, Right Structure: Learning Netlist Representations from Imperfect LLM-Generated RTL

This paper proposes a cost-effective framework that leverages structurally informative but functionally imperfect LLM-generated RTL to train netlist representation models, effectively overcoming data scarcity and outperforming methods reliant on scarce high-quality labeled datasets.

Siyang Cai, Cangyuan Li, Yinhe Han, Ying Wang2026-03-11🤖 cs.AI

GIAT: A Geologically-Informed Attention Transformer for Lithology Identification

The paper proposes GIAT, a novel Geologically-Informed Attention Transformer that integrates Category-Wise Sequence Correlation filters into the self-attention mechanism to guide lithology identification with geological priors, achieving state-of-the-art accuracy and enhanced interpretability on well log datasets.

Jie Li, Qishun Yang, Nuo Li2026-03-11🤖 cs.AI

Better Bounds for the Distributed Experts Problem

This paper presents an improved distributed protocol for the distributed experts problem that achieves a specific regret bound while significantly reducing communication costs compared to previous work.

David P. Woodruff, Samson Zhou2026-03-11🤖 cs.LG

Differentiable Stochastic Traffic Dynamics: Physics-Informed Generative Modelling in Transportation

This paper proposes a physics-informed generative modeling framework that derives a differentiable, distributional traffic dynamics model from stochastic Ito-type equations, enabling the estimation of traffic density distributions, credible intervals, and congestion risks through a score network trained with denoising score matching and Fokker-Planck residual loss.

Wuping Xin2026-03-11🤖 cs.AI

Latent-DARM: Bridging Discrete Diffusion And Autoregressive Models For Reasoning

Latent-DARM is a novel latent-space communication framework that bridges Discrete Diffusion Language Models for global planning and Autoregressive Models for fluent execution, significantly improving reasoning accuracy on benchmarks like DART-5 and AIME2024 while drastically reducing token usage compared to state-of-the-art reasoning models.

Lina Berrayana, Ahmed Heakl, Abdullah Sohail, Thomas Hofmann, Salman Khan, Wei Chen2026-03-11🤖 cs.AI

The Costs of Reproducibility in Music Separation Research: a Replication of Band-Split RNN

This paper addresses the reproducibility crisis in music source separation by attempting to replicate the Band-Split RNN model, ultimately releasing an optimized version with improved performance and publicly available code to advocate for more transparent and sustainable research practices.

Paul Magron, Romain Serizel, Constance Douwes2026-03-11🤖 cs.LG

← Previous Next →