cs.LG papers | Gist.Science

Mitigating Bias in Concept Bottleneck Models for Fair and Interpretable Image Classification

This paper proposes three bias mitigation techniques (top-k concept filtering, removal of biased concepts, and adversarial debiasing) to address information leakage in Concept Bottleneck Models, achieving better fairness-performance tradeoffs for interpretable image classification than prior work.

Schrasing Tong, Antoine Salaun, Vincent Yuan, Annabel Adeyeri, Lalana Kagal | 2026-03-09 | 🤖 cs.LG
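
The summary names top-k concept filtering but does not say how concepts are ranked. As a minimal sketch, assuming concepts are ranked by the absolute value of their activation and that filtered concepts are set to zero before the label predictor (both assumptions, as is the function name `top_k_filter`), the idea could look like this:

```python
def top_k_filter(concept_scores, k):
    """Keep the k largest-magnitude concept scores and zero out the rest.

    Ranking by absolute value is an assumption made for illustration;
    the paper's actual selection criterion may differ.
    """
    if k <= 0:
        return [0.0] * len(concept_scores)
    # Indices of the k largest scores by absolute value (ties go to the earlier index).
    ranked = sorted(range(len(concept_scores)), key=lambda i: -abs(concept_scores[i]))
    keep = set(ranked[:k])
    return [s if i in keep else 0.0 for i, s in enumerate(concept_scores)]


scores = [0.9, 0.05, -0.7, 0.1]   # hypothetical concept activations for one image
filtered = top_k_filter(scores, 2)
print(filtered)
```

Restricting the bottleneck to a few strong concepts limits how much side information the weak activations can carry to the classifier, which is the leakage the summary refers to.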

Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning

This paper introduces Reference-guided Policy Optimization (RePO), a novel framework that combines reinforcement learning with verifiable rewards and supervised reference guidance to effectively balance exploration and exploitation in molecular optimization tasks where only single-reference data is available, thereby outperforming existing SFT and RLVR baselines.

Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han | 2026-03-09 | 🤖 cs.AI

Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman | 2026-03-09 | 🤖 cs.AI
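
For readers unfamiliar with the metric quoted above, mean absolute percentage error is conventionally defined as $\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{y_t - \hat{y}_t}{y_t}\right|$. The sketch below computes it on made-up prices; the numbers and the function name `mape` are illustrative and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, which holds for stock prices.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


prices = [100.0, 200.0, 400.0]      # hypothetical closing prices
forecasts = [101.0, 198.0, 400.0]   # hypothetical model forecasts
error = mape(prices, forecasts)
print(f"MAPE: {error:.2f}%")
```

A MAPE of 0.80% therefore means forecasts were off by 0.8% of the true price on average.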

Design Experiments to Compare Multi-armed Bandit Algorithms

This paper proposes "Artificial Replay," a new experimental design that reuses recorded rewards from a single policy trajectory to enable unbiased, cost-effective, and low-variance comparisons between multi-armed bandit algorithms, thereby significantly reducing the number of required user interactions compared to traditional independent restarts.

Huiling Meng, Ningyuan Chen, Xuefeng Gao | 2026-03-09 | 🤖 cs.LG
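
The reuse idea in the summary can be sketched roughly as follows. This is only one reading of the one-sentence description, not the paper's actual procedure: a second algorithm consumes rewards recorded per arm during the first run and triggers a fresh user interaction only when the recorded pool for its chosen arm is empty. The names `run_policy`, `draw_reward`, and the two toy policies are hypothetical.

```python
import random
from collections import defaultdict, deque


def run_policy(policy, horizon, draw_reward, recorded=None):
    """Run `policy` for `horizon` rounds, reusing recorded rewards first.

    Returns (history, rewards_by_arm, fresh_interactions), where
    fresh_interactions counts calls to draw_reward (real user interactions).
    """
    pool = defaultdict(deque)
    for arm, rewards in (recorded or {}).items():
        pool[arm].extend(rewards)          # copy, so the original log is untouched
    history, log, fresh = [], defaultdict(list), 0
    for t in range(horizon):
        arm = policy(t, history)
        if pool[arm]:
            reward = pool[arm].popleft()   # replay a recorded reward
        else:
            reward = draw_reward(arm)      # fall back to a new interaction
            fresh += 1
        history.append((arm, reward))
        log[arm].append(reward)
    return history, dict(log), fresh


rng = random.Random(0)
arm_means = [0.3, 0.7]                     # hypothetical Bernoulli arms


def draw_reward(arm):
    return 1.0 if rng.random() < arm_means[arm] else 0.0


def round_robin(t, history):
    return t % 2


def always_first(t, history):
    return 0


_, log_a, fresh_a = run_policy(round_robin, 4, draw_reward)
hist_b, _, fresh_b = run_policy(always_first, 3, draw_reward, recorded=log_a)
print(fresh_a, fresh_b)
```

In this toy run the second policy needs one new interaction instead of three, because two of its pulls are served from the first policy's log.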

Weak-SIGReg: Covariance Regularization for Stable Deep Learning

This paper introduces Weak-SIGReg, a computationally efficient covariance regularization method derived from Sketched Isotropic Gaussian Regularization that stabilizes deep learning optimization and prevents representation collapse in low-bias architectures like Vision Transformers without relying on standard architectural priors.

Habibullah Akbar | 2026-03-09 | 🤖 cs.LG
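
The summary does not give the Weak-SIGReg objective, so the following is not that method. It is a generic covariance regularizer, shown only to illustrate how a covariance penalty can discourage representation collapse: it measures the squared Frobenius distance between the batch covariance of the embeddings and the identity matrix. The function name and the use of the biased (divide by $n$) covariance are assumptions.

```python
def covariance_penalty(batch):
    """Squared Frobenius distance between the batch covariance and the identity.

    `batch` is a list of equal-length embedding vectors. The covariance is
    normalised by n (biased estimate) to keep the toy example exact.
    """
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in batch]
    penalty = 0.0
    for i in range(d):
        for j in range(d):
            cov_ij = sum(r[i] * r[j] for r in centered) / n
            target = 1.0 if i == j else 0.0
            penalty += (cov_ij - target) ** 2
    return penalty


collapsed = [[0.5, 0.5]] * 4                                     # every embedding identical
isotropic = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
print(covariance_penalty(collapsed), covariance_penalty(isotropic))
```

A fully collapsed batch has zero covariance and receives the maximum diagonal penalty, while a batch with unit variance and no cross-correlation receives none.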

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian | 2026-03-09 | 🤖 cs.AI

A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA

This paper presents an FPGA accelerator that eliminates the memory-bound bottleneck of Gated DeltaNet decode by persistently storing the full recurrent state in on-chip BRAM, achieving $4.5\times$ faster inference and up to $60\times$ higher energy efficiency compared to an NVIDIA H100 GPU.

Neelesh Gupta, Peter Wang, Rajgopal Kannan, Viktor K. Prasanna | 2026-03-09 | 🤖 cs.LG

Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling

This paper proposes a Structured Style-Rewrite Framework that combines explicit disentanglement of stylistic dimensions with implicit Chain-of-Thought conditioning to enable small language models to achieve high-fidelity, consistent character role-playing without requiring explicit reasoning tokens during inference.

Chanhui Zhu | 2026-03-09 | 🤖 cs.LG

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

This paper proposes an interpretable modeling approach that integrates person-level psychological traits with situational context features derived from social media data to predict dynamic mental well-being, demonstrating that theory-driven methods offer competitive performance and greater human-understandable insights compared to standard language model embeddings.

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd | 2026-03-09 | 🤖 cs.AI

Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence

The paper proposes Omni-Masked Gradient Descent (OMGD), a memory-efficient optimization method that achieves a strictly improved nonconvex convergence rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$ and demonstrates consistent empirical improvements in large language model training tasks.

Hui Yang, Tao Ren, Jinyang Jiang, Wan Tian, Yijie Peng | 2026-03-09 | 🤖 cs.LG

TADPO: Reinforcement Learning Goes Off-road

This paper introduces TADPO, a novel reinforcement learning framework that extends Proximal Policy Optimization with off-policy teacher guidance and on-policy student exploration to enable zero-shot sim-to-real, high-speed autonomous driving on full-scale off-road vehicles navigating complex, unmapped terrain.

Zhouchonghao Wu, Raymond Song, Vedant Mundheda, Luis E. Navarro-Serment, Christof Schoenborn, Jeff Schneider | 2026-03-09 | 🤖 cs.AI

EvoESAP: Non-Uniform Expert Pruning for Sparse MoE

The paper introduces EvoESAP, an evolutionary framework that optimizes non-uniform layer-wise sparsity allocations for Sparse Mixture-of-Experts models using a stable, speculative-decoding-inspired metric called ESAP, significantly improving open-ended generation performance while maintaining competitive accuracy compared to traditional uniform pruning methods.

Zongfang Liu, Shengkun Tang, Boyang Sun, Zhiqiang Shen, Xin Yuan | 2026-03-09 | 🤖 cs.LG

Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments

This paper identifies that learning stagnation in PPO arises from poor sample-based loss estimates due to excessive step sizes relative to gradient noise, proposing that scaling to over one million parallel environments effectively mitigates this issue and enables monotonic performance improvements up to one trillion transitions.

Michael Beukman, Khimya Khetarpal, Zeyu Zheng, Will Dabney, Jakob Foerster, Michael Dennis, Clare Lyle | 2026-03-09 | 🤖 cs.LG

Agnostic learning in (almost) optimal time via Gaussian surface area

This paper improves the known bounds for agnostic learning of concept classes with bounded Gaussian surface area by demonstrating that a polynomial degree of $\tilde{O}(\Gamma^2 / \varepsilon^2)$ suffices for $\varepsilon$-approximation, thereby yielding near-optimal complexity for learning polynomial threshold functions in the statistical query model.

Lucas Pesenti, Lucas Slot, Manuel Wiedmer | 2026-03-09 | 🤖 cs.LG

Improved high-dimensional estimation with Langevin dynamics and stochastic weight averaging

This paper demonstrates that Langevin dynamics combined with stochastic weight averaging can achieve optimal sample complexity of $n \gtrsim d^{k^\star/2}$ for recovering a hidden direction in high-dimensional settings like tensor PCA and single-index models, effectively emulating landscape smoothing without explicit regularization.

Stanley Wei, Alex Damian, Jason D. Lee | 2026-03-09 | 🤖 cs.LG

TempoSyncDiff: Distilled Temporally-Consistent Diffusion for Low-Latency Audio-Driven Talking Head Generation

TempoSyncDiff is a reference-conditioned latent diffusion framework that employs teacher-student distillation and temporal regularization to enable low-latency, temporally stable, and identity-consistent audio-driven talking head generation suitable for edge deployment.

Soumya Mazumdar, Vineet Kumar Rakesh | 2026-03-09 | 🤖 cs.AI

Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra

This paper introduces IR-GeoDiff, a latent diffusion model that recovers three-dimensional molecular geometries from infrared spectra by integrating spectral information into molecular representations, thereby addressing the limitations of existing 2D approaches in capturing the relationship between spectral features and 3D structure.

Wenjin Wu, Aleš Leonardis, Linjiang Chen, Jianbo Jiao | 2026-03-09 | 🤖 cs.LG

Dynamic Momentum Recalibration in Online Gradient Learning

This paper introduces SGDF, an optimizer that applies optimal linear filtering principles to dynamically adjust momentum coefficients in real-time, thereby minimizing mean-squared error to achieve a superior balance between noise suppression and signal preservation compared to conventional momentum methods.

Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li | 2026-03-09 | 🤖 cs.LG

Diffusion Language Models Are Natively Length-Aware

This paper proposes a zero-shot mechanism that leverages latent prompt representations to dynamically crop the fixed context window of Diffusion Language Models before generation, significantly reducing computational costs while maintaining or improving performance across diverse tasks.

Vittorio Rossi, Giacomo Cirò, Davide Beltrame, Luca Gandolfi, Paul Röttger, Dirk Hovy | 2026-03-09 | 🤖 cs.LG

DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection

This paper proposes DQE, a novel semantic-aware evaluation metric for time series anomaly detection that addresses existing limitations in bias, consistency, and false alarm penalization by introducing a semantic-based partitioning strategy and aggregating scores across the full threshold spectrum to provide more stable, discriminative, and interpretable assessments.

Yuewei Li, Dalin Zhang, Huan Li, Xinyi Gong, Hongjun Chu, Zhaohui Song | 2026-03-09 | 🤖 cs.LG
