cs.LG papers | Gist.Science

MoE Lens -- An Expert Is All You Need

This paper analyzes the DeepSeekMoE model to reveal that Mixture of Experts architectures exhibit highly concentrated specialization where a single dominant expert can approximate full ensemble performance, suggesting significant opportunities for inference optimization through targeted expert pruning.

Marmik Chaudhari, Idhant Gulati, Nishkal Hundia, Pranav Karra, Shivam Raval2026-03-09🤖 cs.LG

Margin and Consistency Supervision for Calibrated and Robust Vision Models

This paper introduces Margin and Consistency Supervision (MaCS), an architecture-agnostic regularization framework that combines a hinge-squared margin penalty and a consistency regularizer to simultaneously enhance the calibration, robustness, and generalization of deep vision models without requiring additional data or architectural changes.

Salim Khazem2026-03-09🤖 cs.AI

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

The paper introduces SEA-PEFT, a self-auditing parameter-efficient fine-tuning framework that automates adapter configuration through a dynamic search-audit-allocate loop, achieving significant performance improvements in few-shot 3D medical image segmentation without requiring manual design or extensive computational resources.

Son Thai Ly, Hien V. Nguyen2026-03-09🤖 cs.LG

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

This paper empirically evaluates the effectiveness and limitations of many-shot prompting for test-time adaptation in large language models, finding that while it benefits structured tasks with high information gain, its performance is highly sensitive to selection strategies and often yields limited improvements for open-ended generation.

Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Changran Hu, Qizheng Zhang, Urmish Thakker2026-03-09🤖 cs.LG

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

ReflexiCoder is a novel reinforcement learning framework that internalizes structured self-reflection and self-correction capabilities into an LLM's weights, enabling it to autonomously generate, debug, and optimize code without external feedback while achieving state-of-the-art performance and improved token efficiency across multiple benchmarks.

Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim2026-03-09🤖 cs.LG

Stochastic Event Prediction via Temporal Motif Transitions

The paper introduces STEP, a continuous-time stochastic framework that models temporal link prediction as a sequential forecasting problem via Poisson-driven motif transitions, achieving significant precision gains and lower runtime compared to state-of-the-art baselines while remaining compatible with existing graph neural networks.

\.Ibrahim Bahadır Altun, Ahmet Erdem Sarıyüce2026-03-09🤖 cs.LG

ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning

This paper proposes ROSE, a reordered SparseGPT method that enhances one-shot LLM pruning accuracy by adaptively reordering weights based on estimated column and block pruning losses to address the suboptimal performance caused by predefined left-to-right pruning orders in layers with columnar patterns.

Mingluo Su, Huan Wang2026-03-09🤖 cs.LG

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

PixARMesh is a novel autoregressive method that directly reconstructs complete, high-fidelity, and artist-ready 3D indoor scene meshes from a single RGB image by jointly predicting object layout and geometry within a unified model, eliminating the need for implicit fields or post-hoc optimization.

Xiang Zhang, Sohyun Yoo, Hongrui Wu, Chuan Li, Jianwen Xie, Zhuowen Tu2026-03-09🤖 cs.LG

Mitigating Bias in Concept Bottleneck Models for Fair and Interpretable Image Classification

This paper proposes three bias mitigation techniques—top-k concept filtering, removal of biased concepts, and adversarial debiasing—to address information leakage in Concept Bottleneck Models, thereby achieving superior fairness-performance tradeoffs for interpretable image classification compared to prior work.

Schrasing Tong, Antoine Salaun, Vincent Yuan, Annabel Adeyeri, Lalana Kagal2026-03-09🤖 cs.LG

Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning

This paper introduces Reference-guided Policy Optimization (RePO), a novel framework that combines reinforcement learning with verifiable rewards and supervised reference guidance to effectively balance exploration and exploitation in molecular optimization tasks where only single-reference data is available, thereby outperforming existing SFT and RLVR baselines.

Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han2026-03-09🤖 cs.AI

Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman2026-03-09🤖 cs.AI

Design Experiments to Compare Multi-armed Bandit Algorithms

This paper proposes "Artificial Replay," a new experimental design that reuses recorded rewards from a single policy trajectory to enable unbiased, cost-effective, and low-variance comparisons between multi-armed bandit algorithms, thereby significantly reducing the number of required user interactions compared to traditional independent restarts.

Huiling Meng, Ningyuan Chen, Xuefeng Gao2026-03-09🤖 cs.LG

Weak-SIGReg: Covariance Regularization for Stable Deep Learning

This paper introduces Weak-SIGReg, a computationally efficient covariance regularization method derived from Sketched Isotropic Gaussian Regularization that stabilizes deep learning optimization and prevents representation collapse in low-bias architectures like Vision Transformers without relying on standard architectural priors.

Habibullah Akbar2026-03-09🤖 cs.LG

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian2026-03-09🤖 cs.AI

A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA

This paper presents an FPGA accelerator that eliminates the memory-bound bottleneck of Gated DeltaNet decode by persistently storing the full recurrent state in on-chip BRAM, achieving 4.5 $\times$ faster inference and up to 60 $\times$ higher energy efficiency compared to an NVIDIA H100 GPU.

Neelesh Gupta, Peter Wang, Rajgopal Kannan, Viktor K. Prasanna2026-03-09🤖 cs.LG

Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling

This paper proposes a Structured Style-Rewrite Framework that combines explicit disentanglement of stylistic dimensions with implicit Chain-of-Thought conditioning to enable small language models to achieve high-fidelity, consistent character role-playing without requiring explicit reasoning tokens during inference.

Chanhui Zhu2026-03-09🤖 cs.LG

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

This paper proposes an interpretable modeling approach that integrates person-level psychological traits with situational context features derived from social media data to predict dynamic mental well-being, demonstrating that theory-driven methods offer competitive performance and greater human-understandable insights compared to standard language model embeddings.

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd2026-03-09🤖 cs.AI

Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence

The paper proposes Omni-Masked Gradient Descent (OMGD), a memory-efficient optimization method that achieves a strictly improved nonconvex convergence rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$ and demonstrates consistent empirical improvements in large language model training tasks.

Hui Yang, Tao Ren, Jinyang Jiang, Wan Tian, Yijie Peng2026-03-09🤖 cs.LG

TADPO: Reinforcement Learning Goes Off-road

This paper introduces TADPO, a novel reinforcement learning framework that extends Proximal Policy Optimization with off-policy teacher guidance and on-policy student exploration to enable zero-shot sim-to-real, high-speed autonomous driving on full-scale off-road vehicles navigating complex, unmapped terrain.

Zhouchonghao Wu, Raymond Song, Vedant Mundheda, Luis E. Navarro-Serment, Christof Schoenborn, Jeff Schneider2026-03-09🤖 cs.AI

EvoESAP: Non-Uniform Expert Pruning for Sparse MoE

The paper introduces EvoESAP, an evolutionary framework that optimizes non-uniform layer-wise sparsity allocations for Sparse Mixture-of-Experts models using a stable, speculative-decoding-inspired metric called ESAP, significantly improving open-ended generation performance while maintaining competitive accuracy compared to traditional uniform pruning methods.

Zongfang Liu, Shengkun Tang, Boyang Sun, Zhiqiang Shen, Xin Yuan2026-03-09🤖 cs.LG

← Previous Next →