cs.LG papers | Gist.Science

Bridging Domains through Subspace-Aware Model Merging

This paper introduces SCORE, a novel model merging method that resolves singular subspace conflicts between domain-specific models by projecting them into a shared orthogonal basis, thereby significantly improving generalization to unseen domains compared to existing approaches.

Levy Chaves, Chao Zhou, Rebekka Burkholz, Eduardo Valle, Sandra Avila2026-03-09🤖 cs.AI

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

This paper proposes the Disentangled Safety Hypothesis (DSH), which reveals that large language models separate safety "recognition" and "refusal execution" into distinct geometric subspaces, enabling the development of the Refusal Erasure Attack (REA) to bypass safety mechanisms by surgically disabling the refusal axis while preserving harmful content generation.

Jinman Wu, Yi Xie, Shen Lin, Shiqian Zhao, Xiaofeng Chen2026-03-09🤖 cs.AI

First-Order Softmax Weighted Switching Gradient Method for Distributed Stochastic Minimax Optimization with Stochastic Constraints

This paper proposes a first-order Softmax-Weighted Switching Gradient method for distributed stochastic minimax optimization under stochastic constraints, achieving optimal oracle complexity and high-probability convergence guarantees in both full and partial client participation settings while avoiding the instability of traditional primal-dual approaches.

Zhankun Luo, Antesh Upadhyay, Sang Bin Moon, Abolfazl Hashemi2026-03-09🤖 cs.LG

The Coordination Gap: Alternation Metrics for Temporal Dynamics in Multi-Agent Battle of the Exes

This paper introduces temporally sensitive Alternation (ALT) metrics to reveal that conventional outcome-based evaluations can severely mischaracterize multi-agent coordination, as demonstrated by Q-learning agents in a Battle of the Exes variant that achieve high traditional fairness scores but perform significantly worse than random baselines in actual turn-taking dynamics.

Nikolaos Al. Papadopoulos, Konstantinos Psannis2026-03-09🤖 cs.LG

Sparse Crosscoders for diffing MoEs and Dense models

This paper utilizes crosscoders to demonstrate that Mixture of Experts (MoE) models develop more specialized, focused representations with fewer unique features compared to the broader, general-purpose feature distributions found in dense models of equivalent active parameter count.

Marmik Chaudhari, Nishkal Hundia, Idhant Gulati2026-03-09🤖 cs.LG

MoE Lens -- An Expert Is All You Need

This paper analyzes the DeepSeekMoE model to reveal that Mixture of Experts architectures exhibit highly concentrated specialization where a single dominant expert can approximate full ensemble performance, suggesting significant opportunities for inference optimization through targeted expert pruning.

Marmik Chaudhari, Idhant Gulati, Nishkal Hundia, Pranav Karra, Shivam Raval2026-03-09🤖 cs.LG

Margin and Consistency Supervision for Calibrated and Robust Vision Models

This paper introduces Margin and Consistency Supervision (MaCS), an architecture-agnostic regularization framework that combines a hinge-squared margin penalty and a consistency regularizer to simultaneously enhance the calibration, robustness, and generalization of deep vision models without requiring additional data or architectural changes.

Salim Khazem2026-03-09🤖 cs.AI

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

The paper introduces SEA-PEFT, a self-auditing parameter-efficient fine-tuning framework that automates adapter configuration through a dynamic search-audit-allocate loop, achieving significant performance improvements in few-shot 3D medical image segmentation without requiring manual design or extensive computational resources.

Son Thai Ly, Hien V. Nguyen2026-03-09🤖 cs.LG

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

This paper empirically evaluates the effectiveness and limitations of many-shot prompting for test-time adaptation in large language models, finding that while it benefits structured tasks with high information gain, its performance is highly sensitive to selection strategies and often yields limited improvements for open-ended generation.

Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Changran Hu, Qizheng Zhang, Urmish Thakker2026-03-09🤖 cs.LG

ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

ReflexiCoder is a novel reinforcement learning framework that internalizes structured self-reflection and self-correction capabilities into an LLM's weights, enabling it to autonomously generate, debug, and optimize code without external feedback while achieving state-of-the-art performance and improved token efficiency across multiple benchmarks.

Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim2026-03-09🤖 cs.LG

Stochastic Event Prediction via Temporal Motif Transitions

The paper introduces STEP, a continuous-time stochastic framework that models temporal link prediction as a sequential forecasting problem via Poisson-driven motif transitions, achieving significant precision gains and lower runtime compared to state-of-the-art baselines while remaining compatible with existing graph neural networks.

\.Ibrahim Bahadır Altun, Ahmet Erdem Sarıyüce2026-03-09🤖 cs.LG

ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning

This paper proposes ROSE, a reordered SparseGPT method that enhances one-shot LLM pruning accuracy by adaptively reordering weights based on estimated column and block pruning losses to address the suboptimal performance caused by predefined left-to-right pruning orders in layers with columnar patterns.

Mingluo Su, Huan Wang2026-03-09🤖 cs.LG

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

PixARMesh is a novel autoregressive method that directly reconstructs complete, high-fidelity, and artist-ready 3D indoor scene meshes from a single RGB image by jointly predicting object layout and geometry within a unified model, eliminating the need for implicit fields or post-hoc optimization.

Xiang Zhang, Sohyun Yoo, Hongrui Wu, Chuan Li, Jianwen Xie, Zhuowen Tu2026-03-09🤖 cs.LG

Mitigating Bias in Concept Bottleneck Models for Fair and Interpretable Image Classification

This paper proposes three bias mitigation techniques—top-k concept filtering, removal of biased concepts, and adversarial debiasing—to address information leakage in Concept Bottleneck Models, thereby achieving superior fairness-performance tradeoffs for interpretable image classification compared to prior work.

Schrasing Tong, Antoine Salaun, Vincent Yuan, Annabel Adeyeri, Lalana Kagal2026-03-09🤖 cs.LG

Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning

This paper introduces Reference-guided Policy Optimization (RePO), a novel framework that combines reinforcement learning with verifiable rewards and supervised reference guidance to effectively balance exploration and exploitation in molecular optimization tasks where only single-reference data is available, thereby outperforming existing SFT and RLVR baselines.

Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han2026-03-09🤖 cs.AI

Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman2026-03-09🤖 cs.AI

Design Experiments to Compare Multi-armed Bandit Algorithms

This paper proposes "Artificial Replay," a new experimental design that reuses recorded rewards from a single policy trajectory to enable unbiased, cost-effective, and low-variance comparisons between multi-armed bandit algorithms, thereby significantly reducing the number of required user interactions compared to traditional independent restarts.

Huiling Meng, Ningyuan Chen, Xuefeng Gao2026-03-09🤖 cs.LG

Weak-SIGReg: Covariance Regularization for Stable Deep Learning

This paper introduces Weak-SIGReg, a computationally efficient covariance regularization method derived from Sketched Isotropic Gaussian Regularization that stabilizes deep learning optimization and prevents representation collapse in low-bias architectures like Vision Transformers without relying on standard architectural priors.

Habibullah Akbar2026-03-09🤖 cs.LG

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian2026-03-09🤖 cs.AI

A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA

This paper presents an FPGA accelerator that eliminates the memory-bound bottleneck of Gated DeltaNet decode by persistently storing the full recurrent state in on-chip BRAM, achieving 4.5 $\times$ faster inference and up to 60 $\times$ higher energy efficiency compared to an NVIDIA H100 GPU.

Neelesh Gupta, Peter Wang, Rajgopal Kannan, Viktor K. Prasanna2026-03-09🤖 cs.LG

← Previous Next →