cs.LG papers | Gist.Science

Why Is RLHF Alignment Shallow? A Gradient Analysis

This paper proves that standard RLHF alignment is inherently shallow because gradient signals vanish once a sequence's harmfulness is determined, and it proposes a recovery penalty objective to ensure alignment gradients persist throughout the entire generation process.

Robin Young | 2026-03-06 | 🤖 cs.LG

Osmosis Distillation: Model Hijacking with the Fewest Samples

This paper introduces Osmosis Distillation, a novel model hijacking attack that exploits synthetic datasets generated by dataset distillation methods to compromise deep learning models in transfer learning with high success rates using only a few poisoned samples while maintaining utility on original tasks.

Yuchen Shi, Huajie Chen, Heng Xu, Zhiquan Liu, Jialiang Shen, Chi Liu, Shuai Zhou, Tianqing Zhu, Wanlei Zhou | 2026-03-06 | 🔒 cs.CR

Causally Robust Reward Learning from Reason-Augmented Preference Feedback

This paper introduces ReCouPLe, a lightweight framework that leverages natural language rationales as causal guidance to train reward models that are robust to spurious correlations and capable of zero-shot transfer to novel tasks, significantly outperforming baselines in reward accuracy and downstream policy performance under distribution shifts.

Minjune Hwang, Yigit Korkmaz, Daniel Seita + 1 more | 2026-03-06 | 🤖 cs.AI

Interpretable Pre-Release Baseball Pitch Type Anticipation from Broadcast 3D Kinematics

This paper presents a scalable, interpretable framework that achieves 80.4% accuracy in classifying eight professional baseball pitch types using only monocular 3D body kinematics, revealing that upper-body mechanics—particularly wrist position and trunk tilt—are the primary predictors while establishing an empirical ceiling for grip-based distinctions.

Jerrin Bright, Michelle Lu, John Zelek | 2026-03-06 | 🤖 cs.AI

Differential Privacy in Two-Layer Networks: How DP-SGD Harms Fairness and Robustness

This paper introduces a feature-centric framework demonstrating that the noise required for differential privacy in two-layer neural networks degrades fairness and robustness by disrupting feature learning dynamics, as quantified by the feature-to-noise ratio, while also revealing the limitations of public pre-training strategies under distribution shifts.

Ruichen Xu, Kexin Chen | 2026-03-06 | 🤖 cs.LG

FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation

FedAFD is a unified multimodal federated learning framework that enhances both client and server performance by employing a bi-level adversarial alignment and granularity-aware fusion for personalized local learning, alongside a similarity-guided ensemble distillation mechanism to effectively handle model heterogeneity and modality discrepancies.

Min Tan, Junchao Ma, Yinfu Feng + 6 more | 2026-03-06 | 🤖 cs.AI

How Does the ReLU Activation Affect the Implicit Bias of Gradient Descent on High-dimensional Neural Network Regression?

This paper demonstrates that for high-dimensional random data, gradient descent on shallow ReLU networks exhibits an implicit bias that approximates the minimum $L_2$-norm solution with high probability, bridging the gap between worst-case non-existence and exact orthogonality results through a novel primal-dual analysis.

Kuo-Wei Lai, Guanghui Wang, Molei Tao + 1 more | 2026-03-06 | 🔢 math

U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning

This paper presents U-Parking, a distributed autonomous parking system that combines Ultra-Wideband-assisted robust localization with Large Language Model-driven intelligent planning to achieve reliable automated parking in challenging indoor environments, as demonstrated on real vehicles.

Yiang Wu, Qiong Wu, Pingyi Fan + 4 more | 2026-03-06 | 🤖 cs.LG

VPWEM: Non-Markovian Visuomotor Policy with Working and Episodic Memory

This paper introduces VPWEM, a non-Markovian visuomotor policy that combines a sliding window of recent observations with a Transformer-based episodic memory compressor to efficiently retain long-term context for robotic control, achieving significant performance improvements over state-of-the-art baselines on memory-intensive manipulation tasks while maintaining constant computational costs.

Yuheng Lei, Zhixuan Liang, Hongyuan Zhang + 1 more | 2026-03-06 | 🤖 cs.AI

EVMbench: Evaluating AI Agents on Smart Contract Security

The paper introduces EVMbench, a benchmarking framework that evaluates the capabilities of frontier AI agents in detecting, patching, and exploiting smart contract vulnerabilities within a realistic local Ethereum environment, revealing their ability to successfully execute end-to-end attacks against live blockchain instances.

Justin Wang, Andreas Bigger, Xiaohai Xu, Justin W. Lin, Andy Applebaum, Tejal Patwardhan, Alpin Yukseloglu, Olivia Watkins | 2026-03-06 | 🔒 cs.CR

BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning

This paper introduces BandPO, a novel reinforcement learning algorithm that replaces PPO's fixed clipping mechanism with a dynamic, probability-aware operator to resolve the exploration bottleneck and entropy collapse caused by suppressing high-advantage low-probability actions, thereby achieving superior stability and performance across diverse models.

Yuan Li, Bo Wang, Yufei Gao + 4 more | 2026-03-06 | 🤖 cs.AI

Semantic Communication-Enhanced Split Federated Learning for Vehicular Networks: Architecture, Challenges, and Case Study

This paper proposes a Semantic Communication-Enhanced U-Shaped Split Federated Learning (SC-USFL) framework for vehicular networks that integrates a semantic communication module and a network status monitor to reduce communication overhead, enhance label privacy, and adaptively optimize transmission rates under dynamic channel conditions.

Lu Yu, Zheng Chang, Ying-Chang Liang | 2026-03-06 | 🤖 cs.LG

Person Detection and Tracking from an Overhead Crane LiDAR

This paper addresses the challenge of person detection and tracking from an overhead crane LiDAR by curating a new annotated dataset, evaluating adapted 3D detectors like VoxelNeXt and SECOND with integrated tracking algorithms, and demonstrating high accuracy and real-time feasibility to bridge the gap between standard driving benchmarks and industrial overhead sensing.

Nilusha Jayawickrama, Henrik Toikka, Risto Ojala | 2026-03-06 | 🤖 cs.LG

$\nabla$-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space

This paper introduces $\nabla$-Reasoner, a novel framework that enhances LLM reasoning by integrating differentiable gradient descent on token logits during inference, thereby shifting from inefficient discrete search to efficient first-order optimization to achieve significant accuracy gains with reduced computational costs.

Peihao Wang, Ruisi Cai, Zhen Wang + 4 more | 2026-03-06 | 🤖 cs.LG

TimeWarp: Evaluating Web Agents by Revisiting the Past

The paper introduces TimeWarp, a benchmark that evaluates web agents across evolving UI versions to expose their vulnerability to design changes, and proposes TimeTraj, a plan distillation algorithm that significantly improves agent robustness by training on trajectories collected from multiple web versions.

Md Farhan Ishmam, Kenneth Marino | 2026-03-06 | 🤖 cs.AI

Uncertainty-aware Blood Glucose Prediction from Continuous Glucose Monitoring Data

This study demonstrates that Transformer-based neural networks equipped with evidential output layers outperform LSTM and GRU models in predicting blood glucose and identifying adverse glycemic events for Type 1 diabetes by providing superior accuracy and well-calibrated uncertainty estimates validated on the HUPA-UCM dataset.

Hai Siong Tan | 2026-03-06 | Author reviewed | 🔬 physics

WaterSIC: information-theoretically (near) optimal linear layer quantization

This paper introduces WaterSIC, a novel linear layer quantization algorithm that achieves information-theoretically near-optimal performance by allocating different quantization rates to weight columns via a waterfilling strategy, thereby significantly outperforming existing methods like GPTQ and establishing new state-of-the-art results for LLMs at quantization rates from 1 to 4 bits.

Egor Lifar, Semyon Savkin, Or Ordentlich + 1 more | 2026-03-06 | 🔢 math

Replaying pre-training data improves fine-tuning

This paper demonstrates that replaying generic pre-training data during fine-tuning significantly improves performance on target tasks by enhancing data efficiency and preventing catastrophic forgetting, even for less related domains.

Suhas Kotha, Percy Liang | 2026-03-06 | 🤖 cs.LG

Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation

The paper introduces Mixture of Universal Experts (MOUE), a novel Mixture-of-Experts architecture that scales model capacity by converting depth into "virtual width" through a universal expert pool shared across layers, utilizing a staggered rotational topology and specialized routing mechanisms to overcome scalability limits and outperform traditional MoE baselines.

Yilong Chen, Naibin Gu, Junyuan Shang + 8 more | 2026-03-06 | 🤖 cs.AI

Functionality-Oriented LLM Merging on the Fisher–Rao Manifold

This paper proposes a functionality-oriented model merging method that computes a weighted Karcher mean on the Fisher–Rao manifold via a practical fixed-point algorithm, effectively overcoming the representation collapse and scalability limitations of existing Euclidean-space approaches when combining multiple heterogeneous LLMs.

Jiayu Wang, Zuojun Ye, Wenpeng Yin | 2026-03-06 | 🤖 cs.LG
