On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD

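Through theoretical analysis and experiments, this paper shows that label noise SGD in two-layer over-parameterized linear networks drives the model from the "lazy" regime to the "rich" regime and strengthens the alignment of the weights with the ground-truth interpolator, which explains how it improves generalization. The finding is also extended to broader optimization algorithms such as Sharpness-Aware Minimization (SAM).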

Tongcheng Zhang, Zhanpeng Zhou, Mingze Wang, Andi Han, Wei Huang, Taiji Suzuki, Junchi Yan · 2026-03-12 · 🤖 cs.LG

The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training

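This paper argues that numerical instability in low-bit training of large language models is driven mainly by a rank-one mean bias, and proposes removing that bias through simple mean subtraction, which significantly improves the stability and performance of FP4-quantized training without requiring a complex SVD decomposition.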

Hengjie Cao, Zhendong Huang, Mengyi Chen, Yifeng Yang, Fanqi Yu, Ruijun Huang, Fang Dong, Xin Zhang, Jixian Zhou, Anrui Chen, Mingzhi Dong, Yujiang Wang, Jinlong Hou, Qin Lv, Yuan Cheng, Tun Lu, Fan Yang, Li Shang · 2026-03-12 · 🤖 cs.LG

Spatio-Temporal Forecasting of Retaining Wall Deformation: Mitigating Error Accumulation via Multi-Resolution ConvLSTM Stacking Ensemble

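This study proposes a multi-resolution ConvLSTM ensemble framework that fuses input data at different temporal scales, effectively mitigating error accumulation and significantly improving the accuracy and stability of long-horizon forecasts of retaining wall deformation during foundation pit excavation.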

Jihoon Kim (Department of Civil and Environmental Engineering, Hongik University, Seoul, Republic of Korea), Heejung Youn (Department of Civil and Environmental Engineering, Hongik University, Seoul, Republic of Korea) · 2026-03-12 · 🤖 cs.LG

Beam-Plasma Collective Oscillations in Intense Charged-Particle Beams: Dielectric Response Theory, Langmuir Wave Dispersion, and Unsupervised Detection via Prometheus

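This paper derives the Langmuir wave dispersion relation for intense charged-particle beams by building a kinetic field-theoretic framework based on the Vlasov-Poisson system, and uses the Prometheus unsupervised learning model to verify collective oscillation signatures including the plasma frequency, anomalous beam broadening, and Friedel oscillations.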

Brandon Yee, Wilson Collins, Michael Iofin, Jiayi Fu · 2026-03-12 · 🔬 physics

Muscle Synergy Priors Enhance Biomechanical Fidelity in Predictive Musculoskeletal Locomotion Simulation

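This study proposes a physiologically informed framework that embeds muscle synergy priors into reinforcement learning. By constraining control to a low-dimensional synergy basis, it significantly improves the biomechanical fidelity and generalization of predictive musculoskeletal gait simulation across speeds, slopes, and terrains.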

Ilseung Park (Carnegie Mellon University), Eunsik Choi (Seoul National University), Jangwhan Ahn (UNC-Chapel Hill and NC State University), Jooeun Ahn (Seoul National University) · 2026-03-12 · 🤖 cs.LG

VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

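This paper proposes the VERI-DPO framework, which uses a claim verifier to mine preference data from retrieval-augmented evidence and combines it with Direct Preference Optimization (DPO). The approach significantly improves the faithfulness of clinical summaries, reducing the unsupported-claim rate from 10.7% to 1.9%.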

Weixin Liu, Congning Ni, Qingyuan Song, Susannah L. Rose, Christopher Symons, Murat Kantarcioglu, Bradley A. Malin, Zhijun Yin · 2026-03-12 · 💬 cs.CL

Resource-constrained Amazons chess decision framework integrating large language models and graph attention

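This paper proposes a lightweight hybrid framework that combines a graph attention autoencoder with a large language model (GPT-4o-mini). It uses structural reasoning to denoise LLM-generated data and optimizes Monte Carlo tree search, achieving high-performance decision-making in the Game of the Amazons under resource constraints and surpassing both the baselines and the teacher model.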

Tianhao Qian, Zhuoxuan Li, Jinde Cao, Xinli Shi, Hanjie Liu, Leszek Rutkowski · 2026-03-12 · 🤖 cs.AI

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

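This paper introduces IH-Challenge, a reinforcement learning training dataset aimed at the robustness problem of instruction-hierarchy conflicts in large language models. Fine-tuning on it significantly improves model safety under adversarial attacks as well as instruction-following ability, and the dataset is open-sourced to advance related research.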

Chuan Guo (Michael Pokorny), Juan Felipe Ceron Uribe (Michael Pokorny), Sicheng Zhu (Michael Pokorny), Christopher A. Choquette-Choo (Michael Pokorny), Steph Lin (Michael Pokorny), Nikhil Kandpal (Michael Pokorny), Milad Nasr (Michael Pokorny), Rai (Michael Pokorny), Sam Toyer, Miles Wang, Yaodong Yu, Alex Beutel, Kai Xiao · 2026-03-12 · 🤖 cs.AI