Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery

This paper establishes the first global linear convergence guarantees for a dynamic smoothing variant of Iteratively Reweighted Least Squares (IRLS) in robust subspace and affine subspace recovery, extending these theoretical results to nonconvex optimization on Riemannian manifolds and demonstrating their practical utility in low-dimensional neural network training.

Gilad Lerman, Kang Li, Tyler Maunu, Teng Zhang · Wed, 11 Ma · cs.LG
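
The abstract doesn't reproduce the iteration, so here is a minimal numpy sketch of the classical IRLS scheme for robust subspace recovery with a shrinking smoothing parameter. The weight rule and geometric schedule below are textbook choices, not necessarily the paper's exact dynamic-smoothing variant.

```python
import numpy as np

def irls_subspace(X, d, n_iter=100, eps0=1.0, decay=0.9):
    """Recover a d-dimensional subspace from the rows of X via IRLS.

    Targets the least-absolute-deviations objective sum_i dist(x_i, L)
    by iteratively reweighted PCA; eps is a smoothing parameter shrunk
    each iteration (the "dynamic smoothing" idea, schedule assumed).
    """
    n, D = X.shape
    # initialize with ordinary PCA (top-d right singular vectors)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T                      # D x d orthonormal basis
    eps = eps0
    for _ in range(n_iter):
        # distance of each point to the current subspace
        resid = X - (X @ V) @ V.T
        dist = np.linalg.norm(resid, axis=1)
        # smoothed l1 weights: w_i = 1 / max(dist_i, eps)
        w = 1.0 / np.maximum(dist, eps)
        # weighted PCA: top-d eigenvectors of sum_i w_i x_i x_i^T
        C = (X * w[:, None]).T @ X
        _, eigvecs = np.linalg.eigh(C)
        V = eigvecs[:, -d:]           # columns = top-d eigenvectors
        eps *= decay                  # dynamic smoothing schedule
    return V
```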

On the Width Scaling of Neural Optimizers Under Matrix Operator Norms I: Row/Column Normalization and Hyperparameter Transfer

This paper introduces a family of mean-normalized matrix operator norms to derive width-independent smoothness bounds for deep neural networks, leading to the development of MOGA, a row/column-normalized optimizer that enables stable hyperparameter transfer across model widths and outperforms Muon in speed while maintaining competitive performance.

Ruihan Xu, Jiajin Li, Yiping Lu · Wed, 11 Ma · cs.LG
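
MOGA's update rule is not given in this summary; the sketch below is one plausible reading of "row/column-normalized", assuming each row (or column) of the gradient matrix is rescaled to unit RMS norm before a scaled step. The function name and the normalization choice are assumptions, not the paper's definition.

```python
import numpy as np

def row_normalized_step(W, G, lr=0.02, eps=1e-8, axis=1):
    """One hypothetical MOGA-style step on a weight matrix W.

    Each row (axis=1) or column (axis=0) of the gradient G is rescaled
    to unit RMS norm, so the per-row update magnitude is insensitive to
    layer width, the kind of property the paper's mean-normalized
    operator norms are designed to certify.
    """
    rms = np.sqrt(np.mean(G**2, axis=axis, keepdims=True)) + eps
    return W - lr * G / rms
```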

Empirical universality and non-universality of local dynamics in the Sherrington-Kirkpatrick model

This paper empirically demonstrates that while the runtime of local greedy search for optimizing Sherrington-Kirkpatrick spin glass Hamiltonians is universal across various coupling distributions, the performance of Parisi's local reluctant search is surprisingly non-universal and sensitive to the specific entry distribution, particularly when couplings have discrete support.

Grace Liu, Dmitriy Kunisky · Tue, 10 Ma · math
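
For readers unfamiliar with the dynamics being compared, here is a minimal numpy sketch of local greedy search on the SK Hamiltonian, whose flip count is the "runtime" studied here; Parisi's reluctant search differs only in flipping the *smallest* improving spin rather than the largest. The coupling matrix J is assumed symmetric with zero diagonal.

```python
import numpy as np

def sk_greedy(J, sigma=None, rng=None):
    """Local greedy search maximizing the SK Hamiltonian
    H(sigma) = sum_{i<j} J_ij sigma_i sigma_j (couplings pre-scaled).

    Flips the spin with the largest energy gain until the state is
    1-spin-flip stable; returns the spins and the number of flips.
    Minimization is the symmetric problem with J replaced by -J.
    """
    rng = rng or np.random.default_rng(0)
    n = J.shape[0]
    if sigma is None:
        sigma = rng.choice([-1.0, 1.0], size=n)
    # local fields h_i = sum_j J_ij sigma_j; flipping spin i changes
    # H by -2 sigma_i h_i, so a flip improves H iff sigma_i h_i < 0
    h = J @ sigma
    flips = 0
    while True:
        gains = -2.0 * sigma * h
        i = int(np.argmax(gains))     # reluctant search: smallest gain > 0
        if gains[i] <= 0:
            return sigma, flips
        sigma[i] = -sigma[i]
        h += 2.0 * J[:, i] * sigma[i]  # update local fields after the flip
        flips += 1
```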

Robustness to Model Approximation, Model Learning From Data, and Sample Complexity in Wasserstein Regular MDPs

This paper establishes robustness bounds for discrete-time stochastic optimal control under Wasserstein model approximation, demonstrating that the performance loss of policies derived from approximate models is controlled by the Wasserstein-1 distance between transition kernels, thereby enabling rigorous sample-complexity analysis for models and noise distributions learned from empirical data, in settings where stronger convergence criteria may fail.

Yichen Zhou, Yanglei Song, Serdar Yüksel · Tue, 10 Ma · math
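
Schematically, and with constants and regularity conditions suppressed, the bounds take the following form, where the policy π̂ is optimal for the approximate kernel P̂; the precise constant in the paper depends on the discount factor and the Lipschitz moduli of the cost and kernel.

```latex
% Indicative form of the robustness bound (constants schematic):
\[
  J(\hat{\pi}, P) - J(\pi^{*}, P)
  \;\le\; C(\beta, \mathrm{Lip})\,
  \sup_{x,u} W_1\!\bigl(P(\cdot \mid x, u),\, \hat{P}(\cdot \mid x, u)\bigr),
\]
% so kernels learned from data need only converge in W_1, which can
% hold empirically even when total-variation convergence fails.
```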

Optimal Consumption and Portfolio Choice with No-Borrowing Constraint in the Kim-Omberg Model

This paper solves an intertemporal utility maximization problem with a no-borrowing constraint and stochastic excess returns in the Kim-Omberg framework by employing Lagrange duality to transform the primal problem into a dual singular control problem, which is then characterized via an auxiliary two-dimensional optimal stopping problem to derive optimal consumption and portfolio strategies.

Giorgio Ferrari, Tim Niclas Schütz · Tue, 10 Ma · math
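
For context, the Kim-Omberg framework takes the market price of risk to be an Ornstein-Uhlenbeck process; a standard statement of the dynamics (notation mine, not the paper's) is:

```latex
% Kim--Omberg market: risky asset with mean-reverting excess return
\[
  \frac{dS_t}{S_t} = \bigl(r + \sigma X_t\bigr)\,dt + \sigma\,dW_t,
  \qquad
  dX_t = \kappa\bigl(\bar{X} - X_t\bigr)\,dt + \sigma_X\,dB_t,
\]
% where X_t is the stochastic Sharpe ratio and W, B are correlated
% Brownian motions; the no-borrowing constraint caps the dollar
% amount invested in S at current wealth.
```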

New Results on the Polyak Stepsize: Tight Convergence Analysis and Universal Function Classes

This paper establishes the tightness of known convergence rates for the Polyak stepsize in gradient descent by constructing worst-case functions, demonstrates its ability to escape worst-case scenarios via floating-point errors, and proves its universality across diverse function classes including those with Hölder smoothness and growth conditions.

Chang He, Wenzhi Gao, Bo Jiang, Madeleine Udell, Shuzhong Zhang · Tue, 10 Ma · math
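
The rule under analysis is classical: when the optimal value f* is known, gradient descent steps by η_k = (f(x_k) − f*) / ‖∇f(x_k)‖². A minimal sketch:

```python
import numpy as np

def polyak_gd(f, grad, x0, f_star, n_iter=500, tol=1e-12):
    """Gradient descent with the Polyak stepsize
    eta_k = (f(x_k) - f_star) / ||grad f(x_k)||^2,
    which requires knowing the optimal value f_star."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad(x)
        gap = f(x) - f_star
        if gap <= tol or np.dot(g, g) == 0.0:
            break
        x = x - (gap / np.dot(g, g)) * g
    return x

# example: f(x) = ||x||^2 / 2 with f_star = 0
x = polyak_gd(lambda x: 0.5 * x @ x, lambda x: x, np.ones(5), 0.0)
```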

Radial and Non-Radial Solution Structures for Quasilinear Hamilton–Jacobi–Bellman Equations in Bounded Settings

This paper establishes the existence, uniqueness, and global regularity of positive classical solutions to quasilinear Hamilton–Jacobi–Bellman equations on bounded convex domains via a constructive weighted monotone iteration scheme, while providing a probabilistic derivation from controlled Itô diffusions and demonstrating applications in stochastic production planning and image restoration.

Dragos-Patru Covei · Tue, 10 Ma · math
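
The paper's weighted scheme targets a specific quasilinear HJB class; the sketch below only illustrates the generic monotone-iteration pattern on a toy 1-D semilinear stand-in, −u'' + u|u| = f, where every ingredient (equation, shift λ, discretization) is an assumption for illustration.

```python
import numpy as np

def monotone_iteration(f, n=200, lam=4.0, n_iter=60):
    """Monotone iteration for the toy problem
        -u'' + u|u| = f(x) on (0,1),  u(0) = u(1) = 0.

    With lam >= 2|u| on the relevant range, each linearized solve
        -u_{k+1}'' + lam * u_{k+1} = f - u_k|u_k| + lam * u_k
    is order-preserving, so the iterates increase monotonically from
    the subsolution u_0 = 0 toward the classical solution.
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    # tridiagonal matrix for -u'' + lam*u with Dirichlet conditions
    A = (np.diag(np.full(n, 2.0 / h**2 + lam))
         + np.diag(np.full(n - 1, -1.0 / h**2), 1)
         + np.diag(np.full(n - 1, -1.0 / h**2), -1))
    u = np.zeros(n)                        # subsolution u_0 = 0
    fx = f(x)
    for _ in range(n_iter):
        rhs = fx - u * np.abs(u) + lam * u
        u = np.linalg.solve(A, rhs)
    return x, u

x, u = monotone_iteration(lambda x: np.ones_like(x))
```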