Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery

This paper establishes the first global linear convergence guarantees for a dynamic smoothing variant of Iteratively Reweighted Least Squares (IRLS) in robust subspace and affine subspace recovery, extending these theoretical results to nonconvex optimization on Riemannian manifolds and demonstrating their practical utility in low-dimensional neural network training.

Gilad Lerman, Kang Li, Tyler Maunu, Teng Zhang · Wed, 11 Ma · cs.LG
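As background for the entry above: classical IRLS for robust subspace recovery alternates a weighted PCA step with a weight update that downweights points far from the current subspace, and "dynamic smoothing" refers to shrinking the regularization floor on those weights across iterations. The sketch below illustrates that generic scheme only; the function name and the `delta0`/`shrink` schedule are illustrative assumptions, not the paper's algorithm or notation.

```python
import numpy as np

def irls_subspace(X, d, n_iter=50, delta0=1.0, shrink=0.7):
    """Generic IRLS sketch for robust d-dimensional subspace recovery.

    X: (n, D) data matrix. Returns an orthonormal basis V of shape (D, d).
    delta0/shrink implement a simple geometric smoothing schedule
    (a stand-in for the paper's dynamic smoothing, not its exact rule).
    """
    delta = delta0
    # Initialize with ordinary PCA: top-d right singular vectors of X.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T                                  # (D, d), orthonormal columns
    for _ in range(n_iter):
        # Distance of each point to the current subspace.
        residual = X - X @ V @ V.T
        dist = np.linalg.norm(residual, axis=1)
        # Smoothed least-absolute-deviation weights: w_i = 1 / max(dist_i, delta).
        w = 1.0 / np.maximum(dist, delta)
        # Weighted PCA step: top-d eigenvectors of the weighted covariance.
        C = (X * w[:, None]).T @ X                # sum_i w_i x_i x_i^T (symmetric)
        _, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
        V = eigvecs[:, -d:]                       # columns = top-d eigenvectors
        delta *= shrink                           # dynamic smoothing: shrink floor
    return V
```

With a majority of inliers on a low-dimensional subspace and moderate outliers, the shrinking floor lets the weights sharpen toward the least-absolute-deviation objective without dividing by near-zero residuals.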

Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning

This paper introduces two novel model-free algorithms, Q-EarlySettled-LowCost and FedQ-EarlySettled-LowCost, for single-agent and federated reinforcement learning that simultaneously achieve near-optimal regret, linear burn-in costs in state and action spaces, and logarithmic policy switching or communication costs, while also providing improved gap-dependent theoretical guarantees.

Haochen Zhang, Zhong Zheng, Lingzhou Xue · Wed, 11 Ma · cs.LG
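For context, the model-free baseline these algorithms build on is tabular Q-learning with a horizon-aware learning rate. The sketch below is only that baseline on a toy chain MDP; Q-EarlySettled-LowCost's variance-reduced reference estimates, low-switching policy updates, and federated aggregation are not reproduced here, and the MDP and function name are illustrative assumptions.

```python
import numpy as np

def q_learning_chain(n_states=5, episodes=3000, H=10, gamma=0.9, seed=0):
    """Minimal tabular Q-learning on a toy chain MDP (illustrative only).

    Action 1 moves right, action 0 moves left; reward 1 for entering (or
    staying at) the last state. Uses the horizon-aware learning rate
    alpha_t = (H + 1) / (H + t) common in episodic Q-learning analyses,
    with a uniform-random behavior policy (Q-learning is off-policy, so
    it still converges toward the optimal Q-values).
    """
    rng = np.random.default_rng(seed)
    n_actions = 2
    Q = np.zeros((n_states, n_actions))
    counts = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0
        for _ in range(H):
            a = int(rng.integers(n_actions))
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            counts[s, a] += 1
            alpha = (H + 1) / (H + counts[s, a])
            # Standard Q-learning target: r + gamma * max_a' Q(s', a').
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q
```

The "burn-in cost" the paper targets is roughly the number of samples such an update rule needs before its regret becomes near-optimal; the linear-in-|S||A| guarantee improves on baselines whose burn-in scales with |S||A| products of higher order.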

Prognostics for Autonomous Deep-Space Habitat Health Management under Multiple Unknown Failure Modes

This paper proposes an unsupervised prognostics framework that utilizes unlabeled run-to-failure data to simultaneously identify latent failure modes and select informative sensors, thereby enabling accurate remaining useful life prediction for autonomous deep-space habitats under multiple unknown failure conditions.

Benjamin Peters, Ayush Mohanty, Xiaolei Fang, Stephen K. Robinson, Nagi Gebraeel · Wed, 11 Ma · cs.LG

What Do We Care About in Bandits with Noncompliance? BRACE: Bandits with Recommendations, Abstention, and Certified Effects

This paper introduces BRACE, a parameter-free algorithm for multi-armed bandits with noncompliance that simultaneously optimizes recommendation welfare and treatment learning by performing certified instrumental variable inversion only when identification is strong, otherwise providing honest structural intervals to navigate the trade-offs between mediated and direct-control regimes.

Nicolás Della Penna · Wed, 11 Ma · cs.LG
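The "instrumental variable inversion" mentioned above is, in its classical form, the Wald estimator: when the recommendation z acts as an instrument for the treatment a actually taken, the treatment effect is the intent-to-treat effect divided by the first-stage compliance effect. The sketch below shows only that textbook point estimate, not BRACE itself; its certification step, abstention logic, and structural intervals are not reproduced, and the function name is an illustrative assumption.

```python
import numpy as np

def wald_iv_estimate(z, a, y):
    """Classic Wald (instrumental-variable) estimator.

        effect = (E[y | z=1] - E[y | z=0]) / (E[a | z=1] - E[a | z=0])

    z: binary recommendation (instrument), a: binary treatment taken,
    y: observed outcome. Raises when the first stage is too weak to
    identify the effect (the regime where certified inversion would
    instead be withheld).
    """
    z, a, y = (np.asarray(v, dtype=float) for v in (z, a, y))
    itt = y[z == 1].mean() - y[z == 0].mean()          # intent-to-treat effect
    first_stage = a[z == 1].mean() - a[z == 0].mean()  # compliance strength
    if abs(first_stage) < 1e-12:
        raise ValueError("instrument too weak: effect not identified")
    return itt / first_stage
```

Dividing by a weak first stage inflates variance without bound, which is why an algorithm in this setting must decide when inversion is safe and when to fall back to bounds on the effect.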

Verifying Good Regulator Conditions for Hypergraph Observers: Natural Gradient Learning from Causal Invariance via Established Theorems

This paper verifies that persistent observers in causally invariant hypergraph substrates satisfy the Conant-Ashby Good Regulator Theorem, thereby necessitating internal models that lead to natural gradient descent as the unique learning rule and yielding a model-dependent closed-form formula for Vanchurin's regime parameter α with a quantum-classical threshold at κ(F) = 2.

Max Zhuravlev · Wed, 11 Ma · cs.LG