Practical Regularized Quasi-Newton Methods with Inexact Function Values

This paper proposes a noise-tolerant regularized quasi-Newton method with a relaxed Armijo-type line search that achieves a global convergence rate of O(1/ε2)\mathcal{O}(1/\varepsilon^2) for smooth nonconvex optimization under inexact function values, demonstrating superior robustness and competitive efficiency in experiments involving both artificial noise and low-precision arithmetic.

Hiroki Hamaguchi, Naoki Marumo, Akiko Takeda

Published Thu, 12 Ma
📖 5 min read🧠 Deep dive

Imagine you are trying to find the lowest point in a vast, foggy valley (this is your optimization problem). Your goal is to get to the bottom as quickly as possible.

In the world of computer science, this is called minimizing a function. Usually, you have a "guide" (an algorithm) that tells you which way is down. The most popular guides are called Quasi-Newton methods (like L-BFGS). They are like expert hikers who not only look at the slope under their feet but also remember the shape of the terrain they've walked over to predict the best path forward. They are fast and efficient.

The Problem: The Fog of Noise
However, in the real world, things aren't perfect. Sometimes, your guide's map is blurry, or the compass is slightly off. This is numerical noise. It happens when computers use limited precision (like 16-bit or 32-bit numbers instead of super-precise 64-bit ones) or when simulations are inherently messy.

When the guide tries to use standard rules to decide how far to step, the "noise" makes the map look like a jagged, confusing mess. The guide might think a small hill is a deep valley, or vice versa. This causes the hiker to:

  1. Take steps that are too big or too small.
  2. Get confused and stop walking prematurely.
  3. Wander in circles, never reaching the bottom.

The Solution: A New Kind of Guide
The authors of this paper, Hamaguchi, Marumo, and Takeda, built a new, noise-tolerant guide. They call it a Regularized Quasi-Newton Method.

Here is how their new guide works, using simple analogies:

1. The "Safety Net" (Regularization)

Imagine your expert hiker is about to take a giant leap based on a shaky map. In a normal situation, they would leap. But in this new method, the hiker wears a safety net (called regularization).

  • How it works: If the map looks too confusing (too much noise), the safety net tightens. It forces the hiker to take smaller, more cautious steps. It prevents the hiker from making a catastrophic mistake based on bad data.
  • The Magic: The guide is smart enough to know when to loosen the net. If the map looks clear, the net disappears, and the hiker can sprint again using the fast, standard Quasi-Newton strategy.

2. The "Fuzzy Rulebook" (Relaxed Line Search)

Standard hikers have a strict rule: "You must go down at least 5 meters to take a step." If the map is noisy, the hiker might think they went down 5 meters when they actually went up, or vice versa. They get stuck.
The new guide uses a Fuzzy Rulebook.

  • How it works: Instead of demanding a perfect 5-meter drop, the rulebook says, "If you go down almost 5 meters, or if the noise makes it look like you went down, that's okay." It absorbs the error. It allows the hiker to keep moving even when the data is slightly "noisy," as long as the general direction is correct.

3. The "Memory of Mistakes" (Adaptive Scaling)

Sometimes, the noise is so bad that the guide needs to switch tactics entirely. The new guide has a backup plan inspired by a method called AdaGrad.

  • How it works: If the guide realizes the terrain is too chaotic to trust the map, it switches to a "blind but steady" mode. It remembers how much it has walked so far and adjusts its step size based on the total history of its journey, rather than trying to guess the immediate slope. This ensures that even in a storm, the hiker keeps making slow, steady progress toward the bottom.

The Results: Why It Matters

The authors tested this new guide on hundreds of problems (the CUTEst benchmark) and in different "weather conditions":

  • Perfect Weather (64-bit precision): The new guide is just as fast as the old, standard guides. It doesn't slow you down when things are easy.
  • Foggy Weather (32-bit and 16-bit precision): This is where the magic happens. Standard guides often get lost, stop, or fail. The new guide, with its safety net and fuzzy rulebook, keeps walking steadily and finds the bottom reliably.
  • Artificial Noise: Even when they added random static to the data, the new guide outperformed everyone else.

The Bottom Line

Think of this paper as inventing a self-driving car that can drive safely in a blizzard.

  • Old cars (standard algorithms) rely on perfect GPS and clear roads. If the GPS glitches, they crash or stop.
  • This new car (the proposed method) has sensors that know the GPS is glitching. It automatically slows down, trusts its other sensors, and keeps driving safely to the destination, even when the road is messy.

It proves that you don't have to sacrifice speed for stability. You can have a car that drives fast on the highway but knows exactly how to handle the snow.