Applying reinforcement learning to optical cavity locking tasks: considerations on actor-critic architectures and real-time hardware implementation
This paper studies the application of deep reinforcement learning, specifically Deep Deterministic Policy Gradient (DDPG) trained in a custom Gymnasium environment, to the autonomous locking of Fabry-Perot optical cavities in non-linear regimes for gravitational-wave detectors. It also discusses architectural improvements and strategies for real-time hardware implementation.
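To make the setting concrete, the following is a minimal sketch of what a cavity-locking environment could look like. It is not the environment used in this work. It is written in plain Python and only mirrors the Gymnasium `reset`/`step` return conventions without importing the library. The class name `ToyCavityEnv`, the parameters `finesse`, `max_shift`, `drift_std` and `max_steps`, the single-value observation, and the reward (normalized transmitted power) are all illustrative assumptions. The cavity response uses the ideal Airy function with the high-finesse approximation for the coefficient of finesse.

```python
import math
import random


def airy_transmission(detuning, finesse):
    """Normalized transmitted power of an ideal Fabry-Perot cavity.

    `detuning` is the round-trip phase offset from resonance, in radians.
    Uses the high-finesse approximation F = (2 * finesse / pi) ** 2.
    """
    coeff = (2.0 * finesse / math.pi) ** 2
    return 1.0 / (1.0 + coeff * math.sin(detuning / 2.0) ** 2)


class ToyCavityEnv:
    """Hypothetical toy environment following the Gymnasium reset/step shape."""

    def __init__(self, finesse=100.0, max_shift=0.05, drift_std=0.01,
                 max_steps=200, seed=None):
        self.finesse = finesse
        self.max_shift = max_shift    # largest phase correction per step (rad)
        self.drift_std = drift_std    # std of the random phase disturbance (rad)
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.detuning = 0.0
        self.steps = 0

    def reset(self):
        # Start anywhere in one free spectral range, usually far from resonance.
        self.detuning = self.rng.uniform(-math.pi, math.pi)
        self.steps = 0
        return [airy_transmission(self.detuning, self.finesse)], {}

    def step(self, action):
        # Continuous action clipped to [-1, 1], as a DDPG actor would emit.
        a = max(-1.0, min(1.0, float(action)))
        self.detuning += self.max_shift * a + self.rng.gauss(0.0, self.drift_std)
        self.steps += 1
        power = airy_transmission(self.detuning, self.finesse)
        reward = power                # assumed reward: transmitted power
        terminated = False            # no terminal state in this toy model
        truncated = self.steps >= self.max_steps
        return [power], reward, terminated, truncated, {}
```

Far from resonance the transmitted power is nearly flat and carries almost no information about the direction of the detuning, which is one way to see why lock acquisition is a non-linear problem and why a learned continuous-action policy is a natural candidate.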