Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy Optimization
This study presents H-EARS, a unified, lightweight method that combines potential-based reward shaping with energy-aware action regularization to accelerate convergence and improve energy efficiency in deep reinforcement learning, without requiring full dynamics models.
Qijun Liao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Jue Yang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yiting Kang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Xinxin Zhao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yong Zhang (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China), Mingan Zhao (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China)
2026-03-13, cs.LG
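
The summary above names two ingredients: potential-based reward shaping and an energy-aware penalty on actions. The sketch below shows one generic way these could be combined in a single reward. It is not the authors' H-EARS formulation. The quadratic potential, the squared-action energy proxy, and the names `potential`, `shaped_reward`, `alpha`, and `beta` are all assumptions made for illustration.

```python
from typing import Sequence


def potential(state: Sequence[float]) -> float:
    """Hypothetical potential: negative half squared distance of the state to the origin.

    Assumption for illustration only; the summary does not specify the potential.
    """
    return -0.5 * sum(x * x for x in state)


def shaped_reward(
    reward: float,
    state: Sequence[float],
    next_state: Sequence[float],
    action: Sequence[float],
    gamma: float = 0.99,
    alpha: float = 1.0,   # hypothetical weight on the shaping term
    beta: float = 0.01,   # hypothetical weight on the action-energy penalty
    done: bool = False,
) -> float:
    """Environment reward plus potential-based shaping minus an action-energy penalty."""
    # Standard potential-based shaping term: gamma * Phi(s') - Phi(s).
    # The potential of a terminal state is taken as zero (a common convention).
    phi_next = 0.0 if done else potential(next_state)
    shaping = gamma * phi_next - potential(state)

    # Squared action magnitude as a crude stand-in for actuation energy (assumption).
    energy_penalty = beta * sum(a * a for a in action)

    return reward + alpha * shaping - energy_penalty


# Moving from [1, 0] to the origin with action [2] and no environment reward.
example = shaped_reward(0.0, [1.0, 0.0], [0.0, 0.0], [2.0], gamma=1.0, beta=0.1)
```

The shaping term depends only on a scalar function of the state, so no dynamics model is needed to compute it. The energy penalty, unlike the shaping term, does change which policy is optimal, so `beta` trades task reward against actuation effort.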