Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy Optimization
Este artigo apresenta o H-EARS, uma metodologia unificada e leve que combina o modelamento de recompensas baseado em potencial com regularização de ação consciente de energia para otimizar políticas de aprendizado por reforço, garantindo convergência acelerada e eficiência energética sem exigir modelos dinâmicos completos.
Qijun Liao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Jue Yang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yiting Kang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Xinxin Zhao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yong Zhang (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China), Mingan Zhao (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China)2026-03-13🤖 cs.LG