Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy Optimization
この論文は、物理モデルの完全な方程式を必要とせず、ポテンシャルに基づく報酬整形とエネルギー感知型行動正則化を統合した軽量な手法「H-EARS」を提案し、深層強化学習の収束性、安定性、およびエネルギー効率を向上させる理論的基盤と実証結果を示しています。
Qijun Liao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Jue Yang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yiting Kang (School of Mechanical Engineering, University of Science and Technology Beijing, China), Xinxin Zhao (School of Mechanical Engineering, University of Science and Technology Beijing, China), Yong Zhang (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China), Mingan Zhao (Jiangsu XCMG Construction Machinery Research Institute Co., Ltd., China)2026-03-13🤖 cs.LG