From ARIMA to Attention: Power Load Forecasting Using Temporal Deep Learning
This paper empirically demonstrates that a Transformer model using self-attention outperforms a traditional ARIMA baseline and recurrent neural network approaches (LSTM, BiLSTM) in short-term power load forecasting on PJM data. The Transformer achieves a mean absolute percentage error (MAPE) of 3.8%, the lowest among the compared models, highlighting the effectiveness of attention-based architectures for capturing complex temporal dependencies.
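To make the two building blocks named above concrete, the sketch below implements scaled dot-product self-attention (the core of the Transformer) over a window of hourly load embeddings, together with the MAPE metric used for evaluation. This is a minimal NumPy illustration under assumed shapes and made-up values, not the paper's model or data.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

# Hypothetical window: 24 hourly time steps, each an 8-dim embedding
rng = np.random.default_rng(0)
window = rng.normal(size=(24, 8))
out, w = scaled_dot_product_attention(window, window, window)
assert out.shape == (24, 8)
assert np.allclose(w.sum(axis=-1), 1.0)  # each row of weights is a distribution

# MAPE on made-up actual/forecast load values (MW)
print(round(mape([100, 200, 400], [96, 208, 412]), 2))  # → 3.67
```

In self-attention each time step attends to every other step in the window, which is what lets the model weigh distant hours (e.g., the same hour yesterday) directly rather than through a recurrent state.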