Comparative Machine Learning and Time Series Forecasting of Wind Power Output using SCADA Data
DOI: https://doi.org/10.32996/jcsts.2025.7.7.2
Keywords: Machine Learning, Time Series, Wind Power Forecasting, SCADA, Output Prediction
Abstract
The rapid growth of wind energy as a renewable power source presents both opportunities and operational challenges because wind generation is inherently intermittent and non-linear. Accurate short-term forecasting of wind power output is vital for improving grid reliability, reducing operating costs, and enhancing energy market efficiency. This study presents a comparative analysis of traditional time series models and modern machine learning algorithms for short-term wind power forecasting. Using high-frequency SCADA data from a commercial wind turbine in Turkey, we evaluate the predictive performance of nine models, including Seasonal Naive, Exponential Smoothing (ETS), ARIMA, ARIMAX, Dynamic Harmonic Regression, Linear Regression, Gradient Boosted Trees (GBM), and Generalized Additive Models (GAM). Root Mean Square Error (RMSE) on a hold-out test set is the primary accuracy metric. Among the models analyzed, the Gradient Boosted Tree model performed best, with the lowest RMSE of 16.05, followed by ARIMAX (RMSE = 20.8), GAM (RMSE = 29.7), and linear regression using external meteorological inputs (RMSE = 29.9). Traditional statistical models such as ETS and ARIMA performed comparatively worse, particularly in handling non-linear patterns and multiple seasonalities. The study highlights the importance of integrating exogenous variables (such as wind speed, wind direction, and the theoretical power curve) into forecasting frameworks to capture complex relationships and improve accuracy. Our findings indicate that ensemble learning methods and hybrid statistical-ML approaches offer meaningful improvements in renewable energy forecasting. These insights provide guidance for energy planners, system operators, and other stakeholders seeking to optimize renewable integration and grid stability. Future work can explore deep learning architectures and spatiotemporal models across multi-turbine datasets for broader applicability.
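To make the evaluation protocol concrete, the following is a minimal sketch of how hold-out RMSE can be computed for a seasonal naive baseline. It is not the authors' code: the synthetic series, the helper names `rmse` and `seasonal_naive`, and the season length of 144 observations (one day of 10-minute readings) are all assumptions made for illustration, since the abstract does not state the sampling interval or preprocessing.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def seasonal_naive(train, horizon, season_length):
    """Forecast by repeating the last full season observed in `train`."""
    last_season = np.asarray(train, dtype=float)[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_season, reps)[:horizon]


# Synthetic stand-in for a turbine power series (NOT the SCADA data).
# Assumption: 144 observations per day, i.e. 10-minute readings.
rng = np.random.default_rng(0)
season_length = 144
t = np.arange(season_length * 10)
power = 50 + 20 * np.sin(2 * np.pi * t / season_length) + rng.normal(0, 5, t.size)

# Hold out the final day as the test set, as in a hold-out evaluation.
train, test = power[:-season_length], power[-season_length:]
forecast = seasonal_naive(train, len(test), season_length)
baseline_rmse = rmse(test, forecast)
print(f"Seasonal naive hold-out RMSE: {baseline_rmse:.2f}")
```

The same `rmse` helper could be applied to forecasts from any of the other models, so that all candidates are scored on an identical test window.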