Relative Humidity Prediction using XGBoost Machine Learning Model, Case Study: Bajgah Climatological Station, Iran
Subject Areas : Research Paper
Reza Piraei 1 , Ali Mohammadi 2 , Seied Hosein Afzali 3
1 - PhD Student of Water Resources Management, Department of Civil and Environmental Engineering, Shiraz University, Shiraz, Iran
2 - MSc Student of Water Resources Management, School of Civil and Environmental Engineering, Tarbiat Modares University, Tehran, Iran
3 - Associate Prof. of Civil Engineering, Department of Civil and Environmental Engineering, Shiraz University, Shiraz, Iran
Keywords: Bajgah, Machine Learning, Relative Humidity, XGBoost
Abstract :
Introduction: Relative humidity is one of the most important hydrological parameters; it significantly influences evapotranspiration, water resource management, plant growth, and even concrete setting. Hence, accurate prediction and estimation of relative humidity is of paramount importance.
Methods: Relative humidity and the minimum and maximum temperatures of the preceding days have the most significant impact on predicting future relative humidity, and in many parts of the country they are the only two parameters for which data are widely available. This study therefore examined various scenarios involving these parameters, and the best scenario for predicting relative humidity was obtained using the XGBoost model. To assess the accuracy of the model, the Bajgah region in Fars Province was chosen as a case study, and the accuracy of the different scenarios was compared using 30 years of data (1993 to 2023). Missing data were estimated using the KNN Imputer model. The Pearson correlation between the mean relative humidity one to ten days earlier and the target variable (predicted relative humidity on day t) was then calculated. Because the results showed that data from the fourth preceding day and earlier were insignificant, only data from one to three days before were used.
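The lag-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `lagged_pearson` and `build_lag_features` are hypothetical, the series is synthetic rather than Bajgah data, and the sketch assumes the correlation is taken between the value on day t and the value on day t - k for each lag k.

```python
import numpy as np

def lagged_pearson(series, max_lag=10):
    """Pearson correlation between series[t] and series[t - k], for k = 1..max_lag."""
    x = np.asarray(series, dtype=float)
    result = {}
    for k in range(1, max_lag + 1):
        target = x[k:]    # value on day t
        lagged = x[:-k]   # value on day t - k
        result[k] = float(np.corrcoef(target, lagged)[0, 1])
    return result

def build_lag_features(columns, lags=(1, 2, 3)):
    """Stack lagged copies of each column; row i corresponds to day t = i + max(lags)."""
    max_lag = max(lags)
    n = len(columns[0])
    feats = [np.asarray(c, dtype=float)[max_lag - k : n - k]
             for c in columns for k in lags]
    return np.column_stack(feats)

# Synthetic stand-in for a daily relative humidity series (assumption, not station data):
# a first-order autoregressive process around 50 %.
rng = np.random.default_rng(0)
rh = np.empty(2000)
rh[0] = 50.0
for t in range(1, rh.size):
    rh[t] = 50.0 + 0.7 * (rh[t - 1] - 50.0) + rng.normal(scale=5.0)

corr = lagged_pearson(rh, max_lag=10)   # correlation for lags 1..10
X = build_lag_features([rh], lags=(1, 2, 3))  # inputs from days t-1, t-2, t-3
y = rh[3:]                              # target: value on day t
```

In a scenario that also uses temperature, the minimum and maximum temperature series would be passed alongside `rh` in the `columns` list, giving three lagged inputs per variable.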
Findings and Conclusion: Comparing the results on six statistical criteria (RMSE, MAE, MARE, MXARE, NSE, and R²) showed that the scenario based on relative humidity and the maximum and minimum temperatures of the preceding 3 days provides the best estimation.
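For readers unfamiliar with the six criteria, the sketch below shows one common way to compute them. The abstract does not give the formulas, so the definitions here are assumptions: MARE is taken as the mean absolute relative error, MXARE as the maximum absolute relative error, NSE as the Nash-Sutcliffe efficiency, and R² as the squared Pearson correlation. The function name `evaluate` and the sample values are hypothetical.

```python
import numpy as np

def evaluate(obs, pred):
    """Return the six criteria for observed and predicted values.

    Assumes no observed value is zero, since MARE and MXARE divide by it.
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred
    rel = np.abs(err) / np.abs(obs)  # absolute relative error per sample
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "MARE": float(np.mean(rel)),
        "MXARE": float(np.max(rel)),
        "NSE": float(1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)),
        "R2": float(np.corrcoef(obs, pred)[0, 1] ** 2),
    }

# Illustrative values only (not results from the study).
observed = [40.0, 50.0, 60.0, 80.0]
predicted = [44.0, 50.0, 57.0, 76.0]
scores = evaluate(observed, predicted)
```

Lower values are better for RMSE, MAE, MARE, and MXARE, while NSE and R² approach 1 for a good fit, so scenarios are typically ranked on all six together rather than on any single criterion.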