Evaluating ML models for lightning forecasting in Brazil


Instruments for monitoring severe meteorological phenomena (such as lightning, flooding, and landslides) can support decision-making by state agencies seeking to mitigate their harmful effects. These phenomena usually occur suddenly, last a short time, and affect a limited region, which makes them difficult to predict with regular weather forecast models and calls for dedicated prediction systems. Very short-term weather forecasting systems, with lead times on the order of a few hours and known as nowcasting, can include numerical models of physical phenomena and machine learning algorithms. This work presents a system for forecasting the incidence of lightning, a common phenomenon in electrically active storms, through the application and evaluation of two machine learning models: an Artificial Neural Network and a Random Forest. Both models were able to detect the occurrence of atmospheric electrical discharges by automatically recognizing patterns in data generated by numerical weather forecasts. The Random Forest model gave the best results when trained on the set containing the ten best-correlated variables, reaching 99.77% accuracy in the case study performed.
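
The workflow summarized above (keep the ten best-correlated variables, then train and compare a Random Forest and a neural network) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses synthetic data in place of the numerical weather forecast variables and lightning labels, and assumes that "best correlated" means the largest absolute Pearson correlation with the binary target. The helper `top_k_correlated` and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def top_k_correlated(X, y, k=10):
    """Hypothetical helper: indices of the k columns of X with the
    largest absolute Pearson correlation with the target y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns get correlation 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    return np.argsort(corr)[::-1][:k]


# Synthetic stand-in (assumption): 30 candidate predictors and a binary
# label playing the role of "lightning occurred / did not occur".
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Select variables on the training split only, to avoid leaking test data.
selected = top_k_correlated(X_train, y_train, k=10)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # The neural network is sensitive to feature scale, so inputs are standardized.
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train[:, selected], y_train)
    scores[name] = model.score(X_test[:, selected], y_test)
    print(f"{name}: accuracy = {scores[name]:.4f}")
```

The accuracies printed for this synthetic data have no relation to the 99.77% reported for the case study. The sketch only shows the shape of the comparison.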

Keywords: Machine learning, lightning forecasting, severe weather


BASSANELLI PEREIRA, Arielle dos Santos; FAZENDA, Álvaro L.; CALHEIROS, Alan James Peixoto. Evaluating ML models for lightning forecasting in Brazil. In: WORKSHOP ON DATA-DRIVEN EXTREME EVENTS ANALYTICS (DEXEA) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 37., 2022, Búzios. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 187-192. DOI: https://doi.org/10.5753/sbbd_estendido.2022.21863.