SafetyRank: comparison of machine learning techniques for industrial safety alerts classification

  • Wander Fernandes Junior IFES
  • Karin Komati IFES
  • Kelly Gazolli UFES

Abstract


In industrial settings, safety alerts are commonly issued after accidents occur. In this context, this work compares machine learning techniques for text classification, using a database prepared by the authors with safety alerts extracted from public documents obtained from the internet. Five classical classifiers (KNN, SVM, Naive Bayes, Decision Tree, and Random Forest) were applied to the dataset. The best accuracy was obtained by SVM, at 79%, followed by Random Forest, at 75%. The results encourage further work, as a larger public database of accidents and safety alerts can increase the dissemination of knowledge and help reduce workplace accidents.
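The kind of comparison described in the abstract can be sketched in a few lines. The example below is only an illustration, not the authors' pipeline: it implements two of the five classifier families (k-nearest neighbours and multinomial Naive Bayes) in plain Python over bag-of-words counts, and the alert texts, the labels `fall` and `leak`, and all function names are hypothetical stand-ins for the paper's dataset and code.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy alerts and labels; the paper's dataset is not reproduced here.
TRAIN = [
    ("worker fell from scaffold during maintenance", "fall"),
    ("technician fell from ladder while climbing", "fall"),
    ("employee slipped and fell from platform", "fall"),
    ("gas leak detected near compressor valve", "leak"),
    ("oil leak from pipeline flange", "leak"),
    ("valve failure caused gas leak on deck", "leak"),
]
TEST = [
    ("operator fell from scaffold ladder", "fall"),
    ("pipeline valve leak of gas", "leak"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real pipelines do more)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Majority vote among the k training alerts most similar to `text`."""
    query = Counter(tokenize(text))
    ranked = sorted(
        train,
        key=lambda ex: cosine(query, Counter(tokenize(ex[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def nb_train(train):
    """Collect per-class document and word counts for multinomial Naive Bayes."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in train:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def nb_predict(model, text):
    """Pick the class with the highest log-posterior, using Laplace smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words unseen in training are ignored
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(predict, test):
    """Fraction of test alerts whose predicted label matches the true label."""
    return sum(1 for text, label in test if predict(text) == label) / len(test)


nb_model = nb_train(TRAIN)
results = {
    "kNN": accuracy(lambda t: knn_predict(TRAIN, t), TEST),
    "Naive Bayes": accuracy(lambda t: nb_predict(nb_model, t), TEST),
}
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.0%}")
```

On a toy set this small the scores say nothing about the relative merit of the methods. A real comparison would use a held-out split or cross-validation over the full alert collection, and would cover the remaining classifiers as well.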

Keywords: text classification, industrial safety, machine learning

Published
2020-06-30
FERNANDES JUNIOR, Wander; KOMATI, Karin; GAZOLLI, Kelly. SafetyRank: comparison of machine learning techniques for industrial safety alerts classification. In: NATIONAL COMPUTING MEETING OF FEDERAL INSTITUTES (ENCOMPIF), 7., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 37-44. ISSN 2763-8766. DOI: https://doi.org/10.5753/encompif.2020.11066.