Using Machine Learning to Classify Process Model Elements for Process Infrastructure Analysis


Context: Business Process Management (BPM) is an increasingly important discipline adopted by organizations to model, analyze, and implement their business processes. Analyzing business processes through process models makes them easier to understand. Problem: While many techniques assist and automate process modeling, the same is not true of process analysis, which may reduce the effectiveness and consistency of BPM in improving business processes. Solution: This paper reports on our methodology to train a machine learning algorithm to automatically analyze process model elements (e.g., activities, swimlanes, data objects) and classify them according to three categories of infrastructure information: (i) process participants; (ii) processed documents and information; (iii) systems, technologies, and tools. IS Theory: This work was conceived under the aegis of Organizational Learning Theory. The training of machine learning models seeks improvement over previous analysis methods through experimentation. Method: The research described in this paper is prescriptive and quantitative, organized in four phases: (i) model collection; (ii) dataset building; (iii) data evaluation; (iv) machine learning classifier development. Summary of Results: From a collection of 85 process models, three training datasets were created, with an average of 480 process activities each. We obtained accuracies between 88% and 96%, depending on the category analyzed. Contributions and Impact in the IS area: The main contribution of this research is the methodology developed to help automate business process analysis through machine learning training datasets and models. We expect this approach to help achieve more consistent results in the analysis of large process architectures.
Keywords: business process management, process model analysis, infrastructure, machine learning, text labeling
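
To make the classification task in the abstract concrete, the following is a minimal sketch of how the text label of a process model element could be assigned to one of the three infrastructure categories. It is not the classifier developed in the paper: the abstract does not name the algorithm or the features used, so a small multinomial Naive Bayes model with Laplace smoothing is swapped in here for illustration. The example labels, the category names (`participant`, `document`, `system`), and all function names are hypothetical and are not taken from the paper's datasets.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy labels for illustration only; NOT from the paper's datasets.
TRAIN = [
    ("sales clerk", "participant"),
    ("warehouse manager", "participant"),
    ("finance department", "participant"),
    ("purchase order", "document"),
    ("invoice form", "document"),
    ("delivery report", "document"),
    ("erp system", "system"),
    ("email client", "system"),
    ("crm database", "system"),
]

# Hypothetical held-out labels used to estimate accuracy.
TEST = [
    ("finance clerk", "participant"),
    ("purchase report", "document"),
    ("email system", "system"),
]


def tokenize(label):
    """Lowercase a label and split it on whitespace."""
    return label.lower().split()


def train(examples):
    """Count classes and per-class word frequencies (bag of words)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in examples:
        class_counts[category] += 1
        for token in tokenize(text):
            word_counts[category][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the category with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_category, best_score = None, -math.inf
    for category in sorted(class_counts):
        score = math.log(class_counts[category] / total)
        denom = sum(word_counts[category].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[category][token] + 1) / denom)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


def accuracy(model, examples):
    """Fraction of examples whose predicted category matches the gold one."""
    hits = sum(predict(model, text) == gold for text, gold in examples)
    return hits / len(examples)


MODEL = train(TRAIN)
```

Calling `predict(MODEL, "sales manager")` yields a category for an unseen label, and `accuracy(MODEL, TEST)` computes the kind of per-category accuracy figure the abstract reports, here on toy data only. A real setup would use the labeled elements extracted from the collected process models and a proper train/test split.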


AVILA, Diego Toralles; MOURA, Vitor Camargo de; THOM, Lucineia Heloisa. Using Machine Learning to Classify Process Model Elements for Process Infrastructure Analysis. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 19., 2023, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023.