Vector space models for trace clustering: a comparative study

  • Mateus Alex dos Santos Luna USP
  • André Paulino Lima USP
  • Thaís Rodrigues Neubauer USP
  • Marcelo Fantinato USP
  • Sarajane Marques Peres USP


Process mining explores event logs to offer valuable insights to business process managers. Some types of business processes, including unstructured and knowledge-intensive ones, are hard to mine. Trace clustering is therefore usually applied to break an event log into sublogs, each more amenable to typical process mining tasks. However, applying clustering algorithms involves decisions, such as how traces are represented, that affect the quality of the results. In this paper, we compare four vector space models for trace clustering, using them with an agglomerative clustering algorithm on synthetic and real-world event logs. Our analyses suggest that the embeddings-based vector space model can properly handle trace clustering in unstructured processes.
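
To make the pipeline concrete, the following is a minimal sketch of trace clustering over a vector space model. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the count-based bag-of-activities representation, the cosine distance, the average linkage, and the toy traces are all chosen here for the example, and the abstract does not say whether any of them matches the four models or the settings actually compared.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def bag_of_activities(traces):
    """Map each trace (a list of activity labels) to an activity-count vector.

    Assumption: a simple count-based representation used for illustration.
    """
    vocab = sorted({activity for trace in traces for activity in trace})
    vectors = []
    for trace in traces:
        counts = Counter(trace)
        vectors.append([counts[activity] for activity in vocab])
    return vocab, vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per vector."""
    clusters = [[i] for i in range(len(vectors))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise distance between members of the two clusters.
            total = sum(
                cosine_distance(vectors[i], vectors[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical toy event log: each trace is a sequence of activity labels.
traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "check", "approve", "pay"],
    ["register", "reject", "notify"],
    ["register", "check", "reject", "notify"],
]
vocab, vectors = bag_of_activities(traces)
labels = agglomerative(vectors, n_clusters=2)
print(labels)
```

Each resulting cluster corresponds to a sublog that could then be mined separately. Replacing `bag_of_activities` with another representation, for example one built from learned embeddings, changes only the vectors passed to `agglomerative`, which is the kind of decision the comparison above is about.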


LUNA, Mateus Alex dos Santos; LIMA, André Paulino; NEUBAUER, Thaís Rodrigues; FANTINATO, Marcelo; PERES, Sarajane Marques. Vector space models for trace clustering: a comparative study. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 18., 2021, Online Event.

