How to properly initialize Gaussian Mixture Models with Optimum-Path Forest
Abstract
In this paper, we propose a fast and scalable unsupervised Optimum-Path Forest for improving the initialization of Gaussian mixture models. Taking advantage of Optimum-Path Forest properties such as its on-the-fly estimation of the number of clusters and its intrinsically non-parametric nature, we exploit the k Approximate Nearest Neighbors graph to build its adjacency relation, which enables it not only to initialize the Expectation-Maximization algorithm but also to cluster large datasets. Experiments conducted on eight datasets indicate that the proposed approach encodes Gaussian parameters more naturally and intuitively than other clustering algorithms such as k-means. Furthermore, the proposed approach has shown good scalability, making it a viable alternative to traditional Optimum-Path Forest clustering.
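
To make the idea of encoding Gaussian parameters from a clustering concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hard cluster labeling is already available (here a hand-made `labels` array stands in for the output of any clustering algorithm; Optimum-Path Forest itself is not implemented) and derives the mixing weights, means, and covariances that could seed Expectation-Maximization. The function name `gmm_params_from_labels` and the ridge term `reg` are illustrative assumptions.

```python
import numpy as np


def gmm_params_from_labels(X, labels, reg=1e-6):
    """Estimate initial GMM parameters from hard cluster labels.

    Hypothetical helper: one Gaussian component per distinct label.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_samples, n_features = X.shape

    weights, means, covariances = [], [], []
    for cluster in np.unique(labels):
        members = X[labels == cluster]
        # Mixing weight: fraction of samples assigned to this cluster.
        weights.append(len(members) / n_samples)
        # Component mean: centroid of the cluster.
        mean = members.mean(axis=0)
        means.append(mean)
        # Maximum-likelihood covariance, plus a small ridge (an assumption
        # made here) so the matrix stays invertible for tiny clusters.
        centered = members - mean
        cov = centered.T @ centered / len(members)
        covariances.append(cov + reg * np.eye(n_features))

    return np.array(weights), np.array(means), np.array(covariances)


# Synthetic data: two blobs, with labels standing in for a clustering result.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),
    rng.normal(5.0, 1.0, size=(30, 2)),
])
labels = np.array([0] * 50 + [1] * 30)

weights, means, covariances = gmm_params_from_labels(X, labels)
```

The number of components is taken from the labeling rather than fixed in advance, which mirrors the on-the-fly cluster-count estimation described above. The resulting arrays can then be handed to any Expectation-Maximization routine that accepts initial weights, means, and covariances.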