An Improvement Proposal for the Greedy Algorithm in Gaussian Mixture Estimation
Abstract
This work proposes modifications to the stopping criterion of the greedy algorithm for Gaussian mixtures, in order to improve the accuracy of the search for the optimal number of mixture components. The stopping criterion is modified to use a sampling-based multivariate normality test: the algorithm stops when all mixture components pass the proposed test. The modified algorithm is compared with the original one, which uses a parsimony criterion as the stopping criterion. Numerical simulation results suggest an improvement in accuracy when the stopping criterion proposed in this work is used.
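The idea described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: it refits a full mixture at each size rather than performing the incremental component insertion of the true greedy algorithm, and it uses a common surrogate for a multivariate normality test (a KS test checking that squared Mahalanobis distances within each component follow a chi-squared distribution). All function names and the `alpha` threshold are assumptions for illustration.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture


def components_pass_normality(gmm, X, alpha=0.05):
    """Surrogate multivariate normality check for each fitted component.

    Under normality, the squared Mahalanobis distances of a component's
    points from its mean follow a chi-squared distribution with d degrees
    of freedom; a KS test compares the empirical distances against it.
    """
    labels = gmm.predict(X)
    d = X.shape[1]
    for k in range(gmm.n_components):
        Xk = X[labels == k]
        if len(Xk) <= d:  # too few points assigned to test this component
            return False
        diff = Xk - gmm.means_[k]
        prec = np.linalg.inv(gmm.covariances_[k])
        # squared Mahalanobis distance of each point to the component mean
        m2 = np.einsum('ij,jk,ik->i', diff, prec, diff)
        _, p_value = stats.kstest(m2, stats.chi2(df=d).cdf)
        if p_value < alpha:  # component rejected as non-Gaussian
            return False
    return True


def greedy_gmm(X, max_components=10, alpha=0.05, seed=0):
    """Grow the mixture until every component passes the normality test."""
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        if components_pass_normality(gmm, X, alpha):
            return gmm
    return gmm  # fall back to the largest mixture tried
```

The contrast with a parsimony-based criterion (e.g. stopping when BIC stops decreasing) is that the test above is a per-component goodness-of-fit check rather than a global model-complexity trade-off.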
Published
2011-07-19
How to Cite
LEMOS, Andre Paim; BRAGA, Antonio Pádua. An Improvement Proposal for the Greedy Algorithm in Gaussian Mixture Estimation. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 8., 2011, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 394-405. ISSN 2763-9061.
