Tractable Mode-Finding in Sum-Product Networks with Gaussian Leaves
Resumo
In this work, we leverage the relation between Sum-Product Networks (SPNs) and Gaussian mixtures to propose an algorithm that adapts the Expectation-Maximization method to efficiently find the modes of SPNs with Gaussian leaves. We discuss how the algorithm can be used to perform Maximum-A-Posteriori inference in SPNs learned from continuous data with theoretical advantages over the existing methods in the literature, and how it can be used to shrink the size of learned models. As an additional example of the use of the algorithm, we perform an SPN-based hierarchical clustering of digit images. Thus, our proposed algorithm can be used for model analysis, model compression, and exploratory data analysis.
Referências
Amer, M. R. and Todorovic, S. (2016). Sum Product Networks for Activity Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(4):800-813.
Carreira-Perpiñán, M. Á. (2000). Mode-finding for mixtures of gaussian distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1318-1323.
Carreira-Perpiñán, M. Á. and Williams, C. K. (2003a). An isotropic Gaussian mixture can have more modes than components. Institute for Adaptive and Neural Computation, 4(2).
Carreira-Perpiñán, M. Á. and Williams, C. K. (2003b). On the number of modes of a Gaussian mixture. In Griffin, L. D. and Lillholm, M., editors, Scale Space Methods in Computer Vision, volume 2695, pages 625-640, Berlin, Heidelberg. Springer Berlin Heidelberg.
Cheng, W. C., Kok, S., Pham, H. V., Chieu, H. L., and Chai, K. M. A. (2014). Language modeling with sum-product networks. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pages 2098-2102.
Conaty, D., Mauá, D. D., and de Campos, C. P. (2017). Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks. In Elidan, G. and Kersting, K., editors, Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, pages 322-331. AUAI Press.
Darwiche, A. (2003). A differential approach to inference in Bayesian networks. Journal of the ACM, 50(3):280-305.
Gens, R. and Domingos, P. (2013). Learning the structure of sum-product networks. In Dasgupta, S. and McAllester, D., editors, Proceedings of the 30th International Conference on Machine Learning, pages 873-880, Atlanta, GA, USA. PMLR.
Hsu, W., Kalra, A., and Poupart, P. (2017). Online Structure Learning for Sum-Product Networks with Gaussian Leaves.
Jaini, P., Rashwan, A., Zhao, H., Liu, Y., Banijamali, E., Chen, Z., and Poupart, P. (2016). Online Algorithms for Sum-Product Networks with Continuous Variables. In Antonucci, A., Corani, G., and Campos, C. P., editors, Proceedings of the Eighth International Conference on Probabilistic Graphical Models, pages 228-239, Lugano, Switzerland. PMLR.
Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques.
Lee, S.-W., Heo, M.-O., and Zhang, B.-T. (2013). Online Incremental Structure Learning of Sum-Product Networks. In LNCS, volume 8227, pages 220-227.
Li, J., Ray, S., and Lindsay, B. G. (2007). A Nonparametric Statistical Approach to Clustering via Mode Identification. Journal of Machine Learning Research, 8(59):1687- 1723.
Llerena, J. V. and Mauá, D. D. (2017). On using sum-product networks for multi-label classification. In Proceedings of the 6th Brazilian Conference on Intelligent Systems, BRACIS, pages 25-30.
Mauá, D. D., Ribeiro, H. R., Katague, G. P., and Antonucci, A. (2020). Two Reformulation Approaches to Maximum-A-Posteriori Inference in Sum-Product Networks. In Jaeger, M. and Nielsen, T. D., editors, Proceedings of the Tenth International Conference on Probabilistic Graphical Models, volume 138 of Proceedings of Machine Learning Research, pages 293-304. PMLR.
Mei, J., Jiang, Y., and Tu, K. (2018). Maximum A Posteriori Inference in Sum-Product Networks. In AAAI Conference on Artificial Intelligence.
Molina, A., Vergari, A., Stelzner, K., Peharz, R., Subramani, P., Mauro, N. D., Poupart, P., and Kersting, K. (2019). Spflow: An easy and extensible library for deep probabilistic learning using sum-product networks.
Park, J. D. (2002). MAP Complexity Results and Approximation Methods. In Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, pages 388-396, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
Peharz, R. (2015). Foundations of Sum-Product Networks for Probabilistic Modeling. PhD thesis, Graz University of Technology.
Peharz, R., Gens, R., Pernkopf, F., and Domingos, P. (2016). On the Latent Variable Interpretation in Sum-Product Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(10):2030-2044.
Peharz, R., Kapeller, G., Mowlaee, P., and Pernkopf, F. (2014). Modeling speech with sum-product networks: Application to bandwidth extension. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, pages 3699-3703. IEEE.
Poon, H. and Domingos, P. (2011). Sum-product networks: A new deep architecture. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pages 689-690. IEEE.
Rashwan, A., Zhao, H., and Poupart, P. (2016). Online and distributed Bayesian moment matching for parameter learning in sum-product networks. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016, pages 1469-1477.
Rooshenas, A. and Lowd, D. (2014). Learning sum-product networks with direct and indirect variable interactions. In 31st International Conference on Machine Learning, ICML 2014, pages I-710-I-718, Beijing, China. https://JMLR.org.
Russell, S. and Norvig, P. (2010). Artificial Intelligence A Modern Approach. 3 edition.
Zhao, H., Poupart, P., and Gordon, G. (2016). A unified approach for learning the parameters of Sum-Product Networks. In Advances in Neural Information Processing Systems, pages 433-441. Curran Associates, Inc.