Tractable Classification with Non-Ignorable Missing Data Using Generative Random Forests

Julissa Villanueva; Denis Mauá

doi:10.5753/kdmile.2022.227969

Julissa Villanueva, Universidade de São Paulo
Denis Mauá, Universidade de São Paulo

Abstract

Missing data is abundant in predictive tasks. Typical approaches assume that the missingness process is ignorable or non-informative and handle missing data either by marginalization or heuristically. Yet data is often missing in a non-ignorable way, which introduces bias in prediction. In this paper, we develop a new method to perform tractable predictive inference under non-ignorable missing data using probabilistic circuits derived from Decision Tree Classifiers and a partially specified response model of missingness. We show empirically that our method delivers less biased (probabilistic) classifications than approaches that assume data is missing at random, and is more determinate than similar existing overcautious approaches.

Keywords: generative random forests, probabilistic circuits, non-ignorable missing data