Housing Prices Prediction with a Deep Learning and Random Forest Ensemble

  • Bruno Afonso Universidade Federal de São Paulo
  • Luckeciano Melo Instituto Tecnológico da Aeronáutica
  • Willian Oliveira Universidade Federal de São Paulo
  • Samuel Sousa Universidade Federal de São Paulo
  • Lilian Berton Universidade Federal de São Paulo


A housing price prediction model can help a house seller or a real estate agent make better-informed decisions based on a valuation of the property. Only a few works report the use of machine learning (ML) algorithms to predict property values in Brazil. This study analyzes a dataset of 12,223,582 housing advertisements collected from Brazilian websites from 2015 to 2018. Each instance comprises twenty-four features of five different data types: integer, date, string, float, and image. To predict property prices, we ensemble two different ML architectures, based on Random Forest (RF) and Recurrent Neural Networks (RNN). This study demonstrates that enriching the dataset and combining different ML approaches can be a better alternative for predicting housing prices in Brazil.
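As a rough illustration of what ensembling two regressors can look like, the sketch below combines per-listing price predictions by weighted averaging. The abstract does not state how the RF and RNN outputs are combined, so the averaging rule, the weights, and the names `ensemble_predict`, `rf_stub`, and `rnn_stub` are all assumptions made for this example, not the authors' method.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Model = Callable[[Features], float]


def ensemble_predict(models: Sequence[Model],
                     weights: Sequence[float],
                     x: Features) -> float:
    """Return the weighted average of each model's price prediction for x."""
    if len(models) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m(x) for m, w in zip(models, weights)) / total


# Hypothetical stand-ins for a trained Random Forest and a trained RNN.
# Each maps a feature vector (here only the floor area) to a price.
def rf_stub(x: Features) -> float:
    return 1000.0 * x[0] + 50000.0


def rnn_stub(x: Features) -> float:
    return 1200.0 * x[0] + 30000.0


# Equal-weight ensemble prediction for a listing with floor area 80.
price = ensemble_predict([rf_stub, rnn_stub], [0.5, 0.5], [80.0])
print(price)
```

In practice the two stubs would be replaced by the fitted models, and the weights could be chosen on a validation set rather than fixed at one half each.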

Keywords: Housing prices prediction, Machine Learning, Ensemble, Random Forest, Deep Learning, Recurrent Neural Networks


AFONSO, Bruno; MELO, Luckeciano; OLIVEIRA, Willian; SOUSA, Samuel; BERTON, Lilian. Housing Prices Prediction with a Deep Learning and Random Forest Ensemble. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 16., 2019, Salvador. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 389-400. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2019.9300.