Housing Prices Prediction with a Deep Learning and Random Forest Ensemble

Bruno Afonso; Luckeciano Melo; Willian Oliveira; Samuel Sousa; Lilian Berton

doi:10.5753/eniac.2019.9300

Bruno Afonso, Universidade Federal de São Paulo
Luckeciano Melo, Instituto Tecnológico da Aeronáutica
Willian Oliveira, Universidade Federal de São Paulo
Samuel Sousa, Universidade Federal de São Paulo
Lilian Berton, Universidade Federal de São Paulo

Abstract

A housing price prediction model can help a house seller or a real estate agent make better-informed decisions based on house price valuation. Only a few works report the use of machine learning (ML) algorithms to predict property values in Brazil. This study analyzes a dataset of 12,223,582 housing advertisements collected from Brazilian websites from 2015 to 2018. Each instance comprises twenty-four features of five data types: integer, date, string, float, and image. To predict property prices, we build an ensemble of two different ML architectures, based on Random Forest (RF) and Recurrent Neural Networks (RNN). The study demonstrates that enriching the dataset and combining different ML approaches can be a better alternative for predicting housing prices in Brazil.

Keywords: Housing prices prediction, Machine Learning, Ensemble, Random Forest, Deep Learning, Recurrent Neural Networks
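
The abstract does not say how the RF and RNN outputs are combined. As an illustration only, one common way to ensemble two regressors is a weighted average of their predictions, with the weight chosen on held-out data. The sketch below assumes that scheme; the function names and all numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted prices."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def blend(pred_a, pred_b, weight_a):
    """Weighted average of two models' predictions (weight_a in [0, 1])."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def best_blend_weight(y_true, pred_a, pred_b, steps=20):
    """Grid-search the blend weight that minimizes validation MAE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_absolute_error(y_true, blend(pred_a, pred_b, w)),
    )


# Made-up validation prices and predictions (assumption: not real data).
y_val = [300.0, 450.0, 520.0, 610.0]
rf_val = [320.0, 430.0, 540.0, 590.0]   # stand-in for Random Forest output
rnn_val = [280.0, 470.0, 500.0, 630.0]  # stand-in for RNN output

w = best_blend_weight(y_val, rf_val, rnn_val)
ensemble_val = blend(rf_val, rnn_val, w)

print("blend weight for RF:", w)
print("RF MAE:", mean_absolute_error(y_val, rf_val))
print("RNN MAE:", mean_absolute_error(y_val, rnn_val))
print("ensemble MAE:", mean_absolute_error(y_val, ensemble_val))
```

In this toy setup the two stand-in models err in opposite directions, so the blend scores better than either model alone. Real models will rarely complement each other this cleanly, and the weight should be selected on data that neither model was trained on.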
