Brazilian Presidential Elections: Analysing Voting Patterns in Time and Space Using a Simple Data Science Pipeline

  • Lucas Henrique Mantovani Jacintho, Universidade de São Paulo
  • Tiago Pinho da Silva, Universidade de São Paulo
  • Antonio Rafael Sabino Parmezan, Universidade de São Paulo
  • Gustavo Enrique de Almeida Prado Alves Batista, University of New South Wales


Since 1989, the year of the first democratic presidential election after a long period of dictatorship, Brazil has held eight presidential elections. This period was marked by short- and long-term shifts of power and by two impeachment processes. Such instability makes Brazil a relevant case for electoral studies, in particular for the study of voting behavior. Understanding patterns in how the population votes can give insight into the factors and influences that affect the quality of democratic political decisions. In light of this, our paper analyzes voting behavior in Brazilian presidential elections across the years and across the Brazilian territory. Following a data science pipeline, we divided the analysis into five steps: (i) data selection; (ii) data preprocessing; (iii) identification of spatial patterns, in which we seek to understand the role of space in the election results using spatial autocorrelation techniques; (iv) identification of temporal patterns, in which we investigate similar vote trends over the years using a hierarchical clustering method; and (v) evaluation of the results. The data represent election results at the municipal level, from 1994 to 2018, for the two most relevant parties of this period: the Brazilian Social Democracy Party (PSDB) and the Workers’ Party (PT). The results show spatial dependence in every electoral year investigated. Moreover, despite the changes in the political-economic context over the years, neighboring cities seem to present similar voting behavior trends.
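
To make step (iii) more concrete, the sketch below computes global Moran's I, a common spatial autocorrelation statistic, on toy data. It is only an illustration under stated assumptions: the abstract does not say which statistic or software the authors used, and the four "municipalities", their vote shares, and the chain adjacency are all hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix.

    values  -- one observation per spatial unit (e.g. a party's vote share)
    weights -- weights[i][j] > 0 when units i and j are neighbors, else 0
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all spatial weights (S0 in the usual notation).
    s0 = sum(sum(row) for row in weights)

    # Cross-products of deviations, counted only between neighbors.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_term = sum(d * d for d in dev)
    return (n / s0) * (cross / variance_term)


# Hypothetical chain of four municipalities: 0-1, 1-2, 2-3 are neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Made-up vote shares: similar values sit next to each other ...
clustered = [0.70, 0.65, 0.30, 0.25]
# ... versus values that alternate between neighbors.
alternating = [0.70, 0.30, 0.70, 0.30]

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
print(i_clustered, i_alternating)
```

In this toy setup the clustered shares give a positive Moran's I and the alternating shares give a negative one, which is the sense in which "spatial dependence" is read: positive values mean neighboring units tend to vote alike. A real analysis would also need a significance test, for example by permuting the values across units.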

Keywords: data mining, machine learning, preferential voting, spatio-temporal patterns, voting behavior


JACINTHO, Lucas Henrique Mantovani; DA SILVA, Tiago Pinho; PARMEZAN, Antonio Rafael Sabino; BATISTA, Gustavo Enrique de Almeida Prado Alves. Brazilian Presidential Elections: Analysing Voting Patterns in Time and Space Using a Simple Data Science Pipeline. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 8., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 217-224. ISSN 2763-8944. DOI: