Public Administration Suppliers Classification Model Based on Supervised Machine Learning

  • Joselito Mendes de Sousa Júnior Universidade Federal do Piauí
  • Vinícius Ponte Machado Universidade Federal do Piauí
  • Rodrigo de Melo Souza Veras Universidade Federal do Piauí
  • Roney Lira de Sales Santos Universidade de São Paulo
  • Bruno Vicente Alves de Lima Instituto Federal do Maranhão
  • Aline Montenegro Leal Silva Universidade Federal do Piauí
  • Francisco Alysson da Silva Sousa Universidade Federal do Piauí
  • Francisco das Chagas Imperes Filho Universidade Federal do Piauí


Context: Public contracts are agreements between the Public Administration and individuals to achieve public interest objectives. Problems such as contractual breaches may arise in this relationship. In the State of Piauí, the Audit Court, the environment in which this research was developed, is responsible for analyzing and judging the accountability of the Legislative, Executive and Judiciary Powers. Problem: For government control bodies, the challenge is to identify fraud and corruption efficiently. Auditing every process is unfeasible, because the number of records each auditor would have to analyze is too large. Solution: Optimize the choice of processes to be audited, given that a full census is infeasible. To that end, this work uses Machine Learning (ML) techniques to assist in selecting which processes will be audited. IS theory: Machine learning studies computational methods that allow computer programs to autonomously improve at a given task through experience. Method: The base provided by the Audit Court, which brings together other datasets about suppliers, was prepared by balancing and normalization. Experiments were then carried out, and the J48 algorithm, which classifies through a decision tree structure, was identified as the most appropriate. Summary of Results: The constructed model achieved a correct classification rate above 82% when classifying suppliers as high or low risk. Contributions and Impact in the IS area: The resulting classification model is expected to support an automatic supplier evaluation and classification system.
Keywords: Classification, Machine Learning, Supervised Learning, Suppliers Evaluation System
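
The pipeline outlined in the Method (balancing, normalization, then decision tree classification) can be illustrated with a minimal sketch. This is not the authors' implementation. J48 is Weka's implementation of C4.5; the tree below is a simplified entropy-based stand-in that uses plain information gain and no pruning. Random oversampling and min-max scaling are assumed here, since the abstract does not say which balancing and normalization techniques were used. The feature names (number of contracts, number of sanctions) and all data values are hypothetical.

```python
import math
import random
from collections import Counter

def min_max_normalize(rows):
    """Scale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def oversample(rows, labels, rng):
    """Balance classes by randomly duplicating minority-class rows (assumed technique)."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain, feature, threshold) for the binary split with the highest information gain."""
    base = entropy(labels)
    best = (0.0, None, None)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            if base - weighted > best[0]:
                best = (base - weighted, f, t)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree of (feature, threshold, left, right) tuples; leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    _, f, t = best_split(rows, labels)
    if f is None:  # no split improves purity
        return majority
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree until a leaf (a class label string) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical supplier records: [number_of_contracts, number_of_sanctions].
rows = [[12, 0], [30, 1], [8, 0], [25, 0], [40, 0], [15, 0], [5, 3], [9, 4]]
labels = ["low"] * 6 + ["high"] * 2  # made-up risk labels, imbalanced on purpose

norm_rows = min_max_normalize(rows)
bal_rows, bal_labels = oversample(norm_rows, labels, random.Random(0))
tree = build_tree(bal_rows, bal_labels)
accuracy = sum(predict(tree, r) == y for r, y in zip(bal_rows, bal_labels)) / len(bal_rows)
print("class counts after balancing:", Counter(bal_labels))
print("training accuracy on toy data:", accuracy)
```

The accuracy printed here is measured on the same toy rows used for training, so it says nothing about the rate above 82% reported in the abstract. A real evaluation would hold out data or use cross-validation.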


How to Cite

DE SOUSA JÚNIOR, Joselito Mendes; MACHADO, Vinícius Ponte; SOUZA VERAS, Rodrigo de Melo; DE SALES SANTOS, Roney Lira; ALVES DE LIMA, Bruno Vicente; LEAL SILVA, Aline Montenegro; DA SILVA SOUSA, Francisco Alysson; IMPERES FILHO, Francisco das Chagas. Public Administration Suppliers Classification Model Based on Supervised Machine Learning. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 18., 2022, Curitiba. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.
