Detecting Inconsistencies in Public Bids: An Automated and Data-based Approach
Abstract
One application of government data is detecting irregularities that may indicate fraud in the public sector. This paper presents an approach that analyzes public bidding data available on the Web to detect inconsistencies among bidders. Specifically, we propose a hierarchical decision approach in which each bidder is classified as Valid, Doubtful, or Invalid based on the compatibility between the bid items and the divisions of the CNAE (National Classification of Economic Activities) codes. The results show that combining commonly available bidder data with information extracted from bid item descriptions can help detect fraud. Furthermore, the proposed approach can reduce the number of bids a specialist must analyze, making inconsistencies easier to identify.
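The three-way classification can be illustrated with a minimal sketch. It is not the paper's implementation: the decision rule (primary activity matches gives Valid, only a secondary activity matches gives Doubtful, no match gives Invalid), the function names `cnae_division` and `classify_bidder`, and the assumption that each bid item has already been mapped to a set of compatible CNAE divisions are all hypothetical. The only fact relied on is that the division is the first two digits of a CNAE code.

```python
import re


def cnae_division(code: str) -> str:
    """Return the two-digit division of a CNAE code such as '4751-2/01'."""
    digits = re.sub(r"\D", "", code)  # keep digits only: '4751201'
    if len(digits) < 2:
        raise ValueError(f"not a CNAE code: {code!r}")
    return digits[:2]


def classify_bidder(item_divisions, primary_cnae, secondary_cnaes=()):
    """Classify a bidder as 'Valid', 'Doubtful', or 'Invalid'.

    item_divisions: CNAE divisions judged compatible with the bid items
                    (assumed to come from an upstream step that analyzes
                    the item descriptions).
    primary_cnae / secondary_cnaes: the bidder's registered CNAE codes.
    The hierarchy below is an illustrative assumption, not the paper's rule.
    """
    items = set(item_divisions)
    primary = cnae_division(primary_cnae)
    secondary = {cnae_division(code) for code in secondary_cnaes}

    if primary in items:        # main activity is compatible with the items
        return "Valid"
    if secondary & items:       # only a secondary activity is compatible
        return "Doubtful"
    return "Invalid"            # no registered activity is compatible


if __name__ == "__main__":
    # A bidder whose main activity is in division 62 but who also holds a
    # secondary code in division 47, bidding on items mapped to division 47.
    print(classify_bidder({"47"}, "6201-5/01", ["4751-2/01"]))
```

Under this rule, only the Doubtful and Invalid bidders would need to be passed to a specialist, which is one way the reduction in manual review described above could come about.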