Tender Documents Information Extraction
Abstract
This paper presents the development of a system for extracting information from tender documents, with a focus on technology products. The system combines Natural Language Processing and machine learning techniques to extract the relevant information. The proposed solution aims to reduce the time required for tender document analysis and improve its accuracy while handling the complexity and diversity of the data in tender notices. The experimental results show that the system identifies bidding items effectively, indicating its potential for practical application in public procurement processes.
