Building a Dataset Related to Production and Marketing of Horticulture Products in Brazil

  • Guilherme Alan Mohr (UFSM)
  • Gustavo Pinto da Silva (UFSM)
  • Janaína Balk Brandão (UFSM)
  • Daniel Lichtnow (UFSM)

Abstract


This paper describes the construction of a dataset that gathers public data on the production and marketing of horticulture and fruticulture products in Brazil, extracted from several sources through web scraping. The initial version of the dataset draws on data from the 2010 Demographic Census, the Automatic Recovery System (SIDRA) of the Brazilian Institute of Geography and Statistics (IBGE), and the National Supply Company (CONAB). Finally, the paper describes the extracted data and presents potential use cases.

Published
2024-04-10
MOHR, Guilherme Alan; SILVA, Gustavo Pinto da; BRANDÃO, Janaína Balk; LICHTNOW, Daniel. Building a Dataset Related to Production and Marketing of Horticulture Products in Brazil. In: REGIONAL DATABASE SCHOOL (ERBD), 19., 2024, Farroupilha/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 162-165. ISSN 2595-413X. DOI: https://doi.org/10.5753/erbd.2024.238839.