Creating Data Management Plans in Data Science Projects for Fake News Detection Supported by FAIR principles

  • Jorge Zavaleta Federal University of Rio de Janeiro http://orcid.org/0000-0002-4747-8613
  • Annatércia Pinheiro Federal University of Rio de Janeiro
  • Renato Cerceau National Institute of Cardiology / State University of Rio de Janeiro https://orcid.org/0000-0003-3953-4715
  • Cabral Lima Federal University of Rio de Janeiro
  • Maria Luiza Machado Campos Federal University of Rio de Janeiro
  • Sérgio Manuel Serra da Cruz Federal University of Rio de Janeiro / Federal Rural University of Rio de Janeiro

Abstract


Data Science researchers face an increasingly multifaceted reality concerning data governance. The paradigm is shifting from disconnected data silos to data management plans (DMPs) and standardized online repositories that adhere to the FAIR principles. This manuscript discusses and compares current DMP platforms and describes the creation of a DMP in a Machine Learning project focused on Fake News detection. As a result, we describe a use case in which the DMP is built on the DSWizard platform and offer an executable article that readers can run themselves.
Keywords: Data Management Plans, fake news, FAIR principles, machine learning methods

Published
2021-09-01
ZAVALETA, Jorge; PINHEIRO, Annatércia; CERCEAU, Renato; LIMA, Cabral; CAMPOS, Maria Luiza Machado; DA CRUZ, Sérgio Manuel Serra. Creating Data Management Plans in Data Science Projects for Fake News Detection Supported by FAIR principles. In: REGIONAL SCHOOL ON INFORMATION SYSTEMS OF RIO DE JANEIRO (ERSI-RJ), 7., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 72-79. DOI: https://doi.org/10.5753/ersirj.2021.16981.