sits.rep: Reproducible Research in Land Use and Land Cover Classification

Rafael Mariano; Gilberto Queiroz; Pedro Andrade; Rafael Santos

doi:10.5753/wcama.2020.11019

Rafael Mariano INPE
Gilberto Queiroz INPE
Pedro Andrade INPE
Rafael Santos INPE

Abstract

Research reproducibility has been widely debated in the scientific community. This concern has led the leading journals to draw up best-practice documents that help researchers organize the data, code, and artifacts of their publications so that the work can be reproduced. As a result, a number of computational tools have been developed to address scientific reproducibility. This work presents a tool for making reproducible the scientific experiments that produce land use and land cover maps using machine learning techniques with the R package sits (Satellite Image Time Series). The tool, named sits.rep, assists researchers at every step of their experiments. It increases the productivity of teams that develop land use and land cover classification code, since researchers can devote themselves exclusively to producing better classifications.

Keywords: Research reproducibility, land use and land cover maps, R language
