Comparison of Machine Learning Classifiers for Species Distribution Modeling: a case study in the Amazon Basin

  • Renato O. Miyaji (USP)
  • Felipe V. de Almeida (USP)
  • Pedro L. P. Corrêa (USP)
  • Luciana V. Rizzo (UNIFESP)
  • Alan Calheiros (INPE)
  • Márcio Teixeira (USP)

Abstract


In ecology, Species Distribution Modeling is commonly used to analyze how atmospheric and meteorological variables influence the occurrence of species. In recent decades, Machine Learning classifiers have been applied successfully to this task. This article compares the feasibility of seven Machine Learning techniques for Species Distribution Modeling, applied to a case study of bird occurrence in the central region of the Amazon Basin, near Manaus (AM), using data collected by the GoAmazon 2014/15 project. Gradient Boosting achieved the best ROC-AUC, at 94%. The Maximum Entropy Model achieved the best Recall, at 85%. Random Forests performed well on both metrics.
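
The two metrics reported above can be illustrated with a minimal sketch. This is not the authors' code: the labels, scores, and the 0.5 decision threshold below are hypothetical toy values chosen only to show how Recall and ROC-AUC are computed for a presence/absence classifier.

```python
def recall(y_true, y_pred):
    """Fraction of actual presences (label 1) that were predicted as presences."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Probability that a random presence outscores a random absence.

    Ties between a presence score and an absence score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = species present, 0 = species absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical classifier scores (predicted probability of presence).
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"Recall:  {recall(y_true, y_pred):.3f}")
print(f"ROC-AUC: {roc_auc(y_true, scores):.3f}")
```

Recall depends on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is one reason different classifiers can lead on each metric.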

Published
2023-08-06
MIYAJI, Renato O.; ALMEIDA, Felipe V. de; CORRÊA, Pedro L. P.; RIZZO, Luciana V.; CALHEIROS, Alan; TEIXEIRA, Márcio. Comparação de Classificadores de Aprendizado de Máquina para Modelagem de Distribuição de Espécies: um estudo de caso na Bacia Amazônica. In: WORKSHOP ON COMPUTING APPLIED TO THE MANAGEMENT OF THE ENVIRONMENT AND NATURAL RESOURCES (WCAMA), 14., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 41-50. ISSN 2595-6124. DOI: https://doi.org/10.5753/wcama.2023.229365.