Performance of genotypic tools and stacking in predicting the tropism of HIV-1 subtype C

  • Rita C. M. Soares UNIRIO
  • Letícia M. Raposo UNIRIO

Abstract


Many tools for classifying HIV-1 tropism were designed using subtype B strains and may not perform satisfactorily for other subtypes. This study evaluated the performance of genotypic algorithms in predicting the tropism of HIV-1 subtype C and applied the stacking technique to seek a better-performing model. Raymond's Rule showed better overall performance, but Geno2Pheno 0.20 had greater sensitivity. The proposed model performed on par with Geno2Pheno 0.10, with sensitivity and specificity above 90%. The stacking technique can be useful for improving tropism prediction without new tests.
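
As a rough illustration of the stacking idea mentioned in the abstract, the sketch below treats the binary calls of three genotypic tools as input features for a meta-learner and then computes sensitivity and specificity. Everything specific here is an assumption made for illustration: the toy prediction matrix and labels are invented, the column meanings are hypothetical, and logistic regression fitted by gradient descent is just one possible meta-learner, not necessarily the one used in the study.

```python
import math

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity), treating 1 (X4) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def fit_meta_learner(X, y, lr=0.5, epochs=20000):
    """Fit a logistic-regression meta-learner by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Classify as X4 (1) when the linear score is positive, else R5 (0)."""
    return [1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0 for xi in X]

# Toy data (invented): each row holds the calls of three hypothetical
# base tools for one V3 sequence, 1 = X4-using, 0 = R5.
X_meta = [
    [1, 1, 1], [1, 0, 1], [0, 1, 1], [1, 1, 0],
    [0, 0, 1], [0, 0, 0], [0, 1, 0], [1, 0, 0],
    [0, 0, 0], [0, 0, 1],
]
# Toy phenotypic labels (invented), 1 = X4-using, 0 = R5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_meta_learner(X_meta, y_true)
y_pred = predict(X_meta, w, b)
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The sketch scores the meta-learner on the same rows it was trained on, which is only acceptable for a toy example; a real evaluation would use held-out sequences or cross-validation.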

References

CABRAL, G. B. Avaliação da resposta à terapia antirretroviral de resgate contendo antagonista do correceptor CCR5 em pessoas vivendo com HIV/AIDS. São Paulo, 2014. Available at: [http://ses.sp.bvs.br/lildbi/docsonline/get.php?id=6117]. Accessed: 13 Dec. 2019.

CASHIN, K.; GRAY, L. R.; HARVEY, K. L.; et al. Reliable Genotypic Tropism Tests for the Major HIV-1 Subtypes. Scientific Reports, v. 5, n. 1, p. 1–8, 2015.

CHARIF, D.; LOBRY, J. R. SeqinR 1.0-2: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. In: BASTOLLA, U.; PORTO, M.; ROMAN, H. E.; et al (Orgs.). Structural approaches to sequence evolution: Molecules, networks, populations. New York: Springer Verlag, 2007, p. 207–232. (Biological and Medical Physics, Biomedical Engineering).

CHAWLA, N. V.; BOWYER, K. W.; HALL, L. O.; et al. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, v. 16, p. 321–357, 2002.

CHIOU, S. H.; FREED, E. O.; PANGANIBAN, A. T.; et al. Studies on the role of the V3 loop in human immunodeficiency virus type 1 envelope glycoprotein function. AIDS research and human retroviruses, v. 8, n. 9, p. 1611–1618, 1992.

GHO - Global Health Observatory | By category | Number of people (all ages) living with HIV - Estimates by country. Available at: [http://apps.who.int/gho/data/view.main.22100?lang=en]. Accessed: 25 Oct. 2019.

GRÄF, T.; PINTO, A. R. The increasing prevalence of HIV-1 subtype C in Southern Brazil and its dispersion through the continent. Virology, v. 435, n. 1, p. 170–178, 2013.

HEIDER, D.; DYBOWSKI, J. N.; WILMS, C.; et al. A simple structure-based model for the prediction of HIV-1 co-receptor tropism. BioData Mining, v. 7, p. 14, 2014.

JENSEN, M. A.; COETZER, M.; VAN’T WOUT, A. B.; et al. A Reliable Phenotype Predictor for Human Immunodeficiency Virus Type 1 Subtype C Based on Envelope V3 Sequences. Journal of Virology, v. 80, n. 10, p. 4698–4704, 2006.

KAWASHIMA, S.; OGATA, H.; KANEHISA, M. AAindex: Amino Acid Index Database. Nucleic Acids Research, v. 27, n. 1, p. 368–369, 1999.

KUHN, M.; WING, J.; WESTON, S.; et al. caret: Classification and Regression Training. [s.l.: s.n.], 2019. Available at: [https://CRAN.R-project.org/package=caret]. Accessed: 23 Jun. 2020.

LENGAUER, T.; SANDER, O.; SIERRA, S.; et al. Bioinformatics prediction of HIV coreceptor usage. Nature Biotechnology, v. 25, n. 12, p. 1407–1410, 2007.

Los Alamos. HIV Databases. Available at: [https://www.hiv.lanl.gov/content/index]. Accessed: 14 Dec. 2019.

MASSO, M.; VAISMAN, I. I. Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage. BMC Bioinformatics, v. 11, p. 494, 2010.

MENEZES, R.; RAPOSO, L. HIV Tropism Ensemble Methods. 2020. Available at: [https://doi.org/10.5281/zenodo.3905343]. Accessed: 23 Jun. 2020.

OZA, N. C.; TUMER, K. Classifier ensembles: Select real-world applications. Information Fusion, v. 9, n. 1, p. 4–20, 2008.

PAGÈS, H.; ABOYOUN, P.; GENTLEMAN, R.; et al. Biostrings: Efficient manipulation of biological strings. [s.l.: s.n.], 2019.

POWERS, D. M. W. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, v. 2, n. 1, p. 37–63, 2011.

R CORE TEAM. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2019. Available at: [https://www.R-project.org/]. Accessed: 23 Jun. 2020.

RAYMOND, S.; DELOBEL, P.; MAVIGNER, M.; et al. Correlation between genotypic predictions based on V3 sequences and phenotypic determination of HIV-1 tropism. AIDS (London, England), v. 22, n. 14, p. F11-16, 2008.

RIEMENSCHNEIDER, M.; CASHIN, K. Y.; BUDEUS, B.; et al. Genotypic Prediction of Co-receptor Tropism of HIV-1 Subtypes A and C. Scientific Reports, v. 6, n. 1, p. 1–9, 2016.

SWENSON, L. C.; DÄUMER, M.; PAREDES, R. Next-generation sequencing to assess HIV tropism. Current opinion in HIV and AIDS, v. 7, n. 5, p. 478–485, 2012.

TORGO, L. Data Mining with R, learning with case studies. [s.l.]: Chapman and Hall/CRC, 2010. Available at: [http://www.dcc.fc.up.pt/ltorgo/DataMiningWithR]. Accessed: 23 Jun. 2020.

WICKHAM, H. stringr: Simple, Consistent Wrappers for Common String Operations. [s.l.: s.n.], 2019. Available at: [https://CRAN.R-project.org/package=stringr]. Accessed: 23 Jun. 2020.

Published: 2020-09-15

SOARES, Rita C. M.; RAPOSO, Letícia M. Desempenho de ferramentas genotípicas e stacking na predição de tropismo do subtipo C do HIV-1. In: UNDERGRADUATE RESEARCH WORKS CONTEST - BRAZILIAN SYMPOSIUM ON COMPUTING APPLIED TO HEALTHCARE (SBCAS), 20., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 99-104. ISSN 2763-8987. DOI: https://doi.org/10.5753/sbcas.2020.11565.