An Approach to HLA Allele Imputation in Bone Marrow Donor Registries
Abstract
The key information in bone marrow donor registries is the alleles of the HLA genes. Because of the cost and type of the tests needed to obtain this information, many of these alleles are missing from the databases. The aim of this work is therefore to assess, for the first time, whether the alleles of the genes not reported in these databases can be imputed. To fill these gaps, algorithms based on Long Short-Term Memory (LSTM) recurrent neural networks were investigated. An accuracy of 76% shows that imputing the missing alleles is feasible, despite the strong class imbalance and the fact that this is one of the most polymorphic regions of human DNA (i.e., there are many distinct alleles).
References
Al-lQubaydhi, N., Alenezi, A., Alanazi, T., Senyor, A., Alanezi, N., Alotaibi, B., Alotaibi, M., Razaque, A., and Hariri, S. (2024). Deep learning for unmanned aerial vehicles detection: A review. Computer Science Review, 51:100614.
Alexander Dilthey, Stephen Leslie, L. M.-J. S.-C. C. M. R. N. G. M. (2013). Multi-population classical HLA type imputation. PLoS Comput. Biol., 9(2):e1002877.
Alexander T Dilthey, Loukas Moutsianas, S. L. G. M. (2011). HLA*IMP: an integrated framework for imputing classical HLA alleles from SNP genotypes. Bioinformatics, 27(7):968–972.
Geffard, E. et al. (2019). Easy-HLA: a validated web application suite to reveal the full details of HLA typing. Bioinformatics, 36(7):2157–2164.
Geffard, E., Limou, S., Walencik, A., Daya, M., Watson, H., Torgerson, D., Barnes, K. C., CAAPA, Cesbron Gautier, A., Gourraud, P.-A., et al. (2020). Easy-HLA: a validated web application suite to reveal the full details of HLA typing. Bioinformatics, 36(7):2157–2164.
Hancock, J.T., K. T. (2020). Survey on categorical data for neural networks. J Big Data, 7:28.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.
Hrinchuk, O., Khrulkov, V., Mirvakhabova, L., Orlova, E., and Oseledets, I. (2020). Tensorized embedding layers. In Cohn, T., He, Y., and Liu, Y., editors, Findings of the Association for Computational Linguistics: EMNLP 2020, Online.
Instituto Nacional de Câncer (INCA) (2023). Quem somos. Accessed: 3 Jan. 2025.
Jeanmougin, M., Noirel, J., Coulonges, C., and Zagury, J.-F. (2017). HLA-check: evaluating HLA data from SNP information. BMC Bioinformatics, 18:1–8.
Junjie Chen, X. S. (2019). Sparse convolutional denoising autoencoders for genotype imputation. Genes, 10(9):652.
Kishore, A. and Petrek, M. (2018). Next-generation sequencing based HLA typing: deciphering immunogenetic aspects of sarcoidosis. Frontiers in Genetics, 9:503.
Lhotte, R., Letort, V., Usureau, C., Jorge-Cordeiro, D., Consortium, P. A., Siemowski, J., Gabet, L., Cournede, P.-H., Taupin, J.-L., Guillaume, N., et al. (2024). Improving HLA typing imputation accuracy and eplet identification with local next-generation sequencing training data. HLA, 103(1):e15222.
Maiers, M., Halagan, M., Gragert, L., Bashyal, P., Brelsford, J., Schneider, J., Lutsker, P., and Louzoun, Y. (2019). GRIMM: Graph imputation and matching for HLA genotypes. Bioinformatics, 35(18):3520–3523.
Shaz, B. H., Hillyer, C. D., and Gil, M. R. (2013). Blood Banking and Transfusion Medicine - History, Industry, and Discipline.
Song, M., Greenbaum, J., Luttrell, J., Zhou, W., Wu, C., Luo, Z., et al. (2022). An autoencoder-based deep learning method for genotype imputation. Frontiers in Artificial Intelligence, 5.
Stephen Leslie, Peter Donnelly, G. M. (2008). A statistical method for predicting classical HLA alleles from SNP data. American Journal of Human Genetics, 82(1):48–56.
Tiercy, J.-M. (2016). How to select the best available related or unrelated donor of hematopoietic stem cells? Haematologica, 101(6):680–687.
Torres, M. A. and Moraes, M. E. H. (2011). Nomenclatura dos fatores do sistema HLA. einstein (São Paulo), 9:249–251.
Xiaoming Jia, Buhm Han, S. O.-G. W.-M. C. P. J. C. S. S. R. S. R. P. I. W. d. B. (2013). Imputing amino acid polymorphisms in human leukocyte antigens. PLoS One.
Yu, Y., Si, X., Hu, C., and Zhang, J. (2019). A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation, 31(7):1235–1270.
Published
29/09/2025
How to Cite
EDUARDO, Felipe S. C.; AZEVEDO, Nathalia de; PÔRTO, Luís Cristóvão M. S.; FIGUEIREDO, Karla; SENA, Alexandre C. An Approach to HLA Allele Imputation in Bone Marrow Donor Registries. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 831-842. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.14236.
