A multi-level approach using deep learning and transfer learning for classifying non-coding RNAs

Abstract

In this paper, we present a new approach for classifying non-coding RNAs (ncRNAs) that combines deep learning (DL) with transfer learning (TL) in a multi-level scheme. In the pre-training stage, DL was applied to data from seven ncRNA classes: CD-box, HACA-box, scaRNA, miRNA, tRNA, 5S rRNA, and 5.8S rRNA. In the case study, TL was used to classify riboswitches. The training and test data were selected by sampling sequences across species trees so as to maximize taxonomic diversity. Compared with other methods from the literature, the approach achieved better results on small datasets. It can also be applied to other ncRNA classes.

Keywords: machine learning, ncRNAs, CD-box, HACA-box, scaRNA, miRNA, tRNA, 5S rRNA, 5.8S rRNA, riboswitch classification
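
The abstract states that sequences were picked across species trees to maximize taxonomic diversity, but it does not describe the selection procedure. As a rough illustration only, the sketch below assumes a simple farthest-first greedy rule over taxonomic lineages; the function names, the lineage representation, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A lineage is an ordered tuple of taxonomic ranks, broadest first (assumed format).
Lineage = Tuple[str, ...]


def shared_depth(a: Lineage, b: Lineage) -> int:
    """Count the leading taxonomic ranks that two lineages have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def greedy_diverse_subset(lineages: Dict[str, Lineage], k: int) -> List[str]:
    """Pick up to k sequence IDs with a farthest-first greedy rule.

    At each step, take the candidate whose closest already-chosen relative
    shares the fewest taxonomic ranks with it. This is an assumed heuristic,
    not the selection method used in the paper.
    """
    remaining = sorted(lineages)  # sorted for deterministic tie-breaking
    if not remaining or k <= 0:
        return []
    chosen = [remaining.pop(0)]
    while remaining and len(chosen) < k:
        def closeness(seq_id: str) -> int:
            # Similarity to the nearest sequence already selected.
            return max(shared_depth(lineages[seq_id], lineages[c]) for c in chosen)

        best = min(remaining, key=closeness)
        remaining.remove(best)
        chosen.append(best)
    return chosen


# Hypothetical toy lineages (domain, phylum, genus), for illustration only.
toy = {
    "seq_a": ("Bacteria", "Proteobacteria", "Escherichia"),
    "seq_b": ("Bacteria", "Proteobacteria", "Salmonella"),
    "seq_c": ("Bacteria", "Firmicutes", "Bacillus"),
    "seq_d": ("Archaea", "Euryarchaeota", "Methanococcus"),
}
selected = greedy_diverse_subset(toy, 3)
print(selected)
```

With these toy lineages, the near-duplicate Proteobacteria entry is the one left out, which is the behavior one would want from a diversity-oriented training and test split.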

Published
29/09/2025
COSTA, Mirele C. S. F.; RALHA, Célia G.; BRIGIDO, Marcelo M.; CARVALHO, André C. P. L. F.; STADLER, Peter F.; WALTER, Maria Emília M. T. A multi-level approach using deep learning and transfer learning for classifying non-coding RNAs. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 18., 2025, Fortaleza/CE. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 13-24. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2025.15104.