Performance Analysis of Deep Neural Networks in piRNAs Classification

Alisson Hayasi da Costa; Renato Augusto Corrêa dos Santos; Ricardo Cerri

doi:10.5753/eniac.2018.4449

Alisson Hayasi da Costa (UFSCar)
Renato Augusto Corrêa dos Santos (USP)
Ricardo Cerri (UFSCar)

Abstract

Modern machine learning techniques, such as Deep Learning, have been successful in many complex Bioinformatics tasks. The capacity of Deep Neural Networks to handle large volumes of data has made them essential tools in many areas of knowledge. However, developing the best model for a given task is hard work. Deep Neural Networks have a very large number of hyperparameters, which makes them as complex to tune as they are powerful. Therefore, to better understand the behavior of Deep Neural Networks when applied to biological data, we present in this paper a performance analysis of a Deep Feedforward Network in piRNA classification. Different configurations of activation functions, weight initialization, number of layers, and learning rate are evaluated. The effects of the different hyperparameters are discussed, and certain configurations are proposed for similar data domains.
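
As a rough illustration of what varying these four hyperparameter families can look like in practice, the sketch below enumerates a small grid of candidate configurations. Only the four family names come from the abstract; the concrete values, the `SEARCH_SPACE` dictionary, and the `enumerate_configs` helper are hypothetical and are not taken from the paper's experiments.

```python
from itertools import product

# Hypothetical search space: the four hyperparameter families are the ones
# named in the abstract, but every concrete value here is an assumption
# chosen for illustration only.
SEARCH_SPACE = {
    "activation": ["relu", "tanh", "sigmoid"],
    "initializer": ["glorot_uniform", "he_normal"],
    "hidden_layers": [1, 2, 4],
    "learning_rate": [0.1, 0.01, 0.001],
}


def enumerate_configs(space):
    """Return every combination of hyperparameter values as a list of dicts."""
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = enumerate_configs(SEARCH_SPACE)
print(len(configs))  # 3 * 2 * 3 * 3 = 54 candidate configurations
```

Each dictionary in `configs` could then be passed to whatever routine builds and trains one network, so that the effect of a single hyperparameter can be compared while the others are held fixed.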
