Gene Networks Inference by Reinforcement Learning

Rodrigo Cesar Bonini (UFABC); David Correa Martins-Jr (UFABC)

Abstract

Inferring Gene Regulatory Networks from gene expression data is an important problem in systems biology. It involves estimating the indirect gene-gene dependencies and the regulatory functions governing these interactions, so as to provide a model that explains the gene expression dataset. The main goal is to understand the global molecular mechanisms underlying diseases in order to develop medical treatments and drugs. However, the problem remains open, since it is difficult to obtain a satisfactory estimate of the dependencies from a very limited number of samples subject to experimental noise. Many gene network inference methods exist in the literature; some of them use heuristics or model-based algorithms to find networks that explain the data by encoding whole networks as candidate solutions. In general, however, these methods are slow, do not scale to real-sized networks (thousands of genes), or require many parameters, expert knowledge, or a large number of samples to be feasible. Reinforcement Learning is an adaptable, goal-oriented approach that does not require large labeled datasets or many parameters, can give good-quality solutions in a feasible execution time, and can work automatically for long periods without the need for a specialist. We therefore propose a way to adapt Reinforcement Learning to the Gene Regulatory Network inference domain in order to obtain networks with quality comparable to that achieved by exhaustive search, but in a much shorter execution time. Our experimental evaluation shows that the proposal is promising: it learns and successfully finds good solutions across different tasks automatically in a reasonable time. However, scalability to networks with thousands of genes remains a limitation of our RL approach due to excessive memory consumption, although we foresee possible improvements that could address this limitation in future versions of the proposed method.

Keywords: reinforcement learning, gene regulatory network inference, Boolean networks
