Streaming, Distributed, and Asynchronous Amortized Inference
Abstract
We address the problem of amortized inference over a compositional finite space in a non-localized environment, i.e., when data are observed in a distributed, streaming, or mixed (asynchronous) fashion. This setting encompasses applications in causal discovery, phylogenetic inference, natural language processing, and other domains. In particular, we focus on Generative Flow Networks (GFlowNets), an emerging family of deep generative models that cast amortized inference as finding a balanced flow assignment in a flow network. To accomplish this, a GFlowNet parameterizes the flow function as a neural network and optimizes its parameters via stochastic gradient descent. Drawing on this, we make both practical and theoretical contributions. On the practical side, we introduce three algorithms: Streaming Bayes (SB) GFlowNets, Embarrassingly Parallel (EP) GFlowNets, and Subgraph Asynchronous Learning (SAL). We also propose efficient gradient estimators that significantly accelerate GFlowNet training compared with traditional approaches, and we develop the first computationally tractable and sound metric for assessing the correctness of a trained GFlowNet. From a theoretical perspective, we delineate the limitations of GFlowNets and present the first non-vacuous generalization guarantees for their learning. Overall, our work paves the way for a better understanding, usability, and fair assessment of amortized inference algorithms. This extended abstract provides an overview of our research, which was published in the proceedings of ICML [da Silva et al. 2024c], NeurIPS [da Silva et al. 2024a, da Silva et al. 2024b], and ICLR [da Silva et al. 2025b, da Silva et al. 2025a].
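To make the notion of a balanced flow assignment concrete, the trajectory balance objective of Malkin et al. [2022] is one widely used training loss; the display below is a minimal sketch in standard GFlowNet notation, with forward policy $P_F$, backward policy $P_B$, learnable partition-function estimate $Z_\theta$, and reward $R$. For a complete trajectory $\tau = (s_0 \to s_1 \to \dots \to s_n = x)$ terminating at an object $x$,

$$
\mathcal{L}_{\mathrm{TB}}(\tau; \theta) = \left( \log \frac{Z_\theta \prod_{t=1}^{n} P_F(s_t \mid s_{t-1}; \theta)}{R(x) \prod_{t=1}^{n} P_B(s_{t-1} \mid s_t; \theta)} \right)^{2}.
$$

When this loss vanishes on all complete trajectories, the learned forward policy samples terminal states with probability proportional to $R(x)$ [Malkin et al. 2022].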
References
Ahmadian, A., Cremer, C., Gallé, M., Fadaee, M., Kreutzer, J., Pietquin, O., Üstün, A., and Hooker, S. (2024). Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs.
Association for Computing Machinery (2025). 2024 Turing Award. Accessed: 2025-03-15.
Atanackovic, L. and Bengio, E. (2024). Investigating generalization behaviours of generative flow networks.
Bengio, E., Jain, M., Korablyov, M., Precup, D., and Bengio, Y. (2021). Flow network based generative models for non-iterative diverse candidate generation. In NeurIPS.
Bengio, Y. (2022). System 2 Deep Learning: Higher-Level Cognition, Agency, Out-of-Distribution Generalization and Causality. In IJCAI invited talk.
Bengio, Y., Lahlou, S., Deleu, T., Hu, E. J., Tiwari, M., and Bengio, E. (2023). GFlowNet Foundations. Journal of Machine Learning Research (JMLR).
Bengio, Y. and Malkin, N. (2024). Machine learning and information theory concepts towards an AI Mathematician.
Blei, D. M., Kucukelbir, A., and McAuliffe, J. D. (2017). Variational Inference: A Review for Statisticians. Journal of the American Statistical Association.
Boussif, O., Ezzine, L. N., Viviano, J. D., Koziarski, M., Jain, M., Malkin, N., Bengio, E., Assouel, R., and Bengio, Y. (2024). Action abstractions for amortized sampling.
Buesing, L., Heess, N., and Weber, T. (2020). Approximate inference in discrete distributions with Monte Carlo tree search and value functions. In AISTATS.
Cappello, L., Kim, J., Liu, S., and Palacios, J. A. (2021). Statistical Challenges in Tracking the Evolution of SARS-CoV-2.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., and Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software.
da Silva, T., Alves, R., Silva, E., Souza, A., Garg, V., Kaski, S., and Mesquita, D. (2025a). When do GFlowNets learn the right distribution? In ICLR.
da Silva, T., de Souza, D., and Mesquita, D. (2024a). Streaming Bayes GFlowNets. In NeurIPS.
da Silva, T., Silva, E., and Mesquita, D. (2024b). On Divergence Measures for Training GFlowNets. In NeurIPS.
da Silva, T., Souza, A., Carvalho, L., Kaski, S., and Mesquita, D. (2024c). Embarrassingly Parallel GFlowNets. In ICML.
da Silva, T., Souza, A., Rivasplata, O., Garg, V., Kaski, S., and Mesquita, D. (2025b). Generalization and Distributed Learning of GFlowNets. In ICLR.
Deleu, T. and Bengio, Y. (2023). Generative Flow Networks: a Markov Chain Perspective.
Deleu, T., Góis, A., Emezue, C. C., Rankawat, M., Lacoste-Julien, S., Bauer, S., and Bengio, Y. (2022). Bayesian structure learning with generative flow networks. In UAI.
Du, Y. and Kaelbling, L. (2024). Compositional generative modeling: A single model is not all you need.
Garipov, T., Peuter, S. D., Yang, G., Garg, V., Kaski, S., and Jaakkola, T. (2023). Compositional sculpting of iterative generative processes. In NeurIPS.
Geyer, C. J. (1991). Markov Chain Monte Carlo Maximum Likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, pages 156–163.
Grathwohl, W., Swersky, K., Hashemi, M., Duvenaud, D., and Maddison, C. J. (2021). Oops I Took A Gradient: Scalable Sampling for Discrete Distributions.
Hooker, S. (2020). The Hardware Lottery.
Hu, E. J., Jain, M., Elmoznino, E., Kaddar, Y., Lajoie, G., Bengio, Y., and Malkin, N. (2023). Amortizing intractable inference in large language models.
Kim, M., Choi, S., Yun, T., Bengio, E., Feng, L., Rector-Brooks, J., Ahn, S., Park, J., Malkin, N., and Bengio, Y. (2024a). Adaptive teachers for amortized samplers.
Kim, M., Yun, T., Bengio, E., Zhang, D., Bengio, Y., Ahn, S., and Park, J. (2024b). Local search GFlowNets. In ICLR.
Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Kullback, S. and Leibler, R. A. (1951). On Information and Sufficiency. The Annals of Mathematical Statistics.
Lahlou, S., Deleu, T., Lemos, P., Zhang, D., Volokhova, A., Hernández-García, A., Ezzine, L. N., Bengio, Y., and Malkin, N. (2023). A theory of continuous generative flow networks. In ICML.
Lau, E., Lu, S. Z., Pan, L., Precup, D., and Bengio, E. (2024). QGFN: Controllable Greediness with Action Values.
Lindley, D. V. (1972). Bayesian statistics: A review. SIAM.
Liu, J. S. (2001). Monte Carlo strategies in scientific computing, volume 10. Springer.
Lázaro-Gredilla, M., Ku, L. Y., Murphy, K. P., and George, D. (2025). What type of inference is planning?
Malkin, N., Jain, M., Bengio, E., Sun, C., and Bengio, Y. (2022). Trajectory balance: Improved credit assignment in GFlowNets. In NeurIPS.
Malkin, N., Lahlou, S., Deleu, T., Ji, X., Hu, E., Everett, K., Zhang, D., and Bengio, Y. (2023). GFlowNets and variational inference. In ICLR.
McAllester, D. A. (1999). PAC-Bayesian model averaging. In COLT, pages 164–170.
Neal, R. M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo.
Owen, A. B. (2013). Monte Carlo theory, methods and examples.
Pandey, M., Subbaraj, G., and Bengio, E. (2024). GFlowNet Pretraining with Inexpensive Rewards. arXiv preprint arXiv:2409.09702.
Parr, T., Pezzulo, G., and Friston, K. J. (2022). Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. The MIT Press, Cambridge, MA.
Rényi, A. (1961). On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press.
Rhodes, B. and Gutmann, M. U. (2019). Variational noise-contrastive estimation. In AISTATS.
Richter, L., Boustati, A., Nüsken, N., Ruiz, F. J. R., and Akyildiz, Ö. D. (2020). VarGrad: A low-variance gradient estimator for variational inference. In NeurIPS.
Robert, C. P. (2007). The Bayesian choice: from decision-theoretic foundations to computational implementation, volume 2. Springer.
Roy, V. (2020). Convergence diagnostics for Markov chain Monte Carlo. Annual Review of Statistics and Its Application, 7(1):387–412.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
Shen, M. W., Bengio, E., Hajiramezanali, E., Loukas, A., Cho, K., and Biancalani, T. (2023). Towards Understanding and Improving GFlowNet Training. In ICML.
Sutton, R. (2019). The bitter lesson. Incomplete Ideas (blog), 13(1):38.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 2nd edition.
Zhang, D., Dai, H., Malkin, N., Courville, A., Bengio, Y., and Pan, L. (2023a). Let the flows tell: Solving graph combinatorial optimization problems with GFlowNets. In NeurIPS.
Zhang, D. W., Rainone, C., Peschl, M., and Bondesan, R. (2023b). Robust scheduling with GFlowNets. In ICLR.
Zhou, M. Y., Yan, Z., Layne, E., Malkin, N., Zhang, D., Jain, M., Blanchette, M., and Bengio, Y. (2024). PhyloGFN: Phylogenetic inference with generative flow networks. In ICLR.
Published
2025-07-20
How to Cite
HENRIQUE, Tiago da Silva; MESQUITA, Diego. Streaming, Distributed, and Asynchronous Amortized Inference. In: THESIS AND DISSERTATION CONTEST (CTD), 38., 2025, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 35-44. ISSN 2763-8820. DOI: https://doi.org/10.5753/ctd.2025.8117.
