Optimization of Halide Image Processing Schedules with Reinforcement Learning
Abstract
Writing efficient image processing code is a demanding task, and much programming effort is spent porting existing code to new generations of hardware. Moreover, the definition of efficient code varies with the desired optimization target, such as runtime, energy consumption, or memory usage. We present a semi-automatic schedule generation system for the Halide DSL that uses a Reinforcement Learning agent to choose a set of scheduling options that optimizes the runtime of the resulting application. We compare our results against state-of-the-art implementations of three Halide pipelines and show that our agent is able to surpass hand-tuned code and Halide's auto-scheduler in most scenarios on CPU and GPU architectures.
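The schedule search described above can be pictured as a trial-and-measure loop. The sketch below is a toy illustration, not the paper's implementation: a bandit-style epsilon-greedy agent picks one scheduling directive per pipeline stage and learns from a reward derived from runtime. The directive names are borrowed from Halide's vocabulary, but the cost model is a hypothetical stand-in for actually compiling and timing a pipeline.

```python
import random

# Candidate scheduling directives, one chosen per pipeline stage.
# Names mirror Halide directives, but selection here is purely illustrative.
DIRECTIVES = ["compute_root", "compute_at", "tile", "vectorize", "parallel"]

def simulated_runtime(schedule):
    # Placeholder for compiling and benchmarking a Halide pipeline;
    # here, an arbitrary fixed cost per directive choice.
    cost = {"compute_root": 5.0, "compute_at": 3.0, "tile": 2.0,
            "vectorize": 1.5, "parallel": 1.0}
    return sum(cost[d] for d in schedule)

def train(stages=3, episodes=500, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # One action-value table per stage (bandit-style, no state transitions).
    q = [{d: 0.0 for d in DIRECTIVES} for _ in range(stages)]
    counts = [{d: 0 for d in DIRECTIVES} for _ in range(stages)]
    for _ in range(episodes):
        # Epsilon-greedy choice of a directive for each stage.
        schedule = [rng.choice(DIRECTIVES) if rng.random() < epsilon
                    else max(q[s], key=q[s].get)
                    for s in range(stages)]
        reward = -simulated_runtime(schedule)  # faster schedule => higher reward
        for s, d in enumerate(schedule):
            counts[s][d] += 1
            # Incremental mean of observed rewards for this (stage, directive).
            q[s][d] += (reward - q[s][d]) / counts[s][d]
    return [max(qs, key=qs.get) for qs in q]

best = train()
print(best)  # best directive found for each stage
```

In a real system the reward would come from measuring the compiled pipeline's runtime on the target hardware, and a richer agent (e.g. a policy-gradient method) would replace the per-stage bandit tables.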