QuickAutoML: A Tool for Automated Training of Machine Learning Models
Abstract
As machine learning models have grown more popular across different domains and contexts, so has the need for tools and mechanisms that make these models easier to optimize and use in practice. In this work, we present QuickAutoML, a tool that automates the creation, training, hyperparameter tuning, and validation of models. The tool's main goal is to abstract away the complexity of these processes so that users with little technical knowledge can build robust models in a few minutes.
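The workflow named in the abstract (model creation, training, hyperparameter tuning, and validation) can be illustrated with a minimal, dependency-free sketch. This is not QuickAutoML's implementation or API: the synthetic dataset, the choice of a k-nearest-neighbours classifier, the candidate values of k, and every function name below are assumptions made purely for illustration. In practice a library such as scikit-learn would usually supply the models and the search.

```python
import random
from collections import Counter


def make_dataset(n_per_class=40, seed=0):
    """Two Gaussian blobs in 2D, labelled 0 and 1 (synthetic, illustration only)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, n_folds=5):
    """Mean accuracy of k-NN over n_folds cross-validation folds."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th item (offset by fold) is held out for validation.
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        correct = sum(knn_predict(train, p, k) == y for p, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def auto_select(data, candidate_ks=(1, 3, 5, 7, 9)):
    """Evaluate every candidate hyperparameter; return (best_k, best_score, all_scores)."""
    results = {k: cross_val_accuracy(data, k) for k in candidate_ks}
    best_k = max(results, key=results.get)
    return best_k, results[best_k], results


if __name__ == "__main__":
    best_k, best_score, results = auto_select(make_dataset())
    for k, score in results.items():
        print(f"k={k}: mean CV accuracy = {score:.3f}")
    print(f"selected k={best_k} (accuracy {best_score:.3f})")
```

The user supplies only the data; the exhaustive search over candidate values and the cross-validated scoring happen behind a single call, which is the kind of abstraction the abstract describes.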
Published
27/10/2021
How to Cite
SIQUEIRA, Guilherme; RODRIGUES, Gustavo; FEITOSA, Eduardo; KREUTZ, Diego. QuickAutoML: Uma ferramenta para treinamento automatizado de modelos de aprendizado de máquina. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 19., 2021, Charqueadas/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 85-90. DOI: https://doi.org/10.5753/errc.2021.18547.