Analysis of Two Characterization Strategies for Cloud Computing
Abstract
The main motivations for adopting Cloud Computing include optimizing computational resources and controlling costs. Better resource usage must be achieved from both the user's and the provider's perspective. However, unlike in traditional data centers, Cloud resources are shared among different users and, in general, the service provider has little or no information about the type of workload submitted to the virtual machines. This scenario can lead to poor load distribution, resulting in SLA and QoS violations. Using an analytical methodology, this paper evaluates two workload characterization strategies, both based on Machine Learning techniques (Naive Bayes and Decision Trees). In addition, this work presents and discusses some load indices that can be collected by SNMP agents while imposing little overhead on the system (around 2%). The results show that Decision Trees are faster but more sensitive to variation in the metrics. Naive Bayes, in turn, achieves higher precision in some situations, but the data must be discretized before it can be used.
Keywords:
Cloud Computing, Workload Characterization, Virtualization
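The abstract notes that Naive Bayes requires discretized data. The sketch below illustrates that step under stated assumptions: it is not the authors' implementation, and the load indices (CPU and disk I/O utilization in percent), the equal-width binning into four bins, the class labels, and all function and class names are hypothetical choices made for illustration only.

```python
import math
from collections import Counter, defaultdict


def discretize(value, lo, hi, bins):
    """Map a continuous value to an equal-width bin index in [0, bins - 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    index = int((clipped - lo) / (hi - lo) * bins)
    return min(index, bins - 1)  # the upper bound falls into the last bin


class CategoricalNaiveBayes:
    """Naive Bayes over discrete features, with Laplace (add-one) smoothing."""

    def __init__(self, n_bins):
        self.n_bins = n_bins
        self.class_counts = Counter()
        # (label, feature position) -> counts of each bin index
        self.feature_counts = defaultdict(Counter)

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for position, bin_index in enumerate(row):
                self.feature_counts[(label, position)][bin_index] += 1
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            for position, bin_index in enumerate(row):
                seen = self.feature_counts[(label, position)][bin_index]
                score += math.log((seen + 1) / (count + self.n_bins))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


BINS = 4  # assumed number of equal-width bins over the range 0..100


def to_bins(sample):
    """Discretize one (cpu %, disk I/O %) sample; both ranges are assumed."""
    return tuple(discretize(v, 0.0, 100.0, BINS) for v in sample)


# Hypothetical training samples: (CPU utilization %, disk I/O utilization %).
samples = [(92, 5), (88, 12), (95, 8), (10, 85), (15, 90), (7, 78)]
labels = ["cpu-bound"] * 3 + ["io-bound"] * 3

model = CategoricalNaiveBayes(BINS).fit([to_bins(s) for s in samples], labels)
prediction = model.predict(to_bins((90, 10)))
```

Equal-width binning is only one possible discretization; the number of bins trades resolution against the amount of training data needed per bin.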
Published
23/10/2013
How to Cite
BARUCHI, Artur; MIDORIKAWA, Edson Toshimi. Análise de Duas Estratégias de Caracterização para Computação na Nuvem. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 14., 2013, Porto de Galinhas. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2013. p. 102-109. DOI: https://doi.org/10.5753/wscad.2013.16779.