Performance evaluation of lossless file compression in the cloud: a study based on Eucalyptus platform
Abstract
As cloud computing becomes more widely adopted, storing large files in the cloud becomes a problem. For infrastructure providers, the cost of continually acquiring storage devices may be prohibitive, which makes file compression techniques highly relevant in this context. File compression would not only reduce on-demand storage costs but also lower the bandwidth needed for transmission and the time needed to store a file. This paper evaluates and compares the file compression performance of virtualized cloud machines and unvirtualized machines.
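The kind of measurement behind such a comparison can be illustrated with a minimal sketch. It assumes Python's standard `zlib` and `lzma` modules as stand-in lossless compressors and a synthetic, highly repetitive payload; neither is taken from the paper, and the helper name `measure` is hypothetical. Running the same script on a virtualized and an unvirtualized machine would give one elapsed time and one compression ratio per codec to compare.

```python
import lzma
import time
import zlib


def measure(name, compress, data):
    """Time one lossless compressor and compute its compression ratio."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return {
        "codec": name,
        "seconds": elapsed,
        # Ratio of original size to compressed size; above 1 means savings.
        "ratio": len(data) / len(packed),
    }


# Assumption: a synthetic, repetitive payload standing in for a large file.
payload = b"cloud storage benchmark line\n" * 50_000

results = [
    measure("zlib", zlib.compress, payload),
    measure("lzma", lzma.compress, payload),
]

for r in results:
    print(f"{r['codec']}: ratio {r['ratio']:.1f}, {r['seconds']:.4f} s")
```

A single timing like this is noisy, so a real comparison would repeat each run several times and use real files rather than a synthetic payload.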
