PIOSS: A Simulation Model for the Analysis of Parallel I/O Performance Variability on Large-scale Applications
To meet the ever-increasing capacity and performance requirements of emerging data-intensive applications, parallel file systems (PFSs) have been employed in large-scale computing environments. In such complex storage systems, the load distribution on PFS data servers is a major source of input/output (I/O) performance variability. Although mitigating such variability is desirable, understanding its sources and behavior remains a challenging task. This work proposes a differentiated approach for evaluating the parallel I/O performance variability perceived by large-scale applications. The Parallel I/O and Storage System (PIOSS) simulation model represents the main components and mechanisms observed in typical PFS implementations and enables fast evaluation of large and complex scenarios. Experimental results presented in this paper demonstrate that PIOSS can accurately reproduce the load balance on PFS data servers, with a confidence level of 95%.