Performance Data Visualization of Linux Events on Multicores
Abstract. Profiling tools are essential for understanding the behavior of parallel applications and for guiding their optimization. However, tools such as Perf generate a large amount of data, which requires significant storage space and is difficult to reason about. We therefore propose VisPerf: a tool-chain and an interactive visualization dashboard for Perf data. The VisPerf tool-chain profiles the application and pre-processes the collected data, reducing the required storage space by a factor of about 50. We also used the visualization dashboard to quickly understand the performance of different events and to visualize specific threads and functions of a real-world application.