Multithread Approximation: A new OpenMP construct
Abstract
This study presents a new OpenMP construct designed to ease the implementation of approximate computing techniques in parallel programs. By integrating approximation methods such as task dropping, loop perforation, and floating-point relaxation, the proposed construct aims to improve performance and energy efficiency while keeping accuracy at an acceptable level. Experimental results on benchmark applications show a trade-off of up to 490.83% at a quality loss of 55.1%, highlighting both the potential and the limitations of approximate computing in parallel contexts.
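To make one of the techniques named above concrete, the following is a minimal sketch of loop perforation written with standard OpenMP only. It is not the construct proposed in the study, whose syntax is not given here. The function names `sum_exact` and `sum_perforated`, the `stride` parameter, and the rescaling step are all illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Exact baseline: a parallel reduction over all n elements. */
double sum_exact(const double *x, size_t n) {
    double acc = 0.0;
    #pragma omp parallel for reduction(+:acc)
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Loop perforation (hypothetical helper, not the paper's construct):
 * only every `stride`-th iteration runs, and the partial sum is scaled
 * by n / visited to estimate the full sum. */
double sum_perforated(const double *x, size_t n, size_t stride) {
    assert(stride > 0);          /* stride == 1 visits every element */
    double acc = 0.0;
    size_t visited = 0;
    #pragma omp parallel for reduction(+:acc, visited)
    for (size_t i = 0; i < n; i += stride) {
        acc += x[i];
        visited++;
    }
    /* Extrapolate from the sampled iterations; an empty input sums to 0. */
    return visited ? acc * ((double)n / (double)visited) : 0.0;
}
```

With `stride` set to 2 the loop does roughly half the work, and the result is exact only when the skipped elements resemble the visited ones. That gap between saved work and lost accuracy is the kind of trade-off the abstract reports. Compiled without OpenMP support, the pragmas are ignored and the functions run sequentially with the same results.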