Static Analysis for Program Execution Cost Estimation

Pedro Henrique Torres Peres Garozi; Anderson Faustino da Silva

doi:10.5753/sblp.2025.13116

Pedro Henrique Torres Peres Garozi UEM http://orcid.org/0009-0001-3547-5837
Anderson Faustino da Silva UEM https://orcid.org/0000-0002-8588-8197

DOI: https://doi.org/10.5753/sblp.2025.13116

Resumo

Program execution patterns offer valuable insights for software optimization, but accurately estimating these patterns through static analysis poses a significant challenge. While methods like machine learning and architecture simulation achieve high accuracy by utilizing extensive profiling data, they come with their own limitations in applicability and resource usage. This paper systematically evaluates control-flow analysis strategies and their combinations to estimate basic-block execution frequency. These frequency estimates are then combined with per-block cost estimations to estimate total execution cost. The investigated approach focuses on estimating branch probabilities, analyzing intra-procedural basic block frequencies, and examining inter-procedural function calls. Real execution cost is measured from a benchmark suite (cBench) to form a ground-truth profile. Correlation between cost estimates and this ground truth indicates static-method accuracy; to interpret this value, the same correlation is computed for leading simulation tools (llvm-mca, uiCA). Comparing these correlations reveals how close static analysis comes to state-of-the-art simulators. The findings show that simple heuristics achieve a correlation within 4.5% of advanced simulation tools while avoiding architecture-specific details and running in an order of magnitude less time. Additionally, analysis of how different strategies interact yields practical insights toward further improving this adaptable and efficient approach.

Palavras-chave: Static Analysis, Control-flow Analysis, Dynamic Analysis

Referências

Hervé Abdi. 2007. The Kendall rank correlation coefficient. Encyclopedia of measurement and statistics 2 (2007), 508–510.

Andreas Abel and Jan Reineke. 2022. uiCA: Accurate throughput prediction of basic blocks on recent Intel microarchitectures. In Proceedings of the 36th ACM International Conference on Supercomputing. 1–14.

Amir H Ashouri, William Killian, John Cavazos, Gianluca Palermo, and Cristina Silvano. 2018. A survey on compiler autotuning using machine learning. ACM Computing Surveys (CSUR) 51, 5 (2018), 1–42.

Jose Antonio Ayala-Barbosa and Paul Erick Mendez-Monroy. 2022. A new preemptive task scheduling framework for heterogeneous embedded systems. In Proceedings of the 2022 8th International Conference on Computer Technology Applications. 77–84.

Brad Calder, Dirk Grunwald, Michael Jones, Donald Lindsay, James Martin, Michael Mozer, and Benjamin Zorn. 1997. Evidence-based static branch prediction using machine learning. ACM Trans. Program. Lang. Syst. 19, 1 (Jan. 1997), 188–222. DOI: 10.1145/239912.239923

Veerle Desmet, Lieven Eeckhout, and Koen De Bosschere. 2005. Using decision trees to improve program-based and profile-based static branch prediction. In Advances in Computer Systems Architecture: 10th Asia-Pacific Conference, ACSAC 2005, Singapore, October 24-26, 2005. Proceedings 10. Springer, 336–352.

F Essam, Hashash El, and Shiekh Raga Hassan Ali. 2022. A comparison of the pearson, spearman rank and kendall tau correlation coefficients using quantitative variables. Asian J. Probab. Stat (2022), 36–48.

Neville Grech, Kyriakos Georgiou, James Pallister, Steve Kerrison, Jeremy Morse, and Kerstin Eder. 2015. Static analysis of energy consumption for LLVM IR programs. In Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems (Sankt Goar, Germany) (SCOPES ’15). Association for Computing Machinery, New York, NY, USA, 12–21. DOI: 10.1145/2764967.2764974

Jan Laukemann, Julian Hammer, Johannes Hofmann, Georg Hager, and Gerhard Wellein. 2018. Automated instruction stream throughput prediction for intel and amd microarchitectures. In 2018 IEEE/ACM performance modeling, benchmarking and simulation of high performance computer systems (PMBS). IEEE, 121–131.

Haoxuan Li, Paul De Meulenaere, Ken Vanherpen, Siegfried Mercelis, and Peter Hellinckx. 2020. A hybrid timing analysis method based on the isolation of software code block. Internet of Things 11 (2020), 100230.

Charith Mendis, Alex Renda, Saman Amarasinghe, and Michael Carbin. 2019. Ithemal: Accurate, portable and fast basic block throughput estimation using deep neural networks. In International Conference on machine learning. PMLR, 4505–4515.

Haolin Pan, Yuanyu Wei, Mingjie Xing, Yanjun Wu, and Chen Zhao. 2025. Towards Efficient Compiler Auto-tuning: Leveraging Synergistic Search Spaces. In Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization. 614–627.

Jason R. C. Patterson. 1995. Accurate static branch prediction by value range propagation. SIGPLAN Not. 30, 6 (June 1995), 67–78. DOI: 10.1145/223428.207117

Lana Scravaglieri, Ani Anciaux-Sedrakian, Olivier Aumage, Thomas Guignon, and Mihail Popov. 2025. Compiler, Runtime, and Hardware Parameters Design Space Exploration. In IPDPS 2025-39th IEEE International Parallel and Distributed Processing Symposium.

Lana Scravaglieri, Mihail Popov, Laércio Lima Pilla, Amina Guermouche, Olivier Aumage, and Emmanuelle Saillard. 2023. Optimizing performance and energy across problem sizes through a search space exploration and machine learning. J. Parallel and Distrib. Comput. 180 (2023), 104720.

Anderson Faustino da Silva, Bernardo NB De Lima, and Fernando Magno Quintão Pereira. 2021. Exploring the space of optimization sequences for code-size reduction: insights and tools. In Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction. 47–58.

Darko Stefanović, Danilo Nikolić, Dušanka Dakić, Ivana Spasojević, and Sonja Ristić. 2020. Static code analysis tools: A systematic literature review. In Ann. DAAAM Proc. Int. DAAAM Symp, Vol. 31. 565–573.

Young-Kyoon Suh, Seounghyeon Kim, and Jeeyoung Kim. 2020. Clutch: A clustering-driven runtime estimation scheme for scientific simulations. IEEE Access 8 (2020), 220710–220722.

Ondřej Sýkora, Phitchaya Mangpo Phothilimthana, Charith Mendis, and Amir Yazdanbakhsh. 2022. Granite: A graph neural network model for basic block throughput estimation. In 2022 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 14–26.

Kai Vogelgesang, Phillip Raffeck, Peter Wägemann, Thorsten Herfet, and Wolfgang Schröder-Preikschat. 2024. WIP: Towards a Transactional Network Stack for Power-Failure Resilience. In 2024 IEEE 21st Consumer Communications & Networking Conference (CCNC). IEEE, 803–806.

Mark N. Wegman and F. Kenneth Zadeck. 1991. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst. 13, 2 (April 1991), 181–210. DOI: 10.1145/103135.103136

Carl Witt, Marc Bux, Wladislaw Gusew, and Ulf Leser. 2019. Predictive performance modeling for distributed batch processing using black box monitoring and machine learning. Information Systems 82 (2019), 33–52.

Youfeng Wu and James R. Larus. 1994. Static Branch Frequency and Program Profile Analysis. In Proceedings of the 27th Annual International Symposium on Microarchitecture (San Jose, California, USA). Association for Computing Machinery, New York, NY, USA, 1–11.

André Felipe Zanella, Anderson Faustino da Silva, and Fernando Magno Quintão. 2020. YACOS: a complete infrastructure to the design and exploration of code optimization sequences. In Proceedings of the 24th Brazilian Symposium on Context-Oriented Programming and Advanced Modularity. 56–63.