Programming Practices for Cache Memory Optimization with Relevance to Embedded Systems: A Scoping Review

  • Ramiro V. dos Santos Júnior UFERSA
  • Francisco Rafael B. de Sousa UERN
  • José Inácio M. Ferreira UERN
  • Raul B. Paradeda UERN

Abstract


Cache memory efficiency is a cornerstone of performance in computing systems, particularly in embedded applications where hardware resources are constrained. While research on cache replacement and reconfiguration algorithms is well established, software-level programming practices designed to optimize memory hierarchy utilization remain underexplored. This paper presents a scoping review, conducted in accordance with the PRISMA-ScR guidelines, that maps the evidence on programming techniques that directly improve cache locality and performance in embedded systems. Empirical studies published between 2015 and 2025 were retrieved from IEEE Xplore, ACM Digital Library, ScienceDirect, and the CAPES Portal. From an initial pool of 2,609 studies, 11 were selected after applying eligibility criteria based on open-access availability, relevance to software-level cache practices, and the presence of empirical results. The selected studies were organized into four thematic categories: data layout and reordering, loop transformations, memory alignment and range propagation, and cache-aware scheduling. Across heterogeneous experimental settings, the studies report significant speedups and energy reductions, with magnitudes that depend on workload, platform, and measurement method. Finally, this review identifies research gaps and outlines directions for future investigation, including integrating the identified techniques into compiler infrastructure and evaluating them on embedded Artificial Intelligence workloads.
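
As an illustration of the "loop transformations" category named above, the following is a minimal sketch of loop tiling (blocking) applied to a matrix transpose. It is not taken from any of the reviewed studies: the function names, the flat row-major layout, and the `tile` parameter are assumptions chosen for this example. Python is used only to show how the iteration order changes; any actual cache benefit would have to be measured in compiled code on the target platform.

```python
def transpose_naive(src, n):
    """Transpose an n x n row-major matrix stored in a flat list."""
    dst = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            # Reads walk src sequentially, but writes jump n elements apart.
            dst[j * n + i] = src[i * n + j]
    return dst


def transpose_tiled(src, n, tile=4):
    """Same result, visiting the matrix in tile x tile blocks.

    `tile` is a hypothetical tuning parameter; in practice it would be
    chosen from the cache line and cache size of the target.
    """
    dst = [0] * (n * n)
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            # Inside one block, reads and writes stay within a small
            # working set, so fetched cache lines can be reused.
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    dst[j * n + i] = src[i * n + j]
    return dst


# Usage: both versions must agree, including when n is not a multiple of tile.
n = 6
matrix = list(range(n * n))
assert transpose_tiled(matrix, n, tile=4) == transpose_naive(matrix, n)
```

The transformation changes only the order in which elements are visited, not the result, which is why it can be applied at the source level or by a compiler pass.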

Published
19/07/2026
SANTOS JÚNIOR, Ramiro V. dos; SOUSA, Francisco Rafael B. de; FERREIRA, José Inácio M.; PARADEDA, Raul B. Programming Practices for Cache Memory Optimization with Relevance to Embedded Systems: A Scoping Review. In: WORKSHOP EM DESEMPENHO DE SISTEMAS COMPUTACIONAIS E DE COMUNICAÇÃO (WPERFORMANCE), 25., 2026, Gramado/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 273-284. ISSN 2595-6167. DOI: https://doi.org/10.5753/wperformance.2026.21675.