Optimization in Information Retrieval: A Quick View of Techniques for Performance and Scalability

  • André Tomitan Bocces UNESP
  • Alexandro Baldassin UNESP
  • Allberson Dantas UNESP

Abstract


Information Retrieval (IR) systems must often handle large-scale datasets while maintaining low response times. This paper summarizes optimization techniques for improving the performance and scalability of IR algorithms. We examine parallel processing, memory hierarchy optimizations, and GPU acceleration. Our analysis identifies key factors that influence the effectiveness of these optimizations, including workload adaptability and hardware constraints. We also discuss the challenges of implementing these techniques and their impact on modern IR pipelines.

Published
2025-05-28
BOCCES, André Tomitan; BALDASSIN, Alexandro; DANTAS, Allberson. Optimization in Information Retrieval: A Quick View of Techniques for Performance and Scalability. In: REGIONAL SCHOOL OF HIGH PERFORMANCE COMPUTING FROM SÃO PAULO (ERAD-SP), 16., 2025, São José do Rio Preto/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 62-65. DOI: https://doi.org/10.5753/eradsp.2025.9628.
