Implementing Knowledge Gain Measurement in Real Search Environments

Rafael Tavares da Silva; Sean Wolfgand Matsui Siqueira; Marcelo Tibau

doi:10.5753/sbsi_estendido.2026.249057

Rafael Tavares da Silva UNIRIO
Sean Wolfgand Matsui Siqueira UNIRIO
Marcelo Tibau UNIRIO

Abstract

The operationalization of learning metrics in real search environments remains an open challenge in the Searching as Learning (SaL) paradigm. While behavioral proxies offer scalability, they capture conceptual change only indirectly; structured assessments provide more direct evidence but often compromise ecological validity. The Degree of Knowledge Gain (DKG) metric addresses this tension by combining Shannon entropy with semantic similarity between queries and clicked documents to model the progressive reduction of uncertainty during search. This paper reports on two technological artifacts developed to embed DKG computation into real-world search workflows, within the scope of the CNPq project 3C-BPA: Comportamento de busca, Complexidade da informação e pensamento Crítico na Busca como um Processo de Aprendizagem. A standalone search engine prototype established the feasibility of real-time DKG computation but exposed limitations in ecological validity and operational sustainability. These limitations were addressed by a Chrome browser extension that estimates the metric unobtrusively while users interact with their preferred search engines. To assess the extension’s applicability, an experiment was conducted combining pre- and post-tests with the Concurrent Think-Aloud (CTA) protocol and automated interaction logging. Preliminary results indicate that DKG is sensitive to variation in search strategy use: participants who engaged in systematic query reformulation and multi-source evaluation achieved stronger knowledge gains, while those exhibiting disorientation and limited cognitive regulation showed more modest outcomes. Beyond its empirical contributions, the study illustrates how undergraduate research participation can play a substantive role in advancing the development and application of formal learning metrics in information science.
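
As a rough illustration of the two ingredients the abstract names, the sketch below computes Shannon entropy over a discrete distribution and a cosine similarity between a query and a clicked document. It is not the DKG formula: the abstract does not state how DKG combines these quantities, so no combination is attempted here. The function names, the bag-of-words representation, and the example texts are all assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term-count vectors.

    Assumption: a simple whitespace tokenizer stands in for whatever
    semantic representation a real implementation would use.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical query and clicked-document snippet.
    query = "shannon entropy information theory"
    clicked = "entropy in information theory measures uncertainty"
    print("similarity:", cosine_similarity(query, clicked))

    # A uniform distribution carries more uncertainty than a peaked one.
    print("uniform:", shannon_entropy([0.25, 0.25, 0.25, 0.25]))
    print("peaked:", shannon_entropy([0.85, 0.05, 0.05, 0.05]))
```

In this toy setting, a drop in entropy from one step of a session to the next would correspond to the "reduction of uncertainty" the abstract describes, while the similarity score indicates how closely a clicked document matches the query that led to it.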
