Operationalizing Knowledge Gain: Implementing and Testing the DKG Metric in Search Environments
Abstract
Searching the Web is increasingly recognized as a process of knowledge construction rather than simple information retrieval, a perspective framed by the Searching as Learning (SaL) paradigm. A central challenge in this domain is evaluating how much knowledge users actually acquire during search. Traditional approaches rely either on behavioral proxies, which are scalable but limited in capturing conceptual change, or on structured assessments, which provide direct evidence but are often intrusive. The Degree of Knowledge Gain (DKG) metric addresses this gap by modeling reductions in uncertainty through Shannon’s entropy and integrating semantic similarity between queries and clicked documents. This paper reports on the operationalization of DKG within the CNPq project 3C-BPA: Comportamento de busca, Complexidade da informação e pensamento Crítico na Busca como um Processo de Aprendizagem (Search Behavior, Information Complexity, and Critical Thinking in Search as a Learning Process). Two artifacts were developed: an initial search engine prototype embedding DKG computation, and a Chrome extension that estimates DKG in real time while users employ their preferred search engines. The extension overcame earlier limitations by improving ecological validity, reducing costs, and enabling more natural experimentation. An experiment combined pre- and post-tests, the Concurrent Think-Aloud (CTA) protocol, and the extension’s automated logging. Preliminary findings show that DKG values are sensitive to differences in search strategies: systematic reformulation and evaluation aligned with greater knowledge gains, while disorientation behaviors corresponded to more modest outcomes. A distinctive feature of this study was the active role of an undergraduate researcher, who contributed to artifact development, experiment setup, participant support, transcription, and ongoing content analysis.
Keywords:
Degree of Knowledge Gain, Searching as Learning, Search Artifacts, Search Environment
References
Arthur Câmara. 2024. Designing Search-as-Learning Systems. Ph.D. Thesis. Delft University of Technology. DOI: 10.4233/uuid:0fe3a6bb-1bc1-40e2-86b0-ec3d3aef9c77, 〈ISBN: 978-94-6384-569-4〉.
Yu Chi, Shuguang Han, Daqing He, and Rui Meng. 2016. Exploring Knowledge Learning in Collaborative Information Seeking Process. In SAL@SIGIR.
Georges E. Dupret and Benjamin Piwowarski. 2008. A User Browsing Model to Predict Search Engine Click Data from Past Observations. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Singapore, Singapore) (SIGIR ’08). Association for Computing Machinery, New York, NY, USA, 331–338. DOI: 10.1145/1390334.1390392
Dima El Zein and Célia da Costa Pereira. 2022. User’s Knowledge and Information Needs in Information Retrieval Evaluation. In Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization (Barcelona, Spain) (UMAP ’22). Association for Computing Machinery, New York, NY, USA, 170–178. DOI: 10.1145/3503252.3531325
Dima El Zein and Célia Da Costa Pereira. 2023. The Evolution of User Knowledge during Search-as-Learning Sessions: A Benchmark and Baseline. In Proceedings of the 2023 Conference on Human Information Interaction and Retrieval (Austin, TX, USA) (CHIIR ’23). Association for Computing Machinery, New York, NY, USA, 454–458. DOI: 10.1145/3576840.3578273
Wolfgang Gritz, Anett Hoppe, and Ralph Ewerth. 2021. On the Impact of Features and Classifiers for Measuring Knowledge Gain during Web Search-A Case Study. (2021).
Wolfgang Gritz, Anett Hoppe, and Ralph Ewerth. 2024. On the Influence of Reading Sequences on Knowledge Gain During Web Search. In Advances in Information Retrieval, Nazli Goharian, Nicola Tonellotto, Yulan He, Aldo Lipani, Graham McDonald, Craig Macdonald, and Iadh Ounis (Eds.). Springer Nature Switzerland, Cham, 364–373.
Todd Kelley, Brenda Capobianco, and Kevin Kaluf. 2015. Concurrent think-aloud protocols to assess elementary design students. International Journal of Technology & Design Education 25, 4 (2015), 521–540.
Chang Liu, Xiaoxuan Song, and Preben Hansen. 2023. Characterising users’ task completion process in learning-related tasks: A search pace model. Journal of Information Science 49, 6 (2023), 1462–1480. DOI: 10.1177/01655515211060527
Yaxi Liu, Chunxiu Qin, Xubu Ma, Jiangping Chen, Hongle He, and Junbo Mao. 2025. Characterising exploratory search tasks: Evidence from different fields. Journal of Information Science 0, 0 (2025), 01655515251330611. DOI: 10.1177/01655515251330611
Jie “Jennifer” Lu. 2025. A systematic literature review of usability definitions and psychometric properties of instruments in the field of learning design and technology. Journal of Research on Technology in Education 0, 0 (2025), 1–22. DOI: 10.1080/15391523.2025.2459165
Gary Marchionini. 2006. Exploratory Search: From Finding to Understanding. Commun. ACM 49, 4 (April 2006), 41–46. DOI: 10.1145/1121949.1121979
Peter Milne. 2012. Probability as a Measure of Information Added. Journal of Logic, Language and Information 21, 2 (April 2012), 163–188. DOI: 10.1007/s10849-011-9142-0
Christian Otto, Markus Rokicki, Georg Pardi, Wolfgang Gritz, Daniel Hienert, Ran Yu, Johannes von Hoyer, Anett Hoppe, Stefan Dietze, Peter Holtz, Yvonne Kammerer, and Ralph Ewerth. 2022. SaL-Lightning Dataset: Search and Eye Gaze Behavior, Resource Interactions and Knowledge Gain during Web Search. In ACM SIGIR Conference on Human Information Interaction and Retrieval. ACM, Regensburg Germany, 347–352. DOI: 10.1145/3498366.3505835
Christian Otto, Ran Yu, Georg Pardi, Johannes von Hoyer, Markus Rokicki, Anett Hoppe, Peter Holtz, Yvonne Kammerer, Stefan Dietze, and Ralph Ewerth. 2021. Predicting Knowledge Gain During Web Search Based on Multimedia Resource Consumption. In Artificial Intelligence in Education, Ido Roll, Danielle McNamara, Sergey Sosnovsky, Rose Luckin, and Vania Dimitrova (Eds.). Springer International Publishing, Cham, 318–330.
Alfonso Prieto-Guerrero and Gilberto Espinosa-Paredes. 2019. 7 - Nonlinear signal processing methods: DR estimation and nonlinear stability indicators. In Linear and Non-Linear Stability Analysis in Boiling Water Reactors, Alfonso Prieto-Guerrero and Gilberto Espinosa-Paredes (Eds.). Woodhead Publishing, 315–398. DOI: 10.1016/B978-0-08-102445-4.00007-2
Ilknur Reisoğlu, Ayça Çebi, and Tuğba Bahçekapılı. 2019. Online information searching behaviours: examining the impact of task complexity, information searching experience, and cognitive style. Interactive Learning Environments 0, 0 (2019), 1–18. DOI: 10.1080/10494820.2019.1662456
Mark A Schmuckler. 2001. What is ecological validity? A dimensional analysis. Infancy 2, 4 (2001), 419–436.
Marcelo Tibau. 2024. Quantifying Knowledge Gain in Online Searches: The DKG Metric. Tese (Doutorado em Informática). Universidade Federal do Estado do Rio de Janeiro (UNIRIO), Rio de Janeiro. Programa de Pós-Graduação em Informática.
Marcelo Tibau, Sean Wolfgand Matsui Siqueira, and Bernardo Pereira Nunes. 2022. The Impact of Non-Verbalization in Think-Aloud: Understanding Knowledge Gain Indicators Considering Think-Aloud Web Searches. In Proceedings of the 33rd ACM Conference on Hypertext and Social Media (Barcelona, Spain) (HT ’22). Association for Computing Machinery, New York, NY, USA, 107–120. DOI: 10.1145/3511095.3531272
Marcelo Tibau, Sean Wolfgand Matsui Siqueira, and Bernardo Pereira Nunes. 2023. Accounting for the knowledge gained during a web search: An empirical study on learning transfer indicators. Library & Information Science Research 45, 1 (2023), 101222. DOI: 10.1016/j.lisr.2022.101222
M. Tibau, S. W. M. Siqueira, B. Pereira Nunes, T. Nurmikko-Fuller, and R. F. Manrique. 2019. Using Query Reformulation to Compare Learning Behaviors in Web Search Engines. In 2019 IEEE 19th International Conference on Advanced Learning Technologies (ICALT), Vol. 2161-377X. 219–223. DOI: 10.1109/ICALT.2019.00054
Kelsey Urgo and Jaime Arguello. 2022. Learning assessments in search-as-learning: A survey of prior work and opportunities for future research. Information Processing & Management 59, 2 (2022), 102821. DOI: 10.1016/j.ipm.2021.102821
P. Vakkari. 2016. Searching as learning: A systematization based on literature. Journal of Information Science 42, 1 (2016), 7–18.
Luyan Xu, Xuan Zhou, and Ujwal Gadiraju. 2020. How Does Team Composition Affect Knowledge Gain of Users in Collaborative Web Search?. In Proceedings of the 31st ACM Conference on Hypertext and Social Media (Virtual Event, USA) (HT ’20). Association for Computing Machinery, New York, NY, USA, 91–100. DOI: 10.1145/3372923.3404784
Ran Yu, Rui Tang, Markus Rokicki, Ujwal Gadiraju, and Stefan Dietze. 2021. Topic-independent modeling of user knowledge in informational search sessions. Information Retrieval Journal 24, 3 (June 2021), 240–268. DOI: 10.1007/s10791-021-09391-7
Published
2025-11-10
How to Cite
SILVA, Rafael Tavares da; SIQUEIRA, Sean Wolfgand Matsui; TIBAU, Marcelo. Operationalizing Knowledge Gain: Implementing and Testing the DKG Metric in Search Environments. In: UNDERGRADUATE RESEARCH CONTEST - BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 31., 2025, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 57-60. ISSN 2596-1683. DOI: https://doi.org/10.5753/webmedia_estendido.2025.16337.
