Adaptation of the Ferret/WebFerret Algorithm for Plagiarism Detection in Portuguese Texts: Challenges with AI-Generated Texts
Abstract
This study adapts the Ferret/WebFerret algorithm to plagiarism detection in Portuguese-language texts, with a particular focus on the challenges posed by AI-generated content. The implemented system analyzes each text phrase by phrase using trigrams and queries Google's API to search for potential source material. In experiments with mosaic plagiarism, the algorithm identified copied content effectively, but it struggled to trace the origins of AI-generated texts. The findings point to the need for new AI-driven algorithms tailored to this emerging problem.
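The abstract gives no implementation details, so the following is only a minimal sketch of a phrase-by-phrase trigram comparison. It assumes word-level trigrams and a set-overlap (Jaccard-style) resemblance score, as in the original Ferret work. The function names are hypothetical. The web search step is left out: the candidate source text is assumed to have been retrieved already.

```python
import re


def word_trigrams(text):
    """Return the set of lowercase word trigrams in `text`.

    `\\w+` is Unicode-aware for Python 3 strings, so accented
    Portuguese words such as "não" are kept as single tokens.
    """
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def resemblance(text_a, text_b):
    """Jaccard overlap of the two trigram sets (0.0 to 1.0)."""
    tri_a, tri_b = word_trigrams(text_a), word_trigrams(text_b)
    if not tri_a or not tri_b:
        return 0.0
    return len(tri_a & tri_b) / len(tri_a | tri_b)


def split_sentences(text):
    """Naive sentence splitter (assumption: ., ! or ? ends a sentence)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(suspect_text, candidate_source):
    """For each sentence of the suspect text, return the fraction of its
    trigrams that also occur in the candidate source text."""
    source_trigrams = word_trigrams(candidate_source)
    scores = []
    for sentence in split_sentences(suspect_text):
        trigrams = word_trigrams(sentence)
        shared = len(trigrams & source_trigrams) / len(trigrams) if trigrams else 0.0
        scores.append((sentence, shared))
    return scores
```

Under these assumptions, a sentence copied verbatim scores close to 1.0 against its source. Text that an AI system generated or reworded shares few exact trigrams with any single page, which is consistent with the difficulty the abstract reports.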
Published
2024-11-07
How to Cite
ELIAS, Adler Gonçalves; LOPES, Robson da Silva. Adaptation of the Ferret/WebFerret Algorithm for Plagiarism Detection in Portuguese Texts: Challenges with AI-Generated Texts. In: REGIONAL SCHOOL ON INFORMATICS OF MATO GROSSO (ERI-MT), 13., 2024, Alto Araguaia/MT. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 54-59. ISSN 2447-5386. DOI: https://doi.org/10.5753/eri-mt.2024.245825.
