Identifying Criminal Suspects through Implicit YouTube Interactions


  • Érick S. Florentino, Military Institute of Engineering (IME)
  • Ronaldo R. Goldschmidt, Military Institute of Engineering (IME)
  • Maria Claudia Cavalcanti, Military Institute of Engineering (IME)



Keywords: Analysis, Identification, Interactions, Implicit, People, Suspects, Social Networks


The identification of criminal suspects on social networks (e.g., in cases of pedophilia and terrorism) has received growing attention in recent years. However, the literature does not always consider interactions derived from the textual content posted on these networks. This work therefore presents an algorithm, called TROY, that makes these implicit interactions and their impacts explicit in order to support the identification of suspects. Furthermore, given the difficulty of obtaining Portuguese-language datasets for experiments, this work presents a new way to build a dataset for new experiments, using the link prediction task. The experimental results demonstrate an improvement in the identification of suspects.







How to Cite

Florentino, É. S., Goldschmidt, R. R., & Cavalcanti, M. C. (2022). Identifying Criminal Suspects through Implicit YouTube Interactions. ISys - Brazilian Journal of Information Systems, 15(1), 3:1–3:36.



Extended versions of selected articles