Exploring Emerging Topics on Antibiotic Use in Brazilian Tweets via Unsupervised Learning
Abstract
Antibiotic misuse plays a critical role in the global spread of antimicrobial resistance, underscoring the need to understand how the public discusses and perceives antibiotic use. This study investigates emerging topics related to antibiotic use among Brazilian Twitter users by applying unsupervised learning techniques. We collected a corpus of Portuguese-language tweets and encoded the text with BERTimbau, a language representation model pretrained specifically for Brazilian Portuguese. To uncover latent structure in the data, we applied K-Means clustering to the sentence embeddings, which allowed us to identify and interpret thematic groupings within the public discourse. The analysis revealed that recurring topics such as respiratory infections, public figures, and self-care dominate the discussions. These insights demonstrate the value of combining social media data with unsupervised learning to support public health communication and surveillance strategies in Brazil.
References
Aggarwal, C. C. and Zhai, C. (2012). Mining text data. Springer Science & Business Media.
Andersen, B., Hair, L., Groshek, J., Krishna, A., and Walker, D. (2019). Understanding and diagnosing antimicrobial resistance on social media: a yearlong overview of data and analytics. Health Communication, 34(2):248–258.
Arquembourg, J., Glaser, P., Roblot, F., Metzler, I., Gallant-Dewavrin, M., Nanguem, H. F., Mebarki, A., Voillot, P., and Schück, S. (2025). Discussions of antibiotic resistance on social media platforms: Text mining and mixed methods content analysis study. JMIR Formative Research, 9:e37160.
Batista, M. P. B., Cavalcante, F. S., Alves Cassini, S. T., and Pinto Schuenck, R. (2023). Diversity of bacteria carrying antibiotic resistance genes in hospital raw sewage in southeastern Brazil. Water Science & Technology, 87(1):239–250.
Boszczowski, Í., Neto, F. C., Blangiardo, M., Baquero, O. S., Madalosso, G., de Assis, D. B., Olitta, T., and Levin, A. S. (2020). Total antibiotic use in a state-wide area and resistance patterns in Brazilian hospitals: an ecologic study. The Brazilian Journal of Infectious Diseases, 24(6):479–488.
Cardoso, T. A. d. O. and Vieira, D. N. (2016). Study of mortality from infectious diseases in Brazil from 2005 to 2010: risks involved in handling corpses. Ciência & Saúde Coletiva, 21:485–496.
Charles-Smith, L. E., Reynolds, T. L., Cameron, M. A., Conway, M., Lau, E. H. Y., Olsen, J. M., Pavlin, J. A., Shigematsu, M., Streichert, L. C., Suda, K. J., et al. (2015). Using social media for actionable disease surveillance and outbreak management: a systematic literature review. PLOS ONE, 10(10):e0139701.
Cinelli, M. et al. (2020). The COVID-19 social media infodemic. Nature Human Behaviour, 4(10):1285–1293.
Dropa, M., da Silva, J. S. B., Andrade, A. F. C., Nakasone, D. H., Cunha, M. P. V., Ribeiro, G., de Araújo, R. S., Brandão, C. J., Ghiglione, B., Lincopan, N., et al. (2024). Spread and persistence of antimicrobial resistance genes in wastewater from human and animal sources in São Paulo, Brazil. Tropical Medicine & International Health, 29(5):424–433.
Garcia, K. and Berton, L. (2021). Topic detection and sentiment analysis in Twitter content related to COVID-19 from Brazil and the USA. Applied Soft Computing, 101:107057.
Gomes, M. (2018). Community-acquired pneumonia: challenges of the situation in Brazil.
Jelodar, H., Wang, Y., Yuan, C., Feng, X., Jiang, X., Li, L., and Lv, Y. (2019). Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey. Multimedia Tools and Applications, 78(11):15169–15211.
Kendra, R. L., Karki, S., Eickholt, J. L., and Gandy, L. (2015). Characterizing the discussion of antibiotics in the twittersphere: What is the bigger picture? Journal of Medical Internet Research, 17(6):e154.
Kim, H., Proctor, C. R., Walker, D., and McCarthy, R. R. (2023). Understanding the consumption of antimicrobial resistance–related content on social media: Twitter analysis. Journal of Medical Internet Research, 25:e42363.
Kouzy, R. et al. (2020). Coronavirus goes viral: Quantifying the COVID-19 misinformation epidemic on Twitter. American Journal of Preventive Medicine, 59(2):261–263.
McCullough, A. R., Parekh, S., Rathbone, J., Del Mar, C. B., and Hoffmann, T. C. (2016). A systematic review of the public’s knowledge and beliefs about antibiotic resistance. Journal of Antimicrobial Chemotherapy, 71(1):27–33.
World Health Organization (2014). Antimicrobial resistance: global report on surveillance.
Roope, L. S. J., Smith, R. D., Pouwels, K. B., Buchanan, J., Abel, L., Eibich, P., El Khoury, A. C., Walker, A. S., and Robotham, J. V. (2019). The challenge of antimicrobial resistance: what economics can contribute. Science, 364(6435):eaau4679.
Scanfeld, D., Scanfeld, V., and Larson, E. L. (2010). Dissemination of health information through social networks: Twitter and antibiotics. American Journal of Infection Control, 38(3):182–188.
Souza, F., Nogueira, R., and Lotufo, R. (2020a). BERTimbau: Pretrained BERT models for Brazilian Portuguese. Portuguese Conference on Artificial Intelligence, pages 403–417.
Souza, L. F. P., Abreu, V., Cruz, S. J. R., and Pardo, T. A. S. (2020b). BERTimbau: Pretrained BERT models for Brazilian Portuguese. In Proceedings of the Brazilian Conference on Intelligent Systems (BRACIS), pages 403–408. IEEE.
Ventola, C. (2015). The antibiotic resistance crisis: part 1: causes and threats. Pharmacy and Therapeutics, 40(4):277.
Zowawi, H. M., Abedalthagafi, M., Mar, F. A., Almalki, T., Kutbi, A. H., Harris-Brown, T., Harbarth, S., Balkhy, H. H., Paterson, D. L., and Hasanain, R. A. (2015). The potential role of social media platforms in community awareness of antibiotic use in the Gulf Cooperation Council states: luxury or necessity? Journal of Medical Internet Research, 17(10):e233.
Published
2025-09-29
How to Cite
MAZZA, Hudson; BERTON, Lilian. Exploring Emerging Topics on Antibiotic Use in Brazilian Tweets via Unsupervised Learning. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1599-1609. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.13765.
