DepressSet: A dataset of textual analyses of depressive posts
Abstract
Social media can be useful for seeking help or guidance on how to cope with depressive disorder, or for understanding it better. However, working with data about this disorder can be challenging because of the sensitivity of the content domain, or simply because data on the topic are hard to find. In this work we present a dataset collected from depression communities on Facebook in September 2022. We describe the extraction, processing, storage, and release of the data, along with the limitations, challenges, and lessons learned. We enrich the captured data with linguistic analyses of the posts and with a prediction for each post produced by a text classification model. Finally, we propose potential applications of the dataset and discuss its limitations.
Published
21 July 2024
How to Cite
LIMA FILHO, Silas; SILVA, Eliel Roger da; OLIVEIRA, Jonice; SILVA, Mônica Ferreira da. DepressSet: Um conjunto de dados de análises textuais sobre postagens depressivas. In: BRAZILIAN WORKSHOP ON SOCIAL NETWORK ANALYSIS AND MINING (BRASNAM), 13., 2024, Brasília/DF. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 214-220. ISSN 2595-6094. DOI: https://doi.org/10.5753/brasnam.2024.2774.