Predicting depressive disorder on social media: supervised BERT or zero-shot ChatGPT?


This article presents a first study on the use of the ChatGPT dialogue system in a complex and sensitive application: the computational prediction of mental health disorders from social media text. To this end, we conducted an experiment comparing a traditional supervised approach based on BERT with a zero-shot strategy based on natural-language prompts submitted directly to the dialogue system. The results of this evaluation, which weigh classification accuracy against the supervised approach's need for a previously annotated corpus, highlight different advantages of each alternative.

Keywords: mental health, depression, social media, BERT, ChatGPT


DOS SANTOS, Wesley Ramos; PARABONI, Ivandré. Predição de transtorno depressivo em redes sociais: BERT supervisionado ou ChatGPT zero-shot?. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 11-21. DOI: