Prompt-based mental health screening from social media text

Wesley Ramos dos Santos; Ivandré Paraboni

doi:10.5753/brasnam.2024.1879

Wesley Ramos dos Santos (USP)
Ivandré Paraboni (UPM)

Abstract

This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. Our method uses GPT-3.5 prompting to select the publications that are likely to be most relevant to the task, and then uses a straightforward bag-of-words text classifier to predict the actual user labels. Results are found to be on par with a BERT mixture-of-experts classifier, while incurring only a fraction of its training costs.
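
The two-stage idea described above (filter posts for relevance, then classify users from a bag of words) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the prompt, the classifier type, or the data format. In particular, `keyword_relevance` is a hypothetical stand-in for the GPT-3.5 relevance prompt, the Naive Bayes model is just one simple bag-of-words classifier, and the toy posts and labels are invented.

```python
import math
from collections import Counter
from typing import Callable, List


def keyword_relevance(post: str) -> bool:
    """Hypothetical stand-in for the LLM relevance prompt (not the paper's prompt)."""
    cues = {"feel", "sad", "tired", "anxious", "alone"}
    return any(tok in cues for tok in post.lower().split())


def user_bag(posts: List[str], is_relevant: Callable[[str], bool]) -> Counter:
    """Keep only the posts flagged as relevant, then count their words."""
    bag: Counter = Counter()
    for post in posts:
        if is_relevant(post):
            bag.update(post.lower().split())
    return bag


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, bags: List[Counter], labels: List[int]) -> "NaiveBayes":
        self.vocab = set().union(*bags)
        self.counts = {}
        self.priors = {}
        for y in set(labels):
            total: Counter = Counter()
            for bag, label in zip(bags, labels):
                if label == y:
                    total.update(bag)
            self.counts[y] = total
            self.priors[y] = labels.count(y) / len(labels)
        return self

    def predict(self, bag: Counter) -> int:
        def score(y: int) -> float:
            denom = sum(self.counts[y].values()) + len(self.vocab)
            s = math.log(self.priors[y])
            for word, n in bag.items():
                if word in self.vocab:  # ignore words never seen in training
                    s += n * math.log((self.counts[y][word] + 1) / denom)
            return s

        return max(self.priors, key=score)


# Invented toy data: (posts of one user, user-level label).
train = [
    (["i feel sad and alone today", "great match last night"], 1),
    (["so tired and anxious again", "new phone arrived"], 1),
    (["i feel great about the game", "pizza tonight"], 0),
    (["feel like dancing all night", "traffic is bad"], 0),
]
bags = [user_bag(posts, keyword_relevance) for posts, _ in train]
labels = [label for _, label in train]
model = NaiveBayes().fit(bags, labels)

new_user = ["so sad and alone", "lunch time"]
prediction = model.predict(user_bag(new_user, keyword_relevance))
```

In this sketch the filter is passed in as a function, so the keyword heuristic could be replaced by a call to a language model without touching the classifier; only the filtering stage would then involve the LLM, while the classifier itself stays cheap to train.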
