Design recommendations for chatbots to support people with depression


Depression is one of the leading causes of disability worldwide. Beyond conventional drugs and clinical treatments, other forms of treatment are available. For example, computational solutions have been developed to prevent, screen for, and assist in the treatment of depression. In particular, chatbots are computer systems that have been used to provide therapeutic support for individuals diagnosed with depression. Although these systems are commercially available, their design rationale and evaluation are not yet fully validated, and further research is needed. Therefore, in this study we (1) select and compare chatbots for depression; (2) present the results of the analysis to healthcare specialists for assessment; (3) formalize design recommendations for chatbots for people with depression; and (4) check the recommendations with mental healthcare and HCI professionals. We carried out a benchmark of chatbots for people with depression and conducted three discussion sessions involving five experts in mental healthcare and one expert in HCI. As a result, we provide a list of 24 design recommendations covering user interface elements, conversation styles, and personalization features, among others. Finally, two healthcare professionals and one other HCI professional read the recommendations to check their adequacy for both areas.

Keywords: Conversational Agents, Chatbot, Mental Health, Healthcare, Depression, Design Guidelines, Benchmark


DE SOUZA, Paula Maia; PIRES, Isabella da Costa; MOTTI, Vivian Genaro; CASELI, Helena Medeiros; BARBOSA NETO, Jair; MARTINI, Larissa C.; NERIS, Vânia Paula de Almeida. Design recommendations for chatbots to support people with depression. In: SIMPÓSIO BRASILEIRO SOBRE FATORES HUMANOS EM SISTEMAS COMPUTACIONAIS (IHC), 21., 2022, Diamantina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.