Large language models outperform mental and medical health care professionals in identifying obsessive-compulsive disorder
Beam, K. et al. Performance of a large language model on practice questions for the neonatal board examination. JAMA Pediatr. 177, 977–979 (2023).
Cai, Z. R. et al. Assessment of correctness, content omission, and risk of harm in large language model responses to dermatology continuing medical education questions. J. Invest. Dermatol. (2024).
Lyons, R. J., Arepalli, S. R., Fromal, O., Choi, J. D. & Jain, N. Artificial intelligence chatbot performance in triage of ophthalmic conditions. Can. J. Ophthalmol. (2023).
Chen, S. et al. Use of artificial intelligence chatbots for cancer treatment information. JAMA Oncol. 9, 1459–1462 (2023).
Strong, E. et al. Chatbot vs medical student performance on free-response clinical reasoning examinations. JAMA Intern. Med. 183, 1028–1030 (2023).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Sallam, M. ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare 11, 887 (2023).
Psychiatry.org. The basics of augmented intelligence: some factors psychiatrists need to know now. (2023).
Blease, C., Worthen, A. & Torous, J. Psychiatrists’ experiences and opinions of generative artificial intelligence in mental healthcare: An online mixed methods survey. Psychiatry Res. 333, 115724 (2024).
APA. What is obsessive-compulsive disorder? (2022).
National Institute of Mental Health (NIMH). Obsessive-compulsive disorder (OCD). (2022).
National Comorbidity Survey (NCSSC). Harvard Medical School (2007).
Pinto, A., Mancebo, M. C., Eisen, J. L., Pagano, M. E. & Rasmussen, S. A. The brown longitudinal obsessive compulsive study: clinical features and symptoms of the sample at intake. J. Clin. Psychiatry 67, 703–711 (2006).
Perris, F. et al. Duration of untreated illness in patients with obsessive–compulsive disorder and its impact on long-term outcome: a systematic review. J. Pers. Med. 13, 1453 (2023).
Galido, P. V., Butala, S., Chakerian, M. & Agustines, D. A case study demonstrating applications of ChatGPT in the clinical management of treatment-resistant schizophrenia. Cureus 15, e38166 (2023).
Cohan, A. et al. SMHD: a large-scale resource for exploring online language usage for multiple mental health conditions. In Proc. 27th International Conference on Computational Linguistics (eds. Bender, E. M., Derczynski, L. & Isabelle, P.) 1485–1497 (Association for Computational Linguistics, 2018).
Xu, X. et al. Leveraging large language models for mental health prediction via online text data. In Proc. ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (Association for Computing Machinery, 2023).
Galatzer-Levy, I. R., McDuff, D., Natarajan, V., Karthikesalingam, A. & Malgaroli, M. The capability of large language models to measure psychiatric functioning. Preprint at (2023).
Levkovich, I. & Elyoseph, Z. Identifying depression and its determinants upon initiating treatment: ChatGPT versus primary care physicians. Fam. Med. Community Health 11, e002391 (2023).
Usage policies. (2024).
Lucas, G. M., Gratch, J., King, A. & Morency, L. P. It’s only a computer: virtual humans increase willingness to disclose. Comput. Hum. Behav. 37, 94–100 (2014).
Elyoseph, Z., Hadar-Shoval, D., Asraf, K. & Lvovsky, M. ChatGPT outperforms humans in emotional awareness evaluations. Front. Psychol. 14, 1199058 (2023).
The White House. FACT SHEET: President Biden issues executive order on safe, secure, and trustworthy artificial intelligence (2023).
Glazier, K., Swing, M. & McGinn, L. K. Half of obsessive-compulsive disorder cases misdiagnosed: vignette-based survey of primary care physicians. J. Clin. Psychiatry 76, e761–e767 (2015).
Gouniai, J. M., Smith, K. D. & Leonte, K. G. Do clergy recognize and respond appropriately to the many themes in obsessive-compulsive disorder?: data from a Pacific Island community. Ment. Health Relig. Cult. 25, 33–46 (2022).
Gouniai, J. M., Smith, K. D. & Leonte, K. G. Many common presentations of obsessive-compulsive disorder unrecognized by medical providers in a Pacific Island community. J. Ment. Health Train. Educ. Pract. 17, 419–428 (2022).
Glazier, K., Calixte, R. M., Rothschild, R. & Pinto, A. High rates of OCD symptom misidentification by mental health professionals. Ann. Clin. Psychiatry 25, 201–209 (2013).
Glazier, K. & McGinn, L. K. Non-contamination and non-symmetry OCD obsessions are commonly not recognized by clinical, counseling and school psychology doctoral students. J. Depress. Anxiety 4 (2015).
Kim, J., Cai, Z. R., Chen, M. L., Simard, J. F. & Linos, E. Assessing biases in medical decisions via clinician and AI Chatbot responses to patient vignettes. JAMA Netw. Open 6, E2338050 (2023).
Wang, J. et al. Prompt engineering for healthcare: methodologies and applications. Preprint at (2024).
Savage, T., Nayak, A., Gallo, R., Rangan, E. & Chen, J. H. Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine. Npj Digit. Med. 7, 1–7 (2024).