Selected Publications


  1. Sarma, K. V., Hanss, K. E., Halls, A. J. M., Krystal, A., Becker, D. F., Glowinski, A. L., Butte, A. J. "Integrating Expert Knowledge into Large Language Models Improves Performance for Psychiatric Reasoning and Diagnosis." Psychiatry Research; 2025; [pdf]

  2. Hanss, K. E., Sarma, K. V., Glowinski, A. L., Krystal, A., Saunders, R., Halls, A. J. M., Gorrell, S., Reilly, E. "Competence or confidence? Assessing the accuracy, reliability, and confidence of large language models in psychiatry." JMIR; 2025; PMID: 40392576; [pdf]

  3. Tumpa, Z. N., Zawad, M. R. S., Sollis, L., Parab, S., Chen, I. Y., Washington, P. "Quantifying device type and handedness biases in a remote Parkinson's disease AI-powered assessment." NPJ Digital Medicine; 2025; [pdf]

  4. Pierre, J. M., Gaeta, B., Raghavan, G., Sarma, K. V. "'You're Not Crazy': A Case of New Onset AI-Associated Psychosis." Innovations in Clinical Neuroscience; 2025; [pdf]

  5. Sarma, K. V., Hanss, K. E., Halls, A., Becker, D., Glowinski, A., Krystal, A. "Simulated Reasoning and Self-Verification in Generalist Large Language Models for Psychiatric Diagnostic Performance: Cross-Sectional Study." medRxiv [Preprint]; 2025; [pdf]

  6. Tang, A. S., Zeng, B. Z. D., Rankin, K. P., Miller, B., Gorno-Tempini, M. L., Seeley, W. W., Rosen, H. J., Rabinovici, G. D., Oskotsky, T. T., Sirota, M., Pinheiro-Chagas, P. "Characterizing Dementia Phenotypes from Unstructured EHR Notes with Generative AI and Interpretable Machine Learning." medRxiv [Preprint]; 2025; [pdf]

  7. Gallingani, C., Miller, Z. A., Mandelli, M. L., Rosen, H. J., Ezzes, Z., Lin, M., Rodriguez, D., Seeley, W. W., Miller, B., Gorno-Tempini, M. L., Pinheiro-Chagas, P. "Agentic Generative Artificial Intelligence System for Classification of Pathology-Confirmed Primary Progressive Aphasia Variants." medRxiv [Preprint]; 2025; [pdf]

  8. Pendse, S. R., Gergle, D., Kornfield, R., Meyerhoff, J., Mohr, D., Suh, J., Wescott, A., Williams, C., Schleider, J. "When Testing AI Tests Us: Safeguarding Mental Health on the Digital Frontlines." ACM Conference on Fairness, Accountability, and Transparency (FAccT); 2025; [pdf]

  9. Song, I., Pendse, S. R., Kumar, N., De Choudhury, M. "The Typing Cure: Experiences with Large Language Model Chatbots for Mental Health Support." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2025; [pdf]

  10. Pendse, S. R., Rochford, B., Kumar, N., De Choudhury, M. "The Role of Partisan Culture in Mental Health Language Online." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2025; [pdf]

  11. Pendse, S. R., Jain, M., Kumar, N., De Choudhury, M. "Implicit Gender, Racial, and Ethnic Biases in Large Language Models: An Audit Study of Automated Psychiatric Diagnoses." Preprint; 2025; [pdf]

  12. Langfus, J., Hanss, K., Chung, S., Nili, A., Haack, L., Pfiffner, L. "Leveraging Large Language Models to Code Content Fidelity in Virtual School-Based Behavioral Parent Training." International Society for Research in Child and Adolescent Psychopathology Biennial Meeting; 2025

  13. Sarma, K. V., Hanss, K. E., Galatzer-Levy, I. R., Tolou-Shams, M. "Dr. AI Will See You Now: The Opportunities, Challenges, and Risks of ChatGPT, Gemini, and Other Large Language Models in Psychiatry." APA Annual Meeting; 2025

  14. Sarma, K. V., Hanss, K. E., Glowinski, A. L., Krystal, A., Halls, A. J. M., Butte, A. J. "The Robo-Doctor is Always In: Assessing and Comparing the Psychiatric Diagnostic Capabilities of ChatGPT and other Large Language Models." APA Annual Meeting; 2025

  15. Hanss, K., Sarma, K. V., Halls, A., Gorrell, S., Reilly, E. "Can Artificial Intelligence Make the Diagnosis? Evaluating the Accuracy of Large Language Models in Diagnosing Child and Adolescent Psychiatry Clinical Cases." Journal of the American Academy of Child & Adolescent Psychiatry; 2024; [pdf]

  16. Suh, J., Pendse, S. R., Lewis, R., Howe, E., Saha, K., Okoli, E., Amores, J., Ramos, G., Shen, J., Borghouts, J., Sharma, A., Pedrelli, P., Friedman, L., Jackman, C., Benhalim, Y., Ong, D. C., Segal, S., Althoff, T., Czerwinski, M. "Rethinking technology innovation for mental health: framework for multi-sectoral collaboration." Nature Mental Health; 2024; [pdf]

  17. Yoo, D. W., Woo, H., Pendse, S. R., Lu, N. Y., Birnbaum, M. L., Abowd, G. D., De Choudhury, M. "Missed Opportunities for Human-Centered AI Research: Understanding Stakeholder Collaboration in Mental Health AI Research." Proceedings of the ACM on Human-Computer Interaction; 2024; [pdf]

  18. Pendse, S. R., Stapleton, L., Kumar, N., De Choudhury, M., Chancellor, S. "Advancing a consent-forward paradigm for digital mental health data." Nature Mental Health; 2024

  19. Pendse, S. R., Kumar, N., De Choudhury, M. "Quantifying the Pollan Effect: Investigating the Impact of Emerging Psychiatric Interventions on Online Mental Health Discourse." Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems; 2024

  20. Pendse, S. R., Massachi, T., Mahdavimoghaddam, J., Butler, J., Suh, J., Czerwinski, M. "Towards Inclusive Futures for Worker Wellbeing." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2024

  21. Jaiswal, A., Wall, D. P., Washington, P. "Challenges in the Differential Classification of Individual Diagnoses from Co-Occurring Autism and ADHD Using Survey Data." IEEE EMBS International Conference on Biomedical Health Informatics; 2024; [pdf]

  22. Jaiswal, A., Shah, A., Harjadi, C., Windgassen, E., Washington, P. "Ethics of the Use of Social Media as Training Data for AI Models Used for Digital Phenotyping." JMIR Formative Research; 2024; [pdf]

  23. Jaiswal, A., Washington, P. "Using #ActuallyAutistic on Twitter for Precision Diagnosis of Autism Spectrum Disorder: Machine Learning Study." JMIR Formative Research; 2024; [pdf]

  24. Washington, P. "Digitally Diagnosing Multiple Developmental Delays Using Crowdsourcing Fused With Machine Learning." JMIR Research Protocols; 2024; [pdf]

  25. Li, J., Washington, P. "A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study." JMIR AI; 2024; [pdf]

  26. Sarma, K. V.*, Hanss, K. E.*, Glowinski, A. L., Butte, A. J., Halls, A. J. M. "Can Large Language Models Reason about Behavioral Health? Evaluating the Psychiatric Knowledge Base and Reasoning Capabilities of GPT-4." UCSF Health Services Research Symposium; 2024

  27. He, C. X., Sarma, K. V., Hu, R. "The Ethics of Artificial Intelligence in Psychiatry: A Beginner's Exploration." APA Mental Health Services Conference; 2024

  28. Sarma, K. V., Hanss, K. E., Glowinski, A. L., Krystal, A., Halls, A. J. M., Butte, A. J. "Can Large Language Model-based AI Reason about Behavioral Health? Preliminary Evaluation of a Decision Tree-Based LLM Algorithm for Psychiatric Case Diagnosis." ACNP 63rd Annual Meeting; 2024; PMID: 39549698; [pdf]

  29. Sarma, K. V., Hanss, K. E., Glowinski, A. L., Butte, A. J., Halls, A. J. M. "Improving the Performance of LLM-Based Semi-Automated Psychiatric Case Diagnosis using Decision Tree-Based Prompting." AMIA Annual Meeting; 2024

  30. Sarma, K. V., Hanss, K. E., Elkin, D., Halls, A. "AI vs. DSM: Can It Make the Diagnosis? Measuring GPT-4's Psychiatric Knowledge and Reasoning." UCSF School of Medicine Leadership Retreat; 2024

  31. Hanss, K. E.*, Sarma, K. V.*, Saunders, R., Elkin, D. "Grading the Machine: Assessing ChatGPT's Psychiatric Knowledge through Boards-Style Assessment." APA Annual Meeting; 2024

  32. Washington, P., Wall, D. P. "A Review of and Roadmap for Data Science and Machine Learning for the Neuropsychiatric Phenotype of Autism." Annual Review of Biomedical Data Science; 2023; [pdf]