Selected Publications
- Sarma, K. V., Hanss, K. E., Halls, A. J. M., Krystal, A., Becker, D. F., Glowinski, A. L., Butte, A. J.. "Integrating Expert Knowledge into Large Language Models Improves Performance for Psychiatric Reasoning and Diagnosis." Psychiatry Research; 2025; [pdf]
- Hanss, K. E., Sarma, K. V., Glowinski, A. L., Krystal, A., Saunders, R., Halls, A. J. M., Gorrell, S., Reilly, E.. "Competence or confidence? Assessing the accuracy, reliability, and confidence of large language models in psychiatry." JMIR; 2025; PMID: 40392576; [pdf]
- Tumpa, Z. N., Zawad, M. R. S., Sollis, L., Parab, S., Chen, I. Y., Washington, P.. "Quantifying device type and handedness biases in a remote Parkinson's disease AI-powered assessment." NPJ Digital Medicine; 2025; [pdf]
- Pierre, J. M., Gaeta, B., Raghavan, G., Sarma, K. V.. "'You're Not Crazy': A Case of New Onset AI-Associated Psychosis." Innovations in Clinical Neuroscience; 2025; [pdf]
- Sarma, K. V., Hanss, K. E., Halls, A., Becker, D., Glowinski, A., Krystal, A.. "Simulated Reasoning and Self-Verification in Generalist Large Language Models for Psychiatric Diagnostic Performance: Cross-Sectional Study." medRxiv [Preprint]; 2025; [pdf]
- Tang, A. S., Zeng, B. Z. D., Rankin, K. P., Miller, B., Gorno-Tempini, M. L., Seeley, W. W., Rosen, H. J., Rabinovici, G. D., Oskotsky, T. T., Sirota, M., Pinheiro-Chagas, P.. "Characterizing Dementia Phenotypes from Unstructured EHR Notes with Generative AI and Interpretable Machine Learning." medRxiv; 2025; [pdf]
- Gallingani, C., Miller, Z. A., Mandelli, M. L., Rosen, H. J., Ezzes, Z., Lin, M., Rodriguez, D., Seeley, W. W., Miller, B., Gorno-Tempini, M. L., Pinheiro-Chagas, P.. "Agentic Generative Artificial Intelligence System for Classification of Pathology-Confirmed Primary Progressive Aphasia Variants." medRxiv; 2025; [pdf]
- Pendse, S. R., Gergle, D., Kornfield, R., Meyerhoff, J., Mohr, D., Suh, J., Wescott, A., Williams, C., Schleider, J.. "When Testing AI Tests Us: Safeguarding Mental Health on the Digital Frontlines." ACM Conference on Fairness, Accountability, and Transparency (FAccT); 2025; [pdf]
- Song, I., Pendse, S. R., Kumar, N., De Choudhury, M.. "The Typing Cure: Experiences with Large Language Model Chatbots for Mental Health Support." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2025; [pdf]
- Pendse, S. R., Rochford, B., Kumar, N., De Choudhury, M.. "The Role of Partisan Culture in Mental Health Language Online." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2025; [pdf]
- Pendse, S. R., Jain, M., Kumar, N., De Choudhury, M.. "Implicit Gender, Racial, and Ethnic Biases in Large Language Models: An Audit Study of Automated Psychiatric Diagnoses." Preprint; 2025; [pdf]
- Langfus, J., Hanss, K., Chung, S., Nili, A., Haack, L., Pfiffner, L.. "Leveraging Large Language Models to Code Content Fidelity in Virtual School-Based Behavioral Parent Training." International Society for Research in Child and Adolescent Psychopathology Biennial Meeting; 2025
- Sarma, K. V., Hanss, K. E., Galatzer-Levy, I. R., Tolou-Shams, M.. "Dr. AI Will See You Now: The Opportunities, Challenges, and Risks of ChatGPT, Gemini, and Other Large Language Models in Psychiatry." APA Annual Meeting; 2025
- Sarma, K. V., Hanss, K. E., Glowinski, A. L., Krystal, A., Halls, A. J. M., Butte, A. J.. "The Robo-Doctor is Always In: Assessing and Comparing the Psychiatric Diagnostic Capabilities of ChatGPT and other Large Language Models." APA Annual Meeting; 2025
- Hanss, K., Sarma, K. V., Halls, A., Gorrell, S., Reilly, E.. "Can Artificial Intelligence Make the Diagnosis? Evaluating the Accuracy of Large Language Models in Diagnosing Child and Adolescent Psychiatry Clinical Cases." Journal of the American Academy of Child & Adolescent Psychiatry; 2024; [pdf]
- Suh, J., Pendse, S. R., Lewis, R., Howe, E., Saha, K., Okoli, E., Amores, J., Ramos, G., Shen, J., Borghouts, J., Sharma, A., Pedrelli, P., Friedman, L., Jackman, C., Benhalim, Y., Ong, D. C., Segal, S., Althoff, T., Czerwinski, M.. "Rethinking technology innovation for mental health: framework for multi-sectoral collaboration." Nature Mental Health; 2024; [pdf]
- Yoo, D. W., Woo, H., Pendse, S. R., Lu, N. Y., Birnbaum, M. L., Abowd, G. D., De Choudhury, M.. "Missed Opportunities for Human-Centered AI Research: Understanding Stakeholder Collaboration in Mental Health AI Research." Proceedings of the ACM on Human-Computer Interaction; 2024; [pdf]
- Pendse, S. R., Stapleton, L., Kumar, N., De Choudhury, M., Chancellor, S.. "Advancing a consent-forward paradigm for digital mental health data." Nature Mental Health; 2024
- Pendse, S. R., Kumar, N., De Choudhury, M.. "Quantifying the Pollan Effect: Investigating the Impact of Emerging Psychiatric Interventions on Online Mental Health Discourse." Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems; 2024
- Pendse, S. R., Massachi, T., Mahdavimoghaddam, J., Butler, J., Suh, J., Czerwinski, M.. "Towards Inclusive Futures for Worker Wellbeing." Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW; 2024
- Jaiswal, A., Wall, D. P., Washington, P.. "Challenges in the Differential Classification of Individual Diagnoses from Co-Occurring Autism and ADHD Using Survey Data." IEEE EMBS International Conference on Biomedical Health Informatics; 2024; [pdf]
- Jaiswal, A., Shah, A., Harjadi, C., Windgassen, E., Washington, P.. "Ethics of the Use of Social Media as Training Data for AI Models Used for Digital Phenotyping." JMIR Formative Research; 2024; [pdf]
- Jaiswal, A., Washington, P.. "Using #ActuallyAutistic on Twitter for Precision Diagnosis of Autism Spectrum Disorder: Machine Learning Study." JMIR Formative Research; 2024; [pdf]
- Washington, P.. "Digitally Diagnosing Multiple Developmental Delays Using Crowdsourcing Fused With Machine Learning." JMIR Research Protocols; 2024; [pdf]
- Li, J., Washington, P.. "A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study." JMIR AI; 2024; [pdf]
- Sarma, K. V.*, Hanss, K. E.*, Glowinski, A. L., Butte, A. J., Halls, A. J. M.. "Can Large Language Models Reason about Behavioral Health? Evaluating the Psychiatric Knowledge Base and Reasoning Capabilities of GPT-4." UCSF Health Services Research Symposium; 2024
- He, C. X., Sarma, K. V., Hu, R.. "The Ethics of Artificial Intelligence in Psychiatry: A Beginner's Exploration." APA Mental Health Services Conference; 2024
- Sarma, K. V., Hanss, K. E., Glowinski, A. L., Krystal, A., Halls, A. J. M., Butte, A. J.. "Can Large Language Model-based AI Reason about Behavioral Health? Preliminary Evaluation of a Decision Tree-Based LLM Algorithm for Psychiatric Case Diagnosis." ACNP 63rd Annual Meeting; 2024; PMID: 39549698; [pdf]
- Sarma, K. V., Hanss, K. E., Glowinski, A. L., Butte, A. J., Halls, A. J. M.. "Improving the Performance of LLM-Based Semi-Automated Psychiatric Case Diagnosis using Decision Tree-Based Prompting." AMIA Annual Meeting; 2024
- Sarma, K. V., Hanss, K. E., Elkin, D., Halls, A.. "AI vs. DSM -- Can It Make the Diagnosis? Measuring GPT-4's Psychiatric Knowledge and Reasoning." UCSF School of Medicine Leadership Retreat; 2024
- Hanss, K. E.*, Sarma, K. V.*, Saunders, R., Elkin, D.. "Grading the Machine: Assessing ChatGPT's Psychiatric Knowledge through Boards-Style Assessment." APA Annual Meeting; 2024
- Washington, P., Wall, D. P.. "A Review of and Roadmap for Data Science and Machine Learning for the Neuropsychiatric Phenotype of Autism." Annual Review of Biomedical Data Science; 2023; [pdf]