
JU INSIGHT Artificial Intelligence-powered Large Language Models to Disseminate Health Information in Urology

By: Ryan J. Davis, BS; Michael B. Eppler, BA; Oluwatobiloba Ayo-Ajibola, BS; Jeffrey C. Loh-Doyle, MD; Jamal Nabhani, MD; Mary Samplaski, MD; Inderbir S. Gill, MD; Giovanni E. Cacciamani, MD — Keck School of Medicine, University of Southern California, Los Angeles, and Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles | Posted on: 25 Oct 2023

Davis R, Eppler M, Ayo-Ajibola O, et al. Evaluating the effectiveness of artificial intelligence-powered large language models application in disseminating appropriate and readable health information in urology. J Urol. 2023;210(4):688-694.

Study Need and Importance

In 2022, version 3.5 of ChatGPT, an artificial intelligence-powered large language model (LLM), was released. Its adoption surged immediately, and given that patients most commonly use the Internet as a primary source of medical information, there is reason to believe they will turn to ChatGPT for medical information as well. Urological patients may be particularly likely to use ChatGPT, as conditions requiring urological care are broad ranging, with treatment options spanning office procedures to major open surgery. No study had yet assessed ChatGPT’s urological advice, so ours aimed to do so by evaluating the appropriateness, readability, and other qualities of ChatGPT-generated urological information.

What We Found

The Figure contains a flowchart of the study methodology. Fourteen of 18 (77.8%) responses were deemed appropriate. No significant differences were found between treatment- and symptom-related questions, nor among oncologic, benign, and treatment-related questions. The most common reason urologist-graders gave for low scores was missing information. The most concerning omission was the failure to include acute urinary retention in the differential diagnosis for a classic presentation of that condition. The mean (SD) Flesch Reading Ease score was 35.5 (10.2), and the mean Flesch-Kincaid Reading Grade Level was 13.5 (1.74), indicating college-level readability of responses.
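For readers unfamiliar with these metrics, the short sketch below illustrates how Flesch Reading Ease and Flesch-Kincaid Grade Level scores are derived from word, sentence, and syllable counts; the syllable heuristic and example sentence are illustrative assumptions, not the tool used in the study.

# Sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level
# formulas. The vowel-run syllable counter is a rough approximation only.
import re

def count_syllables(word):
    # Approximate syllables as runs of vowels; drop one for a trailing silent 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def readability(text):
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    w, s = len(words), sentences
    ease = 206.835 - 1.015 * (w / s) - 84.6 * (syllables / w)
    grade = 0.39 * (w / s) + 11.8 * (syllables / w) - 15.59
    return ease, grade

ease, grade = readability(
    "Benign prostatic hyperplasia is a noncancerous enlargement of the prostate."
)
print(f"Flesch Reading Ease: {ease:.1f}, Flesch-Kincaid Grade Level: {grade:.1f}")

Lower Reading Ease scores and higher Grade Level scores indicate harder text; the mean scores reported above correspond to college-level material, well above the sixth-to-eighth-grade level commonly recommended for patient education.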

Figure. Flowchart of study methodology. We posed as laypeople querying ChatGPT for medical information about urological conditions in terms of diagnosis, treatment, and referral. Our team evaluated the completeness, accuracy, and readability of this information.
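As a rough illustration of the kind of query depicted in the Figure, the sketch below submits a layperson-style urological question to an OpenAI chat model through the official Python client; the model name and prompt are assumptions for demonstration only, and the study itself queried the public ChatGPT interface rather than the API.

# Illustrative sketch only: poses a layperson-style urological question to an
# OpenAI chat model via the official Python client. The model and prompt are
# assumed for demonstration; the study used the ChatGPT web interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # assumed model, chosen to mirror ChatGPT version 3.5
    messages=[
        {"role": "user",
         "content": "I suddenly cannot urinate and my lower abdomen hurts. What should I do?"},
    ],
)

print(response.choices[0].message.content)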

Limitations

We used a small sample of questions and physician-graders. ChatGPT’s responses to the same question vary each time it is asked. ChatGPT is only one of several artificial intelligence-powered LLMs available. Finally, ChatGPT was not designed specifically with medical use in mind.

Interpretation for Patient Care

While the advent of LLMs is an exciting prospect for bridging the information gap between urology patients and time-constrained physicians, these models may give inappropriate advice, inappropriately triage emergent medical situations, and write at too high a reading level for the average patient. Patients should use them with caution.
