
ARTIFICIAL INTELLIGENCE

Shifting From Dr Google to Dr GPT: The Potential Impact on Patient Safety of Changing e-Providers

By: Stacy Loeb, MD, MSc, PhD (Hon), New York University Langone Health and Manhattan Veterans Affairs, New York | Posted on: 05 Jan 2024

Over 90% of US adults use the internet, and there is a substantial amount of online content about health topics. Unfortunately, serious limitations have been identified in urological information on the internet. Major problems include a scarcity of content written at the recommended 6th-grade reading level for consumer health information and a high prevalence of circulating misinformation. For example, in a series of studies evaluating information about prostate cancer on Instagram and TikTok, we reported misinformation in 40% to 41% of the content containing objective information.1,2 Even the websites of National Cancer Institute–designated cancer centers provided, on average, sufficient information to answer only 19% of key questions for prostate cancer decision-making.3 These issues are not unique to prostate cancer: studies have shown a substantial amount of poor-quality content about a range of benign and malignant urological conditions across different online platforms.4-6 This leaves a lot of room for improvement over the “care” that Dr Google has been providing our patients to date.

The key question is whether Dr ChatGPT can improve upon this and provide better advice to our patients. Our group has published several studies on the quality of consumer health information from ChatGPT and other artificial intelligence (AI) chatbots. First, we examined information about the most common urological cancers (prostate, bladder, kidney, and testicular cancer) from ChatGPT, Perplexity, Chat Sonic, and Microsoft Bing AI.7 Using the top 5 Google search queries about each cancer as prompts, we found that AI chatbot responses were generally good quality (median score of 4 out of 5 on the validated DISCERN instrument) and lacked misinformation. However, actionability of the responses was poor (median actionability score of 40% out of 100% on the validated Patient Education Materials Assessment Tool [PEMAT] instrument).

Using similar methods, we compared information from the 4 AI chatbots (ChatGPT, Perplexity, Chat Sonic, and Microsoft Bing AI) related to the most common 5 cancers in the US (skin, lung, breast, colorectal, and prostate cancer).8 The top 5 Google search queries about each cancer were used as prompts. The quality of text responses generated by the AI chatbots was high (median score of 5 out of 5 on the validated DISCERN instrument); however, actionability was poor (median score of 20% out of 100% on the validated PEMAT instrument) and responses were written at a college reading level.

More recently, we examined information about erectile dysfunction from ChatGPT, Perplexity, Chat Sonic, and Microsoft Bing AI.9 Using the top 5 Google search queries and headings from the National Institute of Diabetes and Digestive and Kidney Diseases website as inputs, we found that the quality of information was high (median score of 4 out of 5 on the validated DISCERN instrument) but actionability was low (median score of 20% out of 100% on the validated PEMAT instrument) and responses were written at a median Flesch-Kincaid grade level of 14.

Similarly, Davis et al examined the responses of ChatGPT to 18 patient questions about signs/symptoms or treatment for benign, oncologic, and emergency urology topics.10 Overall, the majority of responses (77.8%) were deemed appropriate; however, the information was presented at a mean grade level of 13.5.
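The reading-grade figures reported in these studies are typically computed with the Flesch-Kincaid grade-level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. As an illustration only, a minimal Python sketch of that formula is below; the vowel-group syllable counter is a naive assumption for brevity, whereas published readability tools use dictionary-based or exception-aware syllable counting:

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    # Split sentences on terminal punctuation; keep only non-empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def count_syllables(word: str) -> int:
        # Naive heuristic: one syllable per group of consecutive vowels.
        # Real readability tools handle silent 'e', diphthongs, etc.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) \
        + 11.8 * (syllables / len(words)) - 15.59
```

On this scale, consumer health text should score around 6, while the chatbot responses discussed above scored at roughly grade 13 to 14, i.e., college level.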

These preliminary findings suggest that ChatGPT and other AI chatbots may provide higher quality information than many other online sources and appear less likely to spread misinformation. However, the information is not readily actionable and is written above the recommended reading level for consumer health information. Therefore, we have yet to identify the optimal “e-provider” with high-quality information that is also actionable and understandable for lay health consumers. In the meantime, it is prudent to provide patients with a list of vetted resources for additional information about their condition.

  1. Xu AJ, Taylor J, Gao T, Mihalcea R, Perez-Rosas V, Loeb S. TikTok and prostate cancer: misinformation and quality of information using validated questionnaires. BJU Int. 2021;128(4):435-437.
  2. Xu AJ, Myrie A, Taylor JI, et al. Instagram and prostate cancer: using validated instruments to assess the quality of information on social media. Prostate Cancer Prostatic Dis. 2022;25(4):791-793.
  3. Dulaney C, Barrett OC, Rais-Bahrami S, Wakefield D, Fiveash J, Dobelbower M. Quality of prostate cancer treatment information on cancer center websites. Cureus. 2016;8(4):e580.
  4. Loeb S, Taylor J, Borin JF, et al. Fake news: spread of misinformation about urological conditions on social media. Eur Urol Focus. 2020;6(3):437-439.
  5. Dubin JM, Aguiar JA, Lin JS, et al. The broad reach and inaccuracy of men’s health information on social media: analysis of TikTok and Instagram. Int J Impot Res. 2022;1-5.
  6. Kanner J, Waghmarae S, Nemirovsky A, Wang S, Loeb S, Malik R. TikTok and YouTube videos on overactive bladder exhibit poor quality and diversity. Urol Pract. 2023;10(5):493-500.
  7. Musheyev D, Pan A, Loeb S, Kabarriti AE. How well do artificial intelligence chatbots respond to the top search queries about urological malignancies? Eur Urol. 2023;S0302-2838(23)02972-X.
  8. Pan A, Musheyev D, Bockelman D, Loeb S, Kabarriti AE. Assessment of artificial intelligence chatbot responses to top searched queries about cancer. JAMA Oncol. 2023;9(10):1437-1440.
  9. Pan A, Musheyev D, Loeb S, Kabarriti AE. Quality of erectile dysfunction information from ChatGPT and other artificial intelligence chatbots. BJU Int. 2023;doi:10.1111/bju.16209.
  10. Davis R, Eppler M, Ayo-Ajibola O, et al. Evaluating the effectiveness of artificial intelligence–powered large language models application in disseminating appropriate and readable health information in urology. J Urol. 2023;210(4):688-694.
