A recent study led by researchers at Mount Sinai and Rabin Medical Center in Israel, together with colleagues, found that advanced artificial intelligence (AI) models can make errors when reasoning about medical ethics.
The researchers modified several well-known medical ethics scenarios to test whether large language models (LLMs) can be trusted in medical situations. According to the study, published in npj Digital Medicine, "LLMs such as ChatGPT-o1 display subtle blind spots in complex reasoning tasks."
When testing the LLMs, the scientists found that the systems often failed to notice that the scenarios had been altered and instead defaulted to the familiar answers to the original versions.
Co-senior corresponding author Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health, Director of the Hasso Plattner Institute for Digital Health, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and Chief AI Officer of the Mount Sinai Health System, commented in a press release: "Our findings don't suggest that AI has no place in medical practice, but they do highlight the need for thoughtful human oversight, especially in situations that require ethical sensitivity, nuanced judgment, or emotional intelligence."