
Google AI chatbots pass ophthalmology board practice exam


In a new study published in Eye, researchers investigated the potential of artificial intelligence (AI) chatbots by evaluating their performance on an ophthalmology board certification practice exam.

Give me some background first.

AI chatbots have become a focal point in the eye care industry. In a previous study, ChatGPT scored 46% on a board certification practice test and was deemed by those researchers to be insufficient as an aid for board certification preparation.

The researchers in this study aimed to broaden the understanding of the potential of AI in a medical context.

Now, talk about the study.

The researchers utilized 150 text-based multiple-choice questions sourced from Eye Quiz, a platform that contains ophthalmology board certification examination practice questions.

The researchers evaluated Gemini and Bard for:

  • Accuracy
  • Response length
  • Response time
  • Provision of explanations

Was there a secondary analysis?

Yes! The investigators conducted this analysis using a virtual private network (VPN) to test Bard and Gemini from Vietnam, Brazil, and the Netherlands to see how their performance compared with the U.S. versions.

So what were the U.S. findings?

In the U.S. analysis, both Bard and Gemini achieved 71% accuracy across the 150 questions.

It was noted that:

  • Bard excelled in orbital and plastic surgery
  • Gemini was more effective than Bard in general ophthalmology and various sub-specialties

And the secondary analysis?

During the secondary analysis using a VPN, the researchers found that:

  • Bard achieved 67% accuracy when used from Vietnam
    • 32 questions (21%) were answered differently from the U.S. version
  • Gemini achieved 74% accuracy when used from Vietnam
    • 23 questions (15%) were answered differently from the U.S. version
  • Gemini performed slightly worse than in the U.S. when used from:
    • Brazil (68%)
    • The Netherlands (65%)

Expert opinion?

The study authors stated that Gemini and Bard had “an acceptable performance in responding to ophthalmology board examination practice questions.”

They added that, during the investigation, the chatbots provided confident explanations even when giving incorrect answers.

Take home.

This study highlights AI tools' current capabilities and their vast potential in eye care.

Future studies could expand upon this research to create an even broader understanding of how AI tools could be utilized.
