EAN 2025 | A study comparing the accuracy of Chat-GPT-4o versus neurologists in diagnosing polyneuropathy

So in general, neuropathy is a very common condition that affects up to 1% of the general population, and its prevalence increases with age. Neuropathies are also frequently complex conditions, so diagnosis and management are often referred to specialized clinicians. For this reason, there is often a high rate of misdiagnosis and diagnostic delay, especially in rural or underserved areas, and this is a current issue in our everyday clinical practice as peripheral nerve disease specialists.

At the same time, of course, AI, as we all know, has undergone a tremendous change: these tools are increasingly available, increasingly powerful, and increasingly able to show clinical reasoning in their responses. So the aim of our study was to assess how effectively these tools can help clinicians, both specialized and non-specialized in peripheral neuropathies, in evaluating real-life cases.

In order to do so, we collected 100 real-life clinical cases from tertiary care hospitals and constructed text-based clinical summaries in a standardized fashion. For each case, we prompted ChatGPT first. We used a specialized prompting technique called zero-shot chain-of-thought, which is known to increase the reasoning abilities of ChatGPT, and we recorded its output as a leading diagnosis, a differential diagnosis, and possible confirmatory tests for management of the neuropathy. At the same time, we enrolled a panel of 36 neurologists from 10 different countries, who reviewed the same clinical cases and provided comparable outputs. The study was designed so that, after giving their first diagnostic impression, they could also review the GPT output and then revise their diagnosis on that basis.

What we found was very interesting. First, ChatGPT is less powerful than specialists, both in terms of diagnostic capabilities and management capabilities, so specialists still hold the best performance metrics. What we also observed, which is very interesting, is that non-specialized clinicians actually perform worse than ChatGPT in terms of differential diagnosis accuracy and confirmatory test accuracy. They also revised their diagnosis in 21% of the cases after evaluating the GPT output, and their performance increased with GPT's help. We also performed an error analysis and saw that the main mistakes of GPT are due to flawed deduction and also some hallucination.

So our general conclusion is that this tool may already be effective in helping us bridge the gap in knowledge between non-specialized and specialized doctors, in order to hopefully streamline the diagnostic process. A non-specialized clinician faced with a complex case of neuropathy may seek the help of a large language model such as ChatGPT to be provided with differentials and possible management options, in order to increase their performance and hopefully get to the diagnosis faster, with the help of the specialist, of course, and in this way improve patient outcomes. We think it's a very interesting study; it's the first time this has been recorded in this kind of disease, and we think it may also be a model for other, more complex diseases from other fields.
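
For readers unfamiliar with the zero-shot chain-of-thought technique mentioned above, the following is a minimal Python sketch of how such a prompt might be assembled for one case summary. It is an illustration only: the study's actual prompt wording is not given in the talk, so the trigger phrase, the prompt layout, and the names `COT_TRIGGER`, `REQUESTED_OUTPUTS`, and `build_zero_shot_cot_prompt` are all assumptions. Only the three requested outputs come from the speaker's description.

```python
# Trigger phrase commonly used for zero-shot chain-of-thought prompting.
# Assumption: the study's real wording is not stated in the talk.
COT_TRIGGER = "Let's think step by step."

# The three outputs the speaker says were recorded for each case.
REQUESTED_OUTPUTS = (
    "leading diagnosis",
    "differential diagnosis",
    "confirmatory tests",
)


def build_zero_shot_cot_prompt(case_summary: str) -> str:
    """Build a prompt with no worked examples (zero-shot) that ends
    with a reasoning trigger (chain-of-thought)."""
    summary = case_summary.strip()
    if not summary:
        raise ValueError("case_summary must not be empty")
    # One bullet per requested output.
    requested = "\n".join(f"- {item}" for item in REQUESTED_OUTPUTS)
    return (
        "Clinical case summary:\n"
        f"{summary}\n\n"
        "After reasoning, please provide:\n"
        f"{requested}\n\n"
        f"{COT_TRIGGER}"
    )


if __name__ == "__main__":
    # Placeholder text only; no real or invented patient data.
    print(build_zero_shot_cot_prompt("<standardized case summary goes here>"))
```

The sketch stops at building the prompt text. Sending it to a model and scoring the answers against the final diagnosis would be separate steps that the talk does not describe in detail.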

This transcript is AI-generated. While we strive for accuracy, please verify this copy with the video.

More from Alberto De Lorenzo
