Automated diagnostic tools built around symptom checklists – "symptom checkers" – have been evaluated in medicine before. The results were bleak: symptom checkers listed the correct diagnosis first only 34% of the time, and included it in the top three only 51% of the time.
However, when these authors published their prior study, they presented those findings in a vacuum: the machines performed poorly, but how did they compare against human diagnosticians? In this short research letter, the same authors compare symptom-checker performance against clinicians contributing to a sort of crowdsourced medical diagnosis system.
And, at least for a while longer, the human machine is superior to the machine machine. Humans reading the same vignettes placed the correct diagnosis first 72.1% of the time, and included it in the top three 84.3% of the time.
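For readers unfamiliar with the metric, both sets of figures are simply top-k accuracy over ranked differentials. Here is a minimal Python sketch of that scoring, purely illustrative with made-up cases – this is not the paper's data or code:

```python
# Illustrative only: scoring top-1 / top-3 diagnostic accuracy, assuming each
# case pairs a true diagnosis with a ranked differential (hypothetical data).

def top_k_accuracy(cases, k):
    """Fraction of cases whose true diagnosis appears in the top k of the ranked list."""
    hits = sum(1 for true_dx, ranked in cases if true_dx in ranked[:k])
    return hits / len(cases)

# Hypothetical example cases: (correct diagnosis, ranked differential).
cases = [
    ("appendicitis", ["appendicitis", "gastroenteritis", "ovarian cyst"]),
    ("migraine", ["tension headache", "migraine", "sinusitis"]),
    ("pneumonia", ["bronchitis", "asthma", "GERD"]),
]

print(f"top-1 accuracy: {top_k_accuracy(cases, 1):.1%}")  # 33.3%
print(f"top-3 accuracy: {top_k_accuracy(cases, 3):.1%}")  # 66.7%
```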
With time and further advances in natural language processing and machine learning, I expect automated diagnosis engines to catch up with humans – but we're not there yet!
“Comparison of Physician and Computer Diagnostic Accuracy.”
https://www.ncbi.nlm.nih.gov/pubmed/27723877