Frank Mazza
Schulich School of Medicine & Dentistry
Timely and accurate diagnosis is essential for neurosurgical outcomes. Despite this, wait times for neurosurgeons in Southwestern Ontario are the longest among all subspecialties and are expected to increase further [1]. AI-driven decision-support tools like ChatGPT could play a critical role in improving case prioritization, enhancing both triage efficiency and patient outcomes. ChatGPT's use in healthcare is expanding, especially in medical triaging workflows. ChatGPT has passed the American Board of Neurological Surgery examination, accurately interpreted radiological reports, assisted in neurosurgical spine triaging, and supported adjuvant therapy decisions. Although one study evaluated its performance on 20 neuro-oncology cases, systematic evaluation across diverse neurosurgical presentations remains limited. Given the high volume of neurosurgical cases at Windsor Regional Hospital (WRH), it is vital to assess emerging technologies that can enhance triage accuracy.
This study aims to comprehensively evaluate ChatGPT's performance in identifying case urgency and generating differential diagnoses using a large, representative patient sample from WRH's urgent neurosurgical clinic (n = 200). We will first determine whether ChatGPT accurately prioritizes case urgency using standard patient vignettes and triaging criteria. Next, we will compare ChatGPT's recommended initial investigations and differential diagnoses with those of neurosurgeons. Finally, we will examine how performance varies with case characteristics (location, underlying cause) and patient demographics (age, sex) to pinpoint areas for improvement before implementation in WRH triage workflows. These findings will guide the integration of AI-driven tools to optimize resource allocation and ultimately enhance neurosurgical diagnosis and treatment at WRH.
This work will determine whether ChatGPT is especially useful in high-volume emergency neurosurgical settings, as other investigations have suggested. We expect ChatGPT's performance to decline but remain high (>80%) for more complex cases, such as those with multiple comorbidities or non-specific symptoms, while still recommending relevant initial investigations for diagnosis. Ultimately, these findings will provide a thorough evaluation of AI-assisted tools for neurosurgical workflows.
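As an illustration only, the comparison of ChatGPT's urgency ratings against neurosurgeon ratings, overall and by subgroup, could be scored along the following lines. This is a minimal sketch, not the study's analysis plan: the field names (`model_urgency`, `surgeon_urgency`, `sex`), the two-level urgency labels, and the choice of raw accuracy plus Cohen's kappa as agreement measures are all assumptions made for the example.

```python
from collections import defaultdict


def accuracy(pred, ref):
    """Fraction of cases where the model label equals the reference label."""
    if len(pred) != len(ref) or not ref:
        raise ValueError("pred and ref must be non-empty and of equal length")
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)


def cohens_kappa(pred, ref):
    """Chance-corrected agreement between two raters over the same cases."""
    n = len(ref)
    p_observed = accuracy(pred, ref)
    labels = set(pred) | set(ref)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    p_expected = sum((pred.count(l) / n) * (ref.count(l) / n) for l in labels)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


def accuracy_by_group(cases, key):
    """Accuracy within each subgroup defined by a case field (hypothetical names)."""
    groups = defaultdict(lambda: ([], []))
    for case in cases:
        preds, refs = groups[case[key]]
        preds.append(case["model_urgency"])    # assumed field name
        refs.append(case["surgeon_urgency"])   # assumed field name
    return {group: accuracy(p, r) for group, (p, r) in groups.items()}
```

Calling `accuracy_by_group(cases, "sex")` on a list of case dictionaries would give one accuracy value per subgroup, which is the kind of breakdown the demographic and case-characteristic comparisons call for. Kappa is included because raw accuracy can look high when most cases share one urgency level.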
