The Elephant in the PECARN/CHALICE/CATCH Room

A few months ago, I wrote about the main publication from this study group – a publication in The Lancet detailing a robust performance comparison between the major pediatric head injury decision instruments. Reading between the lines, as I mentioned then, it seemed as though the important unaddressed result was how well physician judgment performed – only 8.3% of the entire cohort underwent CT.

This, then, is the follow-up publication in Annals of Emergency Medicine focusing on the superiority of physician judgment. Just to recap, this study enrolled 18,913 patients assessed to have had a mild head injury. Of these, 160 had a clinically important traumatic brain injury (ciTBI) and 24 underwent neurosurgery. The diagnostic performance of these decision instruments is better detailed in the other article but, briefly, for ciTBI:

  • PECARN – ~99% sensitive, 52 to 59.1% specific
  • CHALICE – 92.5% sensitive, 78.6% specific
  • CATCH – 92.5% sensitive, 70.4% specific

These rules, given their specificity, would commit patients to CT scan rates of 20-30% in the case of CHALICE and CATCH, and then an observation or CT rate of ~40% for PECARN. But how did physician judgment perform?

  • Physicians – 98.8% sensitive, 92.4% specific

Which is to say, physicians missed two injuries – each detected a week later in follow-up for persistent headaches – but only performed CTs in 8.3% of the population. As I highlighted in this past month’s ACEPNow, clinical decision instruments are frequently placed on a pedestal based on their own performance characteristics in a vacuum, and rarely compared with clinician judgment – and, frequently, clinician judgment is as good or better. It’s fair to say these head injury decision instruments, depending on the prevalence of injury and the background level of advanced imaging, may actually be of little value.
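
To put these operating characteristics on a common footing, here is a minimal back-of-the-envelope sketch – my arithmetic, not the authors’ analysis – of how sensitivity and specificity translate into imaging rates and missed injuries at this cohort’s ciTBI prevalence.

```python
# Back-of-the-envelope sketch (not the authors' analysis): how sensitivity and
# specificity translate into imaging rates and missed injuries at this cohort's
# ciTBI prevalence. Specificities below are point estimates or range midpoints.

def rule_performance(sensitivity, specificity, prevalence, n=10_000):
    """Return (fraction flagged positive, injuries missed) per n patients."""
    injured = prevalence * n
    uninjured = n - injured
    true_positives = sensitivity * injured
    false_positives = (1 - specificity) * uninjured
    flagged = true_positives + false_positives   # committed to CT / observation
    missed = injured - true_positives            # injuries the rule does not flag
    return flagged / n, missed

prevalence = 160 / 18_913                        # ciTBI prevalence in this cohort
for name, sens, spec in [
    ("PECARN",     0.99,  0.55),                 # ~midpoint of the 52-59.1% range
    ("CHALICE",    0.925, 0.786),
    ("CATCH",      0.925, 0.704),
    ("Clinicians", 0.988, 0.924),
]:
    flagged, missed = rule_performance(sens, spec, prevalence)
    print(f"{name:10s} flags ~{flagged:.0%} of patients, misses ~{missed:.1f} per 10,000")
```

Reassuringly, the clinician row reproduces, roughly, the 8.3% imaging rate and the two missed injuries observed in the study, so the arithmetic hangs together.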

“Accuracy of Clinician Practice Compared With Three Head Injury Decision Rules in Children: A Prospective Cohort Study”
http://www.annemergmed.com/article/S0196-0644(18)30028-3/fulltext

Only Anesthesiology Knows Sedation

“These guidelines are intended for use by all providers who perform moderate procedural sedation and analgesia in any inpatient or outpatient setting …”

That is to say, effectively by fiat, if you perform procedural sedation, these guidelines apply to YOU.

This is a publication by the American Society of Anesthesiologists, and sponsored by various dental and radiology organizations. This replaces a 2012 version of this document – and it has changed for both better and worse.

Falling into the “better” column of this document, this guideline no longer perpetuates the myth of requiring a period of fasting prior to an urgent or emergent procedure. Their new recommendation:

“In urgent or emergent situations where complete gastric emptying is not possible, do not delay moderate procedural sedation based on fasting time alone”

However, some things are definitely “worse”. By far the largest problem with these guidelines – reflecting the exclusion of emergency medicine and critical care specialties from the writing or approving group – is their classification of propofol and ketamine as agents intended for general anesthesia. They specifically differentiate practice with these agents from the use of benzodiazepines or adjunctive opiates by stating:

“When moderate procedural sedation with sedative/analgesic medications intended for general anesthesia by any route is intended, provide care consistent with that required for general anesthesia.”

These guidelines do not go on to describe the care required for general anesthesia – and, obviously, we are not performing general anesthesia in the Emergency Department; I expect most hospitals do not credential their Emergency Physicians for it. The practical impact of these guidelines on individual health system policy is unclear, particularly in the context of decades of safe use of these medications by emergency physicians, but it’s certainly just one more pretentious obstacle to providing safe and effective care for our patients.

“Practice Guidelines for Moderate Procedural Sedation and Analgesia 2018”

http://anesthesiology.pubs.asahq.org/article.aspx?articleid=2670190

“The Newest Threat to Emergency Department Procedural Sedation”

https://www.ncbi.nlm.nih.gov/pubmed/29429580

Using PERC & Sending Home Pulmonary Emboli For Fun and Profit

The Pulmonary Embolism Rule-Out Criteria (PERC) have been both lauded and maligned, depending on which day the literature is perused. There are case reports of large emboli in PERC-negative patients, as well as reports of PE prevalence in PERC-negative cohorts as high as 5% – in contrast to the <1.8% point of equipoise targeted in its derivation. So, the goal here is to be the prospective trial to end all trials, and to most accurately describe the impact of PERC on practice and outcomes.
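
For orientation, here is a minimal sketch of the eight PERC criteria as conventionally stated – the field names are mine, and the trial’s exact operational definitions are in the manuscript.

```python
# Sketch of the eight PERC criteria as conventionally stated; field names are
# illustrative, and the trial's exact operational definitions are in the paper.

def perc_negative(age, heart_rate, spo2_room_air, hemoptysis, estrogen_use,
                  prior_vte, unilateral_leg_swelling, recent_surgery_or_trauma):
    """True when all eight rule-out criteria are satisfied (PERC-negative)."""
    return (
        age < 50
        and heart_rate < 100
        and spo2_room_air >= 95
        and not hemoptysis
        and not estrogen_use
        and not prior_vte
        and not unilateral_leg_swelling
        and not recent_surgery_or_trauma   # surgery or trauma within ~4 weeks
    )

# Example: a 34-year-old with normal vitals and no risk factors exits the work-up
# without D-dimer testing under a PERC-based strategy.
print(perc_negative(34, 88, 98, False, False, False, False, False))   # True
```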

This is a cluster-randomized trial at 14 Emergency Departments across France. Centers were randomized to either a PERC-based work-up strategy for PE, or a “conventional” strategy in which virtually every patient considered for PE was tested using D-dimer. Interestingly, these 14 centers also crossed over to the alternative algorithm approximately halfway through the study period, so every ED was exposed to both interventions – some using PERC first, and vice versa.

Overall, they recruited 1,916 patients across the two enrollment periods, and these authors focused on the 1,749 who received per-protocol testing and were not lost to follow-up. The primary outcome was any new diagnosis of venous thromboembolism at 3-month follow-up. This was their measure of, essentially, clinically important missed VTE upon exiting their algorithm. The headline result, in the per-protocol population: 1 patient was diagnosed with VTE in follow-up in the PERC group, compared with none in the control cohort. This met their criteria for non-inferiority, and, just at face value, a PERC-based strategy is clearly reasonable. There were 48 patients lost to follow-up, but given the overall prevalence of PE in this population, it is unlikely these lost patients would have affected the overall results.

There are a few interesting bits to work through from the characteristics of the study cohort. The vast majority of patients considered for the diagnosis of PE were “low risk” by either Wells or simplified Revised Geneva Score. However, 91% of those in the PERC cohorts were “low risk”, as compared to 78% in the control cohort – which, considering the structure of this trial, seems unlikely to have occurred by chance alone. In the PERC cohort, about half were not PERC-negative, and these patients – plus a few protocol violations – moved forward with D-dimer testing. In the conventional cohort, 99% were tested with D-dimer in accordance with their algorithm.

There were then, again, more odd descriptive results at this point. D-dimer testing (≥0.5 µg/mL) was positive in 343 of the PERC cohort and 471 of the controls. However, physicians only moved forward with CTPA in 38% of the PERC cohort and 46% of the conventional cohort. It is left entirely unaddressed why patients entered a PE rule-out pathway and ultimately never received a definitive imaging test after a D-dimer above threshold. For what it’s worth, the smaller number of patients undergoing evaluation for PE in the PERC cohort led to fewer diagnoses of PE, fewer downstream hospital admissions and anticoagulant prescriptions, and a shorter ED length of stay. The absolute numbers are small, but patients in the control cohort undergoing CTPA were more likely to have subsegmental PEs (5 vs. 1) – which, again, generally makes sense.

So, finally, what is the takeaway here? Should you use a PERC-based strategy? As usual, the answer is: it depends. First, it is almost certainly the case that a PERC-based algorithm is safe to use. Then, if your current approach is to carpet bomb everyone with D-dimer and act upon it, yes, you may see dramatic improvements in ED processes and resource utilization. However, as we see here, the prevalence of PE is so low that strict adherence to a PERC-based algorithm is still too clinically conservative. Many patients with elevated D-dimers did not undergo CTPA in this study – and, with three-month follow-up, they obviously did fine. Frankly, given the shifting gestalt relating to the work-up of PE, the best cut-off is probably not PERC, but simply stopping the work-up of most patients who are not intermediate- or high-risk.

“Effect of the Pulmonary Embolism Rule-Out Criteria on Subsequent Thromboembolic Events Among Low-Risk Emergency Department Patients: The PROPER Randomized Clinical Trial”
https://jamanetwork.com/journals/jama/fullarticle/2672630

EDACS vs. HEART – But Why?

The world has been obsessed over the past few years with the novelty of clinical decision rules for the early discharge of chest pain. After several years of battering the repurposed Thrombolysis in Myocardial Infarction (TIMI) score, the History, Electrocardiogram, Age, Risk factors and Troponin (HEART) score became ascendant, but there are several other candidates out there.

One of these is the Emergency Department Assessment of Chest pain Score (EDACS), which is less well-known, but has reasonable face validity. It does a good job identifying a “low-risk” cohort, but is more complicated than HEART. There is also a simplified version of EDACS that eliminates some of the complicated subtractive elements of the score. This study pits these various scores head-to-head, in the context of conventional troponin testing as well.

This is a retrospective review of 118,822 patients presenting to Kaiser Northern California Emergency Departments, narrowing the cohort to those whose initial Emergency Department evaluation was negative for acute coronary syndrome. The 60-day MACE rate (a composite of myocardial infarction, cardiogenic shock, cardiac arrest, and all-cause mortality) in this cohort was 1.9%, most of which was acute MI. Interestingly, these authors chose to present only the negative predictive value of their test characteristics, which means – considering such low prevalence – the ultimate rates of MACE in the low-risk cohorts defined by each decision instrument were virtually identical. Negative predictive values for all three scores depended primarily on the troponin cut-off used, and were ~99.2% for ≤0.04 ng/mL and ~99.5% for ≤0.02 ng/mL. The largest low-risk cohort was defined by the original EDACS rule, exceeding the HEART score’s low-risk classification by an absolute margin of about 10% of the total cohort, regardless of the troponin cut-off used.
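
As a rough illustration of what those figures mean in absolute terms – my arithmetic, not the authors’ – consider a notional 1,000-patient cohort:

```python
# Illustrative arithmetic only: what the ~10% larger low-risk cohort for EDACS
# and the reported negative predictive values imply per 1,000 ED chest pain patients.

n = 1_000
extra_low_risk_fraction = 0.10          # EDACS low-risk cohort ~10% larger (absolute)
npv_by_troponin_cutoff = {"<=0.04 ng/mL": 0.992, "<=0.02 ng/mL": 0.995}

print(f"~{extra_low_risk_fraction * n:.0f} more patients per 1,000 labeled low-risk by EDACS than by HEART")
for cutoff, npv in npv_by_troponin_cutoff.items():
    miss_rate = 1 - npv                 # expected 60-day MACE among those labeled low-risk
    print(f"Troponin {cutoff}: ~{miss_rate * n:.0f} MACE per 1,000 low-risk discharges")
```

In other words, at this prevalence the practical difference between the instruments is mostly how many patients get labeled low-risk, not how the low-risk patients fare.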

The editorial accompanying the article lauds these data as supporting the use of these tools for early discharge from the Emergency Department. However, this is an outdated viewpoint, particularly considering the data showing early non-invasive evaluations are of uncertain value. In reality, virtually all patients who have been ruled out for ACS in the ED can be discharged home, regardless of their risk of MACE. The value of these scores is probably less in determining who can be discharged than in helping triage patients for closer primary care or specialist follow-up. Then, individualized plans can be developed for optimal medical management, or for assessment of the adequacy of the coronary circulation, to prevent whatever MACE is preventable.

“Performance of Coronary Risk Scores Among Patients With Chest Pain in the Emergency Department”
http://www.onlinejacc.org/content/71/6/606

“Evaluating Chest Pain in the Emergency Department: Searching for the Optimal Gatekeeper.”
http://www.onlinejacc.org/content/71/6/617

The qSOFA Story So Far

What do you do when another authorship group performs the exact same meta-analysis and systematic review you’ve been working on – and publishes first? Well, there really isn’t much choice – applaud their great work and learn from the experience.

This is primarily an evaluation of the quick Sequential Organ Failure Assessment (qSOFA), with a little of the old Systemic Inflammatory Response Syndrome (SIRS) criteria thrown in for contextual comparison. The included studies enrolled patients from Intensive Care Units, hospital wards, and Emergency Departments. Their primary outcome was mortality, reported in these studies mostly as in-hospital mortality, but also as 28-day and 30-day mortality.

The quick synopsis of their results, pooling 38 studies and 383,333 patients – mostly retrospective, and mostly ICU cohorts:

  • qSOFA is not terribly sensitive, particularly in the settings in which it is most relevant. Their reported overall sensitivity of 60.8% is inflated by its performance in ICU patients, and in ED patients sensitivity is only 46.7%.
  • Specificity is OK, at 72.0% overall and 81.3% in the ED. However, the incidence of mortality from sepsis is usually low enough in a general ED population that the positive predictive value will be fairly weak (see the sketch after this list).
  • In their comparative cohort for SIRS – frankly, probably irrelevant, as SIRS is already well-described – the expected results of higher sensitivity and lower specificity were observed.
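
To make that positive predictive value point concrete, here is a minimal sketch using the pooled ED estimates above and an assumed – purely hypothetical – 3% mortality prevalence among ED patients with suspected infection.

```python
# Minimal sketch: positive predictive value from sensitivity, specificity, and
# prevalence. The 3% mortality prevalence is a hypothetical round number, not a
# figure from this meta-analysis.

def positive_predictive_value(sensitivity, specificity, prevalence):
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# qSOFA in the ED, per the pooled estimates above
print(f"PPV ~{positive_predictive_value(0.467, 0.813, 0.03):.0%}")   # roughly 7%
```

That is, even among qSOFA-positive ED patients, the large majority will survive – a weak signal on which to trigger a resource-intensive bundle.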

Their general conclusion, with which I generally agree, is that qSOFA is not an appropriate general screening tool. They did not add much from a further editorial standpoint – so, rather than let our own draft manuscript for this same meta-analysis and systematic review languish unseen, here is an abridged version of the Discussion section of our manuscript, written by myself, Rory Spiegel, and Jeremy Faust:

This analysis demonstrates qualitatively similar findings to those observed in the original derivation study performed by Seymour et al. We find our pooled AUC, however, to be lower than the 0.81 reported in their derivation and validation cohort, as well as the 0.78 reported in two external validation cohorts. The meaning of this difference is difficult to interpret, as the clinical utility of this instrument is derived from its use as a binary cut-off, rather than an ordinal AUC. The sensitivity and specificity from our primary analysis compare favorably to their reported 55% and 84%, respectively. We also found qSOFA’s predictive capabilities remained robust when exposed to our sensitivity analyses. When only studies at low risk for bias were included, qSOFA’s performance improved.

While our evaluation of SIRS is limited by restricting the comparison solely to those studies which contemporaneously reported qSOFA, our results are broadly consistent with results previously reported. The SIRS criteria at the commonly used cut-off benefits from superior sensitivity for mortality in those with suspected infection, while its specificity is clearly lacking due to its impaired capability to distinguish between clinically important immune system dysregulation and normal host responses to physiologic stress. The important discussion, therefore, is whether and how to incorporate each of these tools – and others, such as the Modified Early Warning Score or National Early Warning Score – into clinical practice, guidelines, and quality measures.

The current approach to sepsis revolves around the perceived significant morbidity and mortality associated with under-recognized sepsis, favoring screening tools whose purpose is minimizing missed diagnoses. Current sepsis algorithms typically rely upon SIRS, depending on its maximal catchment at the expense of over-triage. Such maximal catchment almost certainly represents a low-value approach to sepsis, considering the in-hospital mortality of patients in our cohort with ≥2 SIRS criteria is not meaningfully different from the overall mortality of the entire cohort. The subsequent fundamental question, however, is whether qSOFA and its role in the new sepsis definitions provide a structure for improvement.

Using qSOFA as designed with its cut-off of ≥2, it should be clear its sensitivity does not support its use as an early screening tool, despite its simplicity and exclusion of laboratory measures. However, in a cohort with suspected infection and some physiologic manifestations of sepsis, e.g., SIRS, the true value of qSOFA may be in prioritizing a subgroup for early clinical evaluation. In a healthcare system with unlimited resources, it may be feasible to give each patient uncompromising evaluation and care. Absent that, we must hew towards an idealized approach, where our resources are directed towards those highest-yield patients for whom time-sensitive interventions modify downstream outcomes.

Less discussed are the direct, patient-oriented harms resulting from falsely-positive screening tools and over-enrollment into sepsis bundles. Recent data suggests benefits from shorter time-to-antibiotics administration intervals are realized primarily in critically ill patients. As such, utilization of overly sensitive tools, such as the SIRS criteria, would lead to over-triage and over-treatment, leading to potential iatrogenic harms in excess of net benefits. These harms include effects on individual and community patterns of antibiotic resistance, as exposure to broad-spectrum antibiotics leads to induction of extended-spectrum beta-lactamase resistance in gram-negative pathogens or vancomycin- and carbapenem-resistance in enterococci. Unnecessary antibiotic exposures lead to excess cases of C. difficile infections. The aggressive fluid resuscitation mandated by sepsis bundles leads to metabolic derangement and potential respiratory impairment. Further research should assess the extent of these harms, and in what measure they counterbalance those benefiting from time-sensitive interventions.

This meta-analysis has several limitations. First, we were limited by the relative dearth of high-quality prospective data; most of the studies included in our analysis were retrospective. Second, we restricted our prognostic analyses to mortality alone, rather than diagnosis of sepsis. We chose to analyze only mortality because of competing sepsis definitions among expert bodies and government-issued guidelines. Among them, however, mortality is a common feature, the most objective metric, and manifestly the most important patient-centered outcome. Our analysis would not capture other important sequelae of sepsis, including amputation, loss of neurologic and/or independent function, chronic pain, and prolonged psychiatric effects of substantial critical illness. Third, we do not know whether patients included in these studies were septic on presentation, or developed sepsis later in their hospitalization. This may degrade the accuracy assessment of both SIRS and qSOFA. Fourth, while we know that qSOFA alone may miss some cases of sepsis that SIRS might detect, we do not know how many would, in reality, have been deprived of antibiotics and other necessary treatments. In other words, the fate of “qSOFA negative” patients who were evaluated and treated by physicians qualified to detect and treat critical illness via clinical acumen is not known; nor should it be presumed that all such patients would necessarily have been deprived of timely treatment. Our analysis and comparison of SIRS is admittedly incomplete, and not the most reliable estimate of its diagnostic characteristics, but is provided for incidental comparison.

The prudent clinical role for qSOFA, however, is as yet undefined, and these data do not offer insight regarding its superiority to clinician judgment for determining a cohort at greatest risk for poor outcomes. Compared with SIRS, at least, those patients identified by qSOFA likely better represent the subset of patients for whom aggressive early treatment confers a particular advantage, and may drive high-value care in the sepsis arena. Future research should assist clinicians in further individualizing initial treatment of sepsis for those stratified to differing levels of risk for poor outcome, as well as to account for the iatrogenic harms and system costs.

“Prognostic Accuracy of the Quick Sequential Organ Failure Assessment for Mortality in Patients With Suspected Infection: A Systematic Review and Meta-analysis”
http://annals.org/aim/fullarticle/2671919/prognostic-accuracy-quick-sequential-organ-failure-assessment-mortality-patients-suspected

When Seizures Return

This one isn’t precisely hot off the press, but, having just discovered it, it’s hot to me!

This study aims to inform the guidance we provide to families after a child presents with a first-time, unprovoked seizure. Interestingly enough, the data for this analysis is dredged back up from a prospective cohort study from 2005 to 2007, in which patients with first-time seizures were being evaluated for abnormal neuroimaging. However, following discharge from the hospital or Emergency Department, patients also received short- and long-term telephone follow-up.

There were 475 patients enrolled in the original study, and differing numbers were appropriate for inclusion at their various timeframes of follow-up, depending on whether anti-epileptic therapy was started, or whether follow-up could be obtained. All told, seizure recurrence rates were:

  • 48 hours – 21/38 (5.4%)
  • 14 days – 51/359 (14.2%)
  • 4 months – 102/335 (30.4%)

These are extremely non-trivial numbers, and they surprised me. Risk factors associated with increased recurrence were recurrent seizures at the initial presentation, younger age (<3 years), and the presence of focal neurologic findings on initial examination. Regardless, even absent any of these predictors, the incidence of subsequent seizure is certainly high enough that parents should be counseled to arrange prompt neurology evaluation in follow-up.
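
As a quick check on how precise those recurrence estimates are, here is a small sketch – my calculation from the counts above, not figures from the paper – computing Wilson 95% confidence intervals for the 14-day and 4-month proportions as reported.

```python
# Wilson 95% confidence intervals for the reported recurrence proportions
# (my calculation from the counts above, not figures from the paper).
from math import sqrt

def wilson_ci(events, n, z=1.96):
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

for label, events, n in [("14 days", 51, 359), ("4 months", 102, 335)]:
    lo, hi = wilson_ci(events, n)
    print(f"{label}: {events}/{n} = {events/n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```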

“Early Recurrence of First Unprovoked Seizures in Children”

https://www.ncbi.nlm.nih.gov/pubmed/29105207