When EHR Interventions Succeed … and Fail

This is a bit of a fascinating article with a great deal to unpack – and rightly published in a prominent journal.

The brief summary – this is a “pragmatic”, open-label, cluster-randomized trial in which a set of interventions designed to increase guideline-concordant care were rolled out via electronic health record tools. These interventions were further supported by “facilitators”, persons assigned to each practice in the intervention cohort to support uptake of the EHR tools. In this specific study, the underlying disease state was the triad of chronic kidney disease, hypertension, and type II diabetes. Each of these disease states has well-defined pathways for “optimal” therapy and escalation.

The most notable feature of this trial is the simple, negative topline result – rollout of this intervention had no reliably measurable effect on patient-oriented outcomes relating to disease progression or acute clinical deterioration. Delving below the surface provides a number of insights worthy of comment:

  • The authors could have easily made this a positive trial by having the primary outcome as change in guideline-concordant care, as many other trials have done. This is a lovely example of how surrogates for patient-oriented outcomes must always be critically appraised for the strength of their association.
  • The entire concept of this trial is likely passively traumatizing to many clinicians – being bludgeoned by electronic health record reminders and administrative nannying to increase compliance with some sort of “quality” standard. Despite all these investments, alerts, and nagging – patients did no better. As above, since many of these trials simply measure changes in behavior as their endpoints, it likely leaves many clinicians feeling sour seeing results like these where patients are no better off.
  • The care “bundle” and its lack of effect size is notable, although it ought to be noted the patient-oriented outcomes here for these chronic, life-long diseases are quite short-term. The external validity of findings demonstrated in clinical trials frequently falls short when generalized to the “real world”. The scope of the investment here and its lack of patient-oriented improvement is a reminder of the challenges in medicine regarding evidence of sufficient strength to reliably inform practice.

Not an Emergency Medicine article, per se, but certainly describes the sorts of pressures on clinical practice pervasive across specialties.

“Pragmatic Trial of Hospitalization Rate in Chronic Kidney Disease”
https://www.nejm.org/doi/full/10.1056/NEJMoa2311708

The Cost of “Quality”

In case you missed this beautiful little article, it’s worth re-highlighting regarding the paradoxical “cost” of “quality”.

In theory, high-quality care is its own reward. Timely actions and interventions, thoughtful and thorough evaluations, and appropriate guideline adherence when applicable are all goals with reasonable face validity for healthcare delivery. Competing incentives, however, coupled with time pressures, erode some of the natural inclination towards ideal care. Thus, “quality” metrics and goals were created, with the best of intentions, to nudge clinicians and health systems towards better care.

Unfortunately, the siren song of “quality” has begotten a locust horde of metrics from all manner of organizations. Health care expenditures in the U.S. have grown from 9% to 20% of GDP, and administrative costs are estimated to comprise up to 30% of total national health care spending. To add context to these larger estimates, this little article simply looks within their own institution to evaluate the potential contribution of “quality” measures to those larger sums.

The authors identified, by surveying personnel across their institution, 162 quality metrics reported to 7 measuring organizations, totalling 271 reports (as some required reporting to multiple organizations). The bulk (70%) were publicly reported “quality” measures, while another 27% were related to pay-for-performance programs.

Overall, across surveyed personnel, the authors determined approximately 108,000 person-hours were consumed annually on these reports. Based on the annual salaries of the individuals involved and their time commitment, the total annual cost to the institution was estimated at over US$5 million. The most expensive metrics were those requiring individual chart abstraction, while those metrics requiring merely electronic data capture required a fraction of the cost.

Multiplied by the 4000+ hospitals in the U.S., we’re suddenly talking about tens of billions of dollars of added administrative overhead. Interestingly enough, and relevant to emergency medicine, one of the worst offenders as far as cost is SEP-1 – the CMS sepsis core measure. Not only is this measure onerous and costly to administer on the institutional side, it results in substantial unmeasured additional work for clinical staff – and I suspect many of these “quality” measures have their costs similarly underestimated.

Administrative costs aside, it is as important to consider whether “quality” metrics actually reflect higher-quality care, or whether the changes in care driven by metrics improve value. What is certain, however, is that their proliferation has been clearly nightmarish.

“The Volume and Cost of Quality Metric Reporting”
https://jamanetwork.com/journals/jama/article-abstract/2805705

It’s Not OK To Let 25% of tPA Cases Be Stroke Mimics

With all the various competing interests for time, it’s rare to find an article of sufficient note to warrant its own blog post. A notable publication might get a short tweet thread. Collections of other literature find their way into ACEPNow articles or the odd Annals of Emergency Medicine Journal Club. But, every once in a while, there’s something … else.

This article pertains to the practice of telestroke administration of thrombolysis for acute ischemic stroke. In major hospital centers, there may be in-house neurology hospitalists or stroke and vascular specialists, and the expertise for management of stroke is readily at the bedside. In many community, regional, and rural hospitals, these resources are unavailable – except by telestroke evaluation. These common arrangements allow access to neurology expertise, followed potentially by interhospital transfer.

In this article, the authors review a series of 270 patients receiving intravenous thrombolysis following evaluation via telestroke. Most patients underwent MRI with DWI following transfer to the hub stroke center, while a handful did not – probably those with serious complications arising from stroke, and those with obvious stroke mimic etiologies. Patients otherwise were categorized as a stroke if a lesion was found on MRI with DWI, but could be deemed a TIA or a stroke mimic if no lesion was seen.

Not-so-astonishingly, they report 23.7% of their series are stroke mimics. Another ~5% are TIA, another diagnosis for which there is no indication for thrombolysis. While this much collateral damage might horrify some, this sort of blanket use of thrombolytics is routine in the United States, if not encouraged. The proof of such encouragement is evident in these authors’ Discussion section, with this interpretation of recent guidelines:

In fact, the most recent AHA guidelines in 2019 recognise this and specifically recommend thrombolysis to SM given the low rate of sICH and state that starting IVtPA is preferred over delaying treatment to pursue additional diagnostic studies.

Naturally, the authors go on to propose a threshold of reasonable practice within which their performance fits comfortably:

In our academic tertiary referral telestroke programme, 23.7% of patients administered thrombolysis had a final diagnosis of SM. We suggest that a reasonable SM thrombolysis rate for telestroke programme should be one in four, similar to the accepted negative appendectomy rate, as that the risk of overtreatment should be accepted over the risk of undertreatment.

This is, of course, nonsensical. Leaving aside their entirely specious comparison to an acceptable negative appendectomy rate, let us ruminate seriously on the response to a poorly performing process being to normalize the poor performance. The authors rightfully cite Jeff Saver’s general musings that, given the advancing state of the specialty, the acceptable stroke mimic rate ought to be around 3%. They then justify their absurdly higher total by noting a small portion – about 7-10% – of eligible strokes are missed for treatment, and it is rather the better practice to simply treat any potential stroke in order not to miss a single one.

Again, this perspective hinges primarily on the concept that treating stroke mimics with thrombolysis is “harmless”, owing to a rate of sICH of merely ~0.5-1%. While this is still an unacceptable perspective towards inducing sICH in an otherwise unsuspecting patient, the other harms for thrombolysis in stroke mimics include:

  • Diagnostic inertia, in which evaluation and treatment for the true cause of neurologic dysfunction is delayed.
  • Permanent misdiagnosis, in which a patient treated with thrombolysis, improves, and is labelled an “aborted stroke”. They now carry the diagnosis of prior stroke, making it potentially more difficult to obtain health insurance, not to mention likely unnecessarily being prescribed medications for secondary prevention of stroke.
  • Financial harms from being treated with thrombolysis, which typically requires extended monitoring in a critical care or stroke unit, far exceeding the costs associated with a non-stroke hospitalization.

In short, this is a grossly unacceptable perspective endorsing, frankly, reckless use of thrombolysis. These authors should reconsider the primary literature they are citing as justification and the framing of their argument, and retract their call to normalize these poorly performing clinical systems.

“Thrombolysis of stroke mimics via telestroke”
https://svn.bmj.com/content/7/3/267

Sepsis Alerts Save Lives!

Not doctors, of course – the alerts.

This is one of those “we had to do it, so we studied it” sorts of evaluations because, as most of us have experienced, the decision to implement the sepsis alerts is not always driven by pent-up clinician demand.

The authors describe this as a sort of “natural experiment”, where a phased or stepped roll-out allows for some presumption of control for the unmeasured cultural and process confounders limiting pre-/post- studies. In this case, the decision was made to implement the “St John Sepsis Algorithm” developed by Cerner. This algorithm is composed of two alerts – one somewhat SIRS- or inflammation-based for “suspicion of sepsis”, and one with organ dysfunction for “suspicion of severe sepsis”. The “phased” part of the roll-out involved turning on the alerts first in the acute inpatient wards, then the Emergency Department, and then the specialty wards. Prior to being activated, however, the alert algorithm ran “silently” to create the comparison group of those for whom an alert would have been triggered.
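The actual St John Sepsis Algorithm is proprietary to Cerner, so its exact criteria and thresholds are not public – but a two-tier screen of the kind described might be sketched roughly as follows, using generic SIRS criteria and illustrative organ-dysfunction proxies (the specific cutoffs here are assumptions, not the vendor’s):

```python
# Illustrative sketch only -- the real St John Sepsis Algorithm is
# proprietary; this approximates a generic two-tier, SIRS-plus-organ-
# dysfunction screen of the kind the study describes.

def sirs_count(temp_c, hr, rr, wbc):
    """Count SIRS criteria met (temperature, heart rate, resp rate, WBC)."""
    return sum([
        temp_c > 38.0 or temp_c < 36.0,  # fever or hypothermia
        hr > 90,                         # tachycardia
        rr > 20,                         # tachypnea
        wbc > 12.0 or wbc < 4.0,         # leukocytosis/leukopenia (x10^9/L)
    ])

def sepsis_alert(temp_c, hr, rr, wbc, sbp=120, lactate=1.0):
    """Return the alert tier that would fire, if any."""
    if sirs_count(temp_c, hr, rr, wbc) >= 2:
        # organ-dysfunction proxies: hypotension or elevated lactate
        if sbp < 90 or lactate > 2.0:
            return "suspicion of severe sepsis"
        return "suspicion of sepsis"
    return None
```

Running such logic “silently” – computing the alert tier without displaying it – is what generated the pre-activation comparison group.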

The short summary:

  • In their inpatient wards, mortality among patients meeting alert criteria decreased from 6.4% to 5.1%.
  • In their Emergency Department, admitted patients meeting alert criteria were less likely to have a ≥7 day inpatient length-of-stay.
  • In their Emergency Departments, antibiotic administration of patients meeting alert criteria within 1 hour of the alert firing increased from 36.9% to 44.7%.

There are major problems here, of course, both intrinsic to their study design and otherwise. While it is a “multisite” study, there are only two hospitals involved. The “phased” implementation was not the typical different-hospitals-at-different-times design, but occurred within each hospital. They report inpatient mortality changes without actually reporting any changes in clinician behavior between the pre- and post- phases, i.e., what did clinicians actually do in response to the alerts? Then, they look at timely antibiotic administration, but they do not look at general antibiotic volume or the various unintended consequences potentially associated with this alert. Did admission rates increase? Did percentages of discharged patients receiving intravenous antibiotics increase? Did Clostridium difficile infection rates increase?

Absent the funding and infrastructure to better prospectively study these sorts of interventions, these “natural experiments” can be useful evidence. However, these authors do not seem to have taken an expansive enough view of their data with which to fully support an unquestioned conclusion of benefit to the alert intervention.

“Evaluating a digital sepsis alert in a London multisite hospital network: a natural experiment using electronic health record data”

https://academic.oup.com/jamia/advance-article/doi/10.1093/jamia/ocz186/5607431

Let’s Not Treat the Asymptomatic (Urine)

It’s a fairly common picture: the altered/declining/demented/elderly, with a small leukocytosis, and some positive elements on a urinalysis – but no clear symptoms of urinary tract infection. For lack of a better explanation, perhaps, treatment is begun with antibiotics. The benefit is uncertain, but, at the least it is more likely to benefit than harm?

This retrospective study, within the scope of its limitations, finds no reliable benefit to treatment, and more likely harms. This study performed chart reviews on 2,733 hospitalized patients with “asymptomatic bacteriuria”, defined as a positive urine culture in the absence of documented Infectious Diseases Society of America criteria for UTI. Constitutional or non-specific symptoms in those unable to specifically report (e.g., dementia, AMS) were not considered as consistent with UTI unless multiple systemic signs of infection were also present.

Not only did nearly 80% of patients identified as ASB receive antibiotics, these authors were unable to shed light on any value of treatment. Treatment of ASB was more common in the scenario above, but was also widespread in patients capable of reporting symptoms yet having none documented. The dependence on retrospective chart abstraction limits the accuracy of their observations, but they have face validity.

Patient-oriented outcomes associated with either antibiotic treatment or non-treatment were 30-day mortality, 30-day readmission, 30-day post-discharge Emergency Department visit, C. diff infection, and duration of hospitalization. Most adjusted and unadjusted odds ratios for poorer outcomes were associated with treating ASB, but these differences were generally not statistically significant. Duration of hospitalization, however, was statistically associated with antibiotic treatment. This may be a spurious finding relating to contextual clinical confounders, but it may also represent an element of diagnostic inertia distracting from the true underlying etiology relating to hospitalization.

Regardless, consistent with this journal’s series feature “Less is More”, this is yet another instance in which common practice does not easily lend itself to confirmation of value.

“Risk Factors and Outcomes Associated With Treatment of Asymptomatic Bacteriuria in Hospitalized Patients”

https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2748454

Vital Signs = Vital

That is how the authors frame it, after all: “‘Vital signs are vital’ is a common refrain in emergency medicine.”

And, these authors add to the body of work further exploring this axiom. In this simple, retrospective data analysis, they evaluate all adult visits to their Emergency Department to determine the effect of abnormal vital signs at disposition on short-term outcomes.

For discharges, about 3% of their cohort returned to the same ED within 72 hours. Only a handful – a little less than 15% – had any vital sign abnormalities at discharge. And, yes, those with vital sign abnormalities were slightly more likely to return than those who did not, with relative risks centered generally around 1.2. Then, a little more than a quarter of patients were admitted on their return visit – and, again, vital sign abnormalities increased the likelihood of subsequent admission by a small amount. In this case, fever was more likely than the other abnormal vital signs to tip the scales towards admission.

Similarly, an analysis of inpatient visits and subsequent escalations in care noted vital sign abnormalities exhibited a greater risk of upgrade, with RRs centered around 2.

Overall, however, the vast majority of patients who were either admitted or discharged with abnormal vital signs did well. Abnormal vital signs are always worth recognizing and dedicating a bit of cognitive effort, but they aren’t strong enough predictors of subsequent outcomes to drive changes in management.

“Association of Vital Signs and Process Outcomes in Emergency Department Patients”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6526877/

The CT and Syncope

The Choosing Wisely Campaign lists non-contrast CT of the head as one of their low-value procedures for low-risk patients presenting with syncope. However, despite these recommendations, these authors voice concerns that up to two-thirds of patients still undergo advanced imaging.

In this systematic review looking at both practice patterns and yield, the authors identify 17 studies of both hospitalized and Emergency Department patients addressing this topic. Pooling together 1,669 ED patients, 55% underwent CT with a yield of 3.8%. Pooling 1,289 hospitalized patients, CTs were performed 45% of the time with a yield of 1.2%. Considering the general morbidity and mortality associated with intracranial conditions, these are not fantastic yield numbers, but are not entirely unreasonable.

There is a bit of trouble with these numbers, however: even though their systematic review went up to 2017, most of the included studies were published before 2011. Even their citation of “two-thirds receiving head CTs” was published in 2009, well before the 2014 Choosing Wisely statement. Then, the bulk of the included studies were retrospective, blurring the reliability of their inclusion and outcomes measurement.

I think these data probably effectively illustrate the rarity of serious outcomes in syncope, but provide little insight into the current scope of the problem with respect to overuse. An intracranial process or intracranial injury associated with syncope ought to be considered in each case, but better data describing features predictive of underlying intracranial injury are needed to better separate high-risk from low-risk.

“The Yield of Computed Tomography of the Head Among Patients Presenting With Syncope: A Systematic Review”
https://www.ncbi.nlm.nih.gov/pubmed/31006937

Wisdom of the Crowds

This is a fun little article presenting data relating to the Human Diagnosis Project, an online medical platform in which medical students and physicians create and solve teaching cases. Cases can be created by anyone, and are “solved” by submitting a ranked differential diagnosis to the system. Approximately 14,000 users have created or solved 230,000 cases in the few years it has been operational.

The article here, generally, highlights the diagnostic accuracy of respondents for 1,572 cases with 10 or more solve attempts. In their analysis, diagnostic performance, as measured by the likelihood for including the correct diagnosis in their top three, increased as additional physicians were added to the mix – effectively from 60-70% diagnostic accuracy up to a ceiling of about 90% when the collective diagnoses from 9 physicians were pooled.
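The paper’s exact aggregation scheme isn’t spelled out here, but the core idea – pooling ranked differentials so that diagnoses placed higher by more solvers float to the top – can be sketched with a simple Borda-style weighting (an assumption for illustration; the study’s actual method may differ):

```python
from collections import defaultdict

def pool_differentials(ranked_lists, top_n=3):
    """Combine ranked differential diagnoses from multiple solvers.

    Each list is ordered most-to-least likely; earlier ranks earn more
    points (a simple reciprocal-rank weighting -- illustrative only,
    not necessarily the study's aggregation scheme).
    """
    scores = defaultdict(float)
    for dx_list in ranked_lists:
        for rank, dx in enumerate(dx_list):
            scores[dx] += 1.0 / (rank + 1)  # rank 0 -> 1.0, rank 1 -> 0.5, ...
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Three (hypothetical) solvers each submit a ranked top-3:
pooled = pool_differentials([
    ["PE", "pneumonia", "MI"],
    ["pneumonia", "PE", "CHF"],
    ["PE", "CHF", "pneumonia"],
])
```

Here “PE” tops the pooled list because it was ranked first by two of three solvers – the mechanism by which adding physicians raises the chance the correct diagnosis lands in the collective top three.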

While there are obvious limitations to using this platform to fully evaluate diagnostic performance and pooled diagnostic performance, my other takeaway: regardless of the actual number, even with the combined intelligence of multiple clinicians, accuracy is never 100%. While the expectation of our patients (and medicolegal systems) is perfect performance, it is not reasonable to expect perfection.

“Comparative Accuracy of Diagnosis by Collective Intelligence of Multiple Physicians vs Individual Physicians”

https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2726709

Computer Says: Discharge that Pulmonary Embolism!

We’ve learned a couple important things about pulmonary emboli over the past five or so years. First, we diagnose too many of them. Second, not all pulmonary emboli need to be hospitalized. Knowing, as they say, is half the battle. That’s a start – but it’s not enough.

This study involves important thing number two above, the hospitalization of PE. Kaiser Permanente, in its endless quest for value, has already published several studies demonstrating the safety of discharging patients with PE. However, hidden in the descriptive statistics from those studies are the unfortunate still-low percentages of patients discharged.

In this prospective, multi-center, “convenience-assigned” trial, a computerized decision-support tool was rolled out to support risk-stratification for patients diagnosed with PE. Based on the pulmonary embolism severity index (PESI), patients scoring in class I or II were encouraged to be discharged, while those with higher scores were nudged towards hospitalization. In their pre-post design, little change occurred at the control hospitals, while the percentage of patients with PE discharged from the intervention hospitals jumped from 17.4% to 28.0%. No issues regarding untoward 5-day recidivism or 30-day adverse events were detected.
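The PESI itself is a published point score, so the risk-stratification step the tool automates can be sketched directly (point values as commonly published from the original derivation; verify against the source before any clinical use):

```python
def pesi_score(age, male, cancer, heart_failure, chronic_lung_disease,
               hr, sbp, rr, temp_c, altered_mental_status, spo2):
    """Pulmonary Embolism Severity Index -- points as commonly published;
    illustrative sketch, not a clinical tool."""
    score = age                                  # age in years = base points
    score += 10 if male else 0
    score += 30 if cancer else 0
    score += 10 if heart_failure else 0
    score += 10 if chronic_lung_disease else 0
    score += 20 if hr >= 110 else 0
    score += 30 if sbp < 100 else 0
    score += 20 if rr >= 30 else 0
    score += 20 if temp_c < 36.0 else 0
    score += 60 if altered_mental_status else 0
    score += 20 if spo2 < 90 else 0
    return score

def pesi_class(score):
    """Risk class I-V; the trial's tool nudged toward discharge for I-II."""
    for cls, upper in ((1, 65), (2, 85), (3, 105), (4, 125)):
        if score <= upper:
            return cls
    return 5
```

A 50-year-old woman with no comorbidities and normal vitals scores 50 points – class I, the sort of patient the decision-support tool encouraged clinicians to send home.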

This is a great step forwards and, frankly, one of the most prominent examples of decision-support actually being useful in implementing practice change. That said, in the intervention hospitals, there were “physician champions” associated with the roll-out of the CDS intervention, which almost certainly increased uptake. Then, 41.2% of patients were PESI class I or II, so there’s even further room for improvement above these topline results – but this is at least a solid effort.

“Increasing Safe Outpatient Management of Emergency Department Patients With Pulmonary Embolism”

http://annals.org/aim/article-abstract/2714293/increasing-safe-outpatient-management-emergency-department-patients-pulmonary-embolism-controlled

Clinical Policy: Sanity Returns to ACS

This may be the most important recent sentence in modern emergency medicine:

“… based on limitations in diagnostic technology and the need to avoid the harms associated with false-positive test results, the committee based its recommendations on the assumption that the majority of patients and providers would agree that a missed diagnosis rate of 1% to 2% for 30-day MACE in NSTE ACS is acceptable.”

It’s no longer the domain of rogue podcasters and throwaway magazine editorialists to declare our zero-miss culture destructive and self-defeating – it’s finally spelled out in black & white by our specialty society. This is not a license to kill, of course, but it is now utterly reasonable to feel as though the wind is at your back when sending an appropriately-evaluated patient home.

This clinical policy statement does not address terribly many questions, but it does jam a lot of evidence into one document in their review. Specifically, these authors ask:

1. In adult patients without evidence of ST-elevation ACS, can initial risk stratification be used to predict a low rate of 30-day MACE?

In short, yes. These authors recommend HEART as their decision instrument du jour, but also acknowledge other scores that simply do not yet have enough diverse evidence to support their use. Interestingly, they also note clinical gestalt may be just as good as any decision instrument, at least when the ECG and troponin are negative for new ischemia. Again, more prospective evidence would be required to formally enshrine such a recommendation into a clinical policy statement.
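The HEART score itself is simple enough to sketch – five elements, each worth 0 to 2 points, with 0-3 commonly treated as low risk (point assignments as commonly published; an illustrative sketch, not a clinical tool):

```python
def heart_score(history, ecg, age, risk_factor_count, known_cad,
                troponin_ratio):
    """HEART score -- illustrative sketch of the published point scheme.

    history, ecg: clinician-assigned 0, 1, or 2
    troponin_ratio: measured troponin / upper limit of normal
    """
    age_pts = 0 if age < 45 else (1 if age < 65 else 2)
    if known_cad or risk_factor_count >= 3:
        rf_pts = 2                      # >=3 risk factors or known disease
    elif risk_factor_count >= 1:
        rf_pts = 1
    else:
        rf_pts = 0
    trop_pts = 0 if troponin_ratio <= 1 else (1 if troponin_ratio <= 3 else 2)
    return history + ecg + age_pts + rf_pts + trop_pts
```

A 40-year-old with a non-concerning history, normal ECG, no risk factors, and a negative troponin scores 0 – comfortably in the low-risk tier the policy contemplates discharging.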

2. In adult patients with suspected acute NSTE ACS, can troponin testing within 3 hours of ED presentation be used to predict a low rate of 30-day MACE?

Here the authors have only Level C recommendations, which means their recommendations are based on low levels of evidence. Overall, they are weakly in favor of using of high-sensitivity troponins alone, or repeat conventional troponin testing as part of a risk-stratification or accelerated diagnostic pathway.

3. In adult patients with suspected NSTE ACS in whom acute MI has been excluded, does further diagnostic testing (eg, provocative, stress test, computed tomography [CT] angiography) for ACS prior to discharge reduce 30-day MACE?

Please no: “Do not routinely use further diagnostic testing (coronary CT angiography, stress testing, myocardial perfusion imaging) prior to discharge in low-risk patients in whom acute MI has been ruled out to reduce 30-day MACE.”  Take that, CCTA proponents.  They give an expert consensus recommendation of 1 to 2 week primary care follow-up when feasible, or consideration of observation when no follow-up is possible.

The fourth question posed deals with use of P2Y12 and glycoprotein IIb/IIIa inhibitors in the ED, and is met basically with a shrug.

So!  Go forth and provide good medical care – specifically, high-value medical care, further freed from the mental oubliette of zero-miss.

“Clinical Policy: Critical Issues in the Evaluation and Management of Emergency Department Patients With Suspected Non–ST-Elevation Acute Coronary Syndromes”
https://www.ncbi.nlm.nih.gov/pubmed/30342745