Big data! It’s all the rage with tweens these days. Hoverboards, Yik Yak, and predictive analytics are all kids talk about now.
This “big data” application, more specifically, involves the use of an institutional database to derive predictors for mortality in sepsis. Many decision instruments for various sepsis syndromes already exist – CART, MEDS, mREMS, CURB-65, to name a few – but all suffer from the same flaw: how reliable can a rule with just a handful of predictors be when applied to the complex heterogeneity of humanity?
Machine-learning applications of predictive analytics attempt to create, essentially, Decision Instruments 2.0. Rather than using linear statistical methods to simply weight a small handful of different predictors, most of these applications utilize the entire data set and some form of clustering. Most generally, these models replace the typical scoring by weighted variables with a weighted neighborhood scheme, in which similarity to other data points helps predict outcomes.
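To make that contrast concrete, here is a toy sketch in Python. It is not the study's method: it uses a plain nearest-neighbor average, the simplest example of a weighted neighborhood scheme, rather than a random forest. Every patient value, threshold, scaling factor, and function name below is made up for illustration.

```python
import math

# Hypothetical toy records: (age, lactate, systolic BP) -> 1 = died, 0 = survived.
# All values are invented for illustration only.
TRAINING = [
    ((82, 5.1, 85), 1),
    ((77, 4.2, 90), 1),
    ((45, 1.2, 125), 0),
    ((60, 2.0, 110), 0),
    ((38, 1.0, 130), 0),
    ((70, 3.8, 95), 1),
]


def points_score(age, lactate, sbp):
    """Classic decision-instrument style: a few assumed thresholds, one point each."""
    return int(age >= 65) + int(lactate >= 4.0) + int(sbp < 90)


def neighborhood_risk(patient, training=TRAINING, k=3):
    """Weighted-neighborhood style: average the outcomes of the k most
    similar patients, weighting each by inverse distance."""
    scales = (10.0, 1.0, 10.0)  # crude, assumed scaling so no feature dominates

    def dist(a, b):
        return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))

    nearest = sorted(training, key=lambda rec: dist(patient, rec[0]))[:k]
    # The small constant avoids division by zero on an exact match.
    weights = [1.0 / (dist(patient, feats) + 1e-9) for feats, _ in nearest]
    return sum(w * outcome for w, (_, outcome) in zip(weights, nearest)) / sum(weights)
```

The points score only ever looks at three cut-offs, while the neighborhood estimate asks "what happened to the patients who looked most like this one?" and can, in principle, use as many variables as the record contains.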
Long story short, this study out of Yale used 5,278 visits for acute sepsis, divided into a training set and a validation set, to build a random forest model. The random forest model included all available data points from the electronic health record, while the other models used up to 20 predictors based on expert input and prior literature. For their primary outcome of predicting in-hospital death, the AUC for the random forest model was 0.86 (CI 0.82-0.90), while none of the other models exceeded an AUC of 0.76.
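For anyone rusty on what an AUC actually measures: it is the probability that a randomly chosen patient who died was given a higher risk score than a randomly chosen survivor. A minimal sketch, with invented scores and outcomes rather than anything from the paper, and a hypothetical function name:

```python
def auc(scores, outcomes):
    """Fraction of (death, survivor) pairs in which the death received the
    higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = died in hospital, 0 = survived.
outcomes = [1, 1, 0, 0, 0, 1]
model_a = [0.9, 0.8, 0.2, 0.3, 0.1, 0.7]  # ranks every death above every survivor
model_b = [0.9, 0.2, 0.3, 0.6, 0.1, 0.7]  # misranks one death below two survivors

print(auc(model_a, outcomes))  # 1.0
print(auc(model_b, outcomes))  # 7 of 9 pairs ranked correctly, about 0.78
```

An AUC of 0.5 is a coin flip and 1.0 is perfect ranking, so the gap between 0.76 and 0.86 reflects a meaningful improvement in how well deaths are ranked above survivors.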
This is still simply at the technology demonstration phase, and requires further development to become actionable clinical information. However, I believe models and techniques like this are our next best paradigm in guiding diagnostic and treatment decisions for our heterogeneous patient population. Many challenges yet remain, particularly in the realm of data quality, but I am excited to see more teams engaged in development of similar tools.
“Prediction of In-hospital Mortality in Emergency Department Patients with Sepsis: A Local Big Data Driven, Machine Learning Approach”
http://www.ncbi.nlm.nih.gov/pubmed/26679719