Big Data may not always be better in the clinical lab

Feb. 22, 2018

For the last few years we’ve been listening to the trumpeting of the efficacy of Big Data [1,2] and the role it will play in the healthcare industry and the clinical laboratory from now on. Just this morning I received an email from a vendor, and I quote: “…I’ve seen how many customers struggle with data—or the lack of data, causing the inability to make good decisions in the laboratory.” If we only had enough data and proper analysis, the mantra goes, we could become more effective and practice better healthcare. The mad rush to connect all our data sources and feed them into a common well seems unstoppable. And data analytics businesses are popping up all around us, promising to give meaning to our reams of untapped data, as if a vast gold field has been recently discovered and is waiting to be mined.

There is unquestionable merit in this view, and indeed, sophisticated use of informatics will be part of the future of healthcare. But as we run cables, connecting us like a well-groomed spider’s web, the old adage still applies: “garbage in, garbage out.” Ultimately, the quality of the data is far more important than the quantity. Beyond a certain threshold, more data adds only marginally to its meaning, while every data point must still be scrutinized for the integrity of its value. Regarding laboratories, every lab director needs to be able to unhesitatingly testify to the integrity of his or her test results. No lab wants to be questioned regarding the quality of its product.

Case study: troponin testing in ER versus laboratory

We are a small independent hospital in rural Colorado. Our ER providers decided they wanted to do point-of-care troponins instead of having our laboratory perform the test. The fact that the laboratory was literally across the hall from the ER didn’t seem to dissuade them from that idea. A feasibility analysis was done. Part of this analysis, and one of the most important aspects of the study, was to determine accurately the laboratory turnaround times (TAT) for troponin testing. If this were a study being done involving a large hospital system, the laboratory data would be aggregated from the various hospitals, a statistical analysis would be performed, and a decision would be made.

Being a small hospital, we dug through a year’s worth of our data repository and our laboratory information systems (LIS) databases for the various parameters needed to determine our TAT. The primary measure was the interval from each specimen’s receive time in the laboratory to the time its result was verified in the EMR. This would seem to be a very straightforward and uncomplicated study. Yet, after viewing the data, we noticed some significant problems.
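As a minimal sketch of the calculation described above, the following computes TAT as verified time minus receive time and then summarizes it with a mean and a standard deviation. The timestamps, the timestamp format, and the names `records` and `tat_minutes` are assumptions made for illustration; they are not taken from our LIS or EMR.

```python
from datetime import datetime
from statistics import mean, stdev

# Hypothetical (receive time, verified time) pairs; the values are illustrative only.
records = [
    ("2017-03-01 08:02", "2017-03-01 08:31"),
    ("2017-03-01 14:40", "2017-03-01 15:12"),
    ("2017-03-02 22:15", "2017-03-02 22:45"),
]

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format


def tat_minutes(received: str, verified: str) -> float:
    """Return the turnaround time in minutes between receive and verification."""
    delta = datetime.strptime(verified, FMT) - datetime.strptime(received, FMT)
    return delta.total_seconds() / 60


tats = [tat_minutes(received, verified) for received, verified in records]
print(f"mean TAT: {mean(tats):.1f} min, standard deviation: {stdev(tats):.1f} min")
```

A calculation like this is only as trustworthy as the two timestamps that feed it.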

Knowing your processes

The most telling irregularity was the large standard deviations in our dataset. This finding definitely raised our suspicion that something wasn’t right. Upon closer examination of the data, we discovered three outlier groups—data that shouldn’t have been included in the study:

  1. specimens that had troponins added on at a later time (add-on test),
  2. specimens reviewed for quality assurance reasons, and
  3. specimens with delayed receive times.

It was not uncommon for an ER provider to add on a troponin after running other laboratory tests. These add-on troponins were ordered and performed anywhere from a few minutes to hours later, and because they were run on an existing specimen, the receive time of the specimen was not changed, resulting in an inaccurately long TAT.

We do quality assurance on all our troponins and have found they are not always correctly documented. We edit these records, and while we don’t change the troponin result, when re-filing the test the verified time is reset and updated in the EMR. In addition, the quality assurance review may not be done until a few days later, thus greatly affecting the standard deviation and the average TAT.

The largest factor that influenced our study involved delayed receive times. Often, after a patient is drawn, the specimen is taken directly to the chemistry area and spun in a stat centrifuge. The phlebotomist then enters the specimen/receive time information in the LIS computer, provided he or she doesn’t have other stat draws to finish first. By the time the troponin specimen is received into the LIS, several minutes may have gone by and the specimen is already running on the analyzer. The minimum time in which a troponin specimen can be “spun and run” in our laboratory is 24 minutes: five minutes to spin and 19 minutes to load on the analyzer, run, and file. We found that in eight percent of cases, specimens were being verified in less than 24 minutes. While our process still gave the provider the result in the shortest amount of time possible, it did not always record the receive time accurately, resulting in a shorter, inaccurately documented TAT. We noticed this especially on the evening-night shifts, when fewer staff are available (68 percent vs. 32 percent on day shift).
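The three exclusion rules can be sketched as a simple filter. This is an illustration under stated assumptions, not the procedure we actually ran: the `TroponinRecord` fields, the `is_add_on` and `was_refiled` flags, and the sample values are all hypothetical, and a real LIS extract may not expose such flags directly. Only the 24-minute floor (five minutes to spin plus 19 to run and file) comes from the text above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# Minimum feasible "spun and run" time in minutes: 5 to spin + 19 to run and file.
MIN_FEASIBLE_TAT = 5 + 19


@dataclass
class TroponinRecord:
    tat_min: float     # verified time minus receive time, in minutes
    is_add_on: bool    # hypothetical flag: troponin added to an existing specimen
    was_refiled: bool  # hypothetical flag: record edited and re-filed during QA review


def exclusion_reason(rec: TroponinRecord) -> Optional[str]:
    """Return the outlier group a record falls into, or None if it is usable."""
    if rec.is_add_on:
        return "add-on"
    if rec.was_refiled:
        return "qa-refiled"
    if rec.tat_min < MIN_FEASIBLE_TAT:
        return "late-receive"  # faster than physically possible: receive time logged late
    return None


# Illustrative records only.
records = [
    TroponinRecord(31, False, False),
    TroponinRecord(185, True, False),   # add-on ordered hours after receipt
    TroponinRecord(4320, False, True),  # re-filed three days later during QA
    TroponinRecord(12, False, False),   # received into the LIS after it was already running
    TroponinRecord(29, False, False),
]

clean = [rec for rec in records if exclusion_reason(rec) is None]
raw_mean = mean([rec.tat_min for rec in records])
clean_mean = mean([rec.tat_min for rec in clean])
print(f"raw mean: {raw_mean:.1f} min, cleaned mean: {clean_mean:.1f} min")
```

The filter excludes the implausibly fast records rather than correcting them, on the assumption that the true receive time cannot be recovered after the fact.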

Getting to the point

If a large hospital corporation, data analytics group, or accountable care organization were to extract this data from various laboratories, it could come to a very different conclusion that would not represent the true TAT. In our study, eight percent would not be considered a huge number, but the effect of not excluding this data in calculating the average TAT was significant: 31 vs. 23 minutes, or about a 26 percent difference. This value doesn’t take into consideration the inaccuracies introduced with the other outlier groups. A data analyst, even one with prior laboratory experience, could not be expected to be familiar with the intricacies of each laboratory’s processes in order to take these and any other variables into consideration. Indeed, why would an analyst even suspect there might be any issues with the data?
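For readers who want to check the arithmetic, the short sketch below reproduces the quoted percentage from the two averages. The text does not say which average serves as the baseline; the 26 percent figure matches the difference taken relative to the 31-minute value. The variable names are illustrative.

```python
higher, lower = 31, 23  # average TAT in minutes, as quoted in the text

difference = higher - lower           # 8 minutes
rel_to_higher = difference / higher   # difference relative to the 31-minute average
rel_to_lower = difference / lower     # difference relative to the 23-minute average

print(f"relative to {higher} min: {rel_to_higher:.0%}")  # 26%
print(f"relative to {lower} min: {rel_to_lower:.0%}")    # 35%
```

Either way, an eight-minute shift is large when the test itself takes at least 24 minutes.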

While Big Data will offer many significant and cost-saving insights into the practice of healthcare in the coming years, there are some decisions that still need to be made at the local level by those who understand best the details, and the strengths and weaknesses, of their internal processes. Determining which processes best benefit from Big Data analysis, and which do not, is critical to making valued decisions that benefit the patients who come to us for their healthcare. Failure to do so will lead to undesirable outcomes in terms of efficiency, economy, and patient well-being. When this happens, we do a disservice to the very processes designed to improve healthcare and ultimately the health of the patients we serve.

Big Data may not always be better. As one observer aptly stated, “It isn’t too much to ask sometimes for data-based decisions about data-based decisionmaking.” [3]


  1. Marr B. How Big Data is changing healthcare. Apr 21, 2015.
  2. Groves P, Kayyali B, Knott D, Van Kuiken S. The “big data” revolution in healthcare. Center for US Health System Reform, Business Technology Office. January 2013.
  3. Richards NM, King JH. Three paradoxes of Big Data. Stanford Law Review Online. September 3, 2013.

Don Barton, MS, MT(ASCP), is a laboratory informaticist at Delta County Memorial Hospital in Delta, Colorado. He says his work is like teaching a kindergarten class, except with analyzers and computers, making sure they always talk nicely to each other, and never tell lies.
