Researchers who analyzed data in the electronic health records (EHR) of children seen by hematology/oncology specialists at three large medical centers have developed an algorithm to accurately identify appropriate pediatric oncology patients for future clinical studies. By expediting and refining the selection of patients for research, the researchers aim to ultimately improve outcomes for a variety of pediatric cancers.
“Accurately identifying patient cohorts is key to designing better research,” said study leader Charles A. Phillips, MD, a pediatric oncologist at Children’s Hospital of Philadelphia (CHOP). “Because not every patient in large datasets would be appropriate for a clinical study, having a tool to separate signals from the noise will help researchers leverage data to design pragmatic, real-world studies in patients with a range of different cancers. For instance, we could better evaluate nausea medicines or detect factors that influence the rates of infections in patients with central line placements.”
Phillips and colleagues published their study online June 17, 2019, in Pediatric Blood & Cancer.
The study team analyzed EHR-derived data in PEDSnet, a national pediatric clinical research network, from 2011 to 2016 at three large pediatric hospital systems: CHOP, Children’s Hospital Colorado and Seattle Children’s Hospital. The EHR data included diagnoses, procedures, medications, laboratory tests and provider specialties.
Clinical trials that test drugs in specific subtypes of cancer have narrowly defined eligibility requirements and smaller numbers of patients. In contrast, said Phillips, studies of supportive care issues in patients with a broader range of cancer diagnoses may draw on data already available in EHRs, but accuracy in patient selection is crucial.
“We found that over half of the children referred to an inpatient or outpatient clinic with a leukemia or lymphoma diagnosis in their charts did not actually have cancer,” said Phillips. “Some of the patients were survivors with a remote history of cancer, others were seen to rule out a cancer diagnosis, others were miscoded on the charts.” He added that a single, isolated diagnostic code may not be reliable, in contrast to multiple diagnoses.
Therefore, in this study, Phillips and colleagues created a “computable phenotype,” automating their search algorithm to check off a series of boxes: starting with at least three visits to a pediatric hematologist-oncologist (27,450 patients), then at least one leukemia or lymphoma diagnosis, which narrowed the number to 4,535. A further screen required the three specialist visits, at least two diagnostic codes and at least two administrations of chemotherapy—which winnowed the total to 1,825 patients. The final group of 1,825 was the computable phenotype curated cohort—suitable as a clinical study group.
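The screening criteria described above can be sketched in a few lines of Python. This is a minimal illustration, not the study team's code: the `PatientSummary` record, its field names, and the example patients are all hypothetical, and the sketch assumes that per-patient counts have already been extracted from the EHR data. Only the thresholds (three specialist visits, two diagnostic codes, two chemotherapy administrations) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class PatientSummary:
    """Hypothetical per-patient counts derived from EHR data."""
    patient_id: str
    hem_onc_visits: int          # visits to a pediatric hematologist-oncologist
    leukemia_lymphoma_codes: int  # leukemia or lymphoma diagnostic codes
    chemo_administrations: int   # recorded chemotherapy administrations


def in_curated_cohort(p, min_visits=3, min_codes=2, min_chemo=2):
    """Return True if the patient passes all three screens."""
    return (
        p.hem_onc_visits >= min_visits
        and p.leukemia_lymphoma_codes >= min_codes
        and p.chemo_administrations >= min_chemo
    )


def select_cohort(patients):
    """Return the IDs of patients who meet the computable phenotype."""
    return [p.patient_id for p in patients if in_curated_cohort(p)]


# Illustrative patients only; these are not study data.
example_patients = [
    PatientSummary("rule-out", hem_onc_visits=3, leukemia_lymphoma_codes=1, chemo_administrations=0),
    PatientSummary("survivor", hem_onc_visits=6, leukemia_lymphoma_codes=2, chemo_administrations=0),
    PatientSummary("active", hem_onc_visits=5, leukemia_lymphoma_codes=4, chemo_administrations=6),
]

print(select_cohort(example_patients))
```

In this sketch, a patient seen only to rule out cancer and a survivor with no recent chemotherapy are both excluded, which mirrors the kinds of misclassification Phillips described.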
When reviewers analyzed that cohort’s full medical records in masked reviews, the computable phenotype showed 100 percent sensitivity and 99 to 100 percent specificity in accurately classifying the patients as having pediatric leukemia or lymphoma.
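For readers less familiar with these measures, sensitivity is the share of true cases that the algorithm flags, and specificity is the share of non-cases that it correctly leaves out. The short Python sketch below shows the standard definitions. The function names and the counts in the example calls are illustrative assumptions and are not figures from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual cases that the algorithm identified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-cases that the algorithm correctly excluded."""
    return true_negatives / (true_negatives + false_positives)


# Illustrative counts only; these are not study data.
print(sensitivity(true_positives=50, false_negatives=0))
print(specificity(true_negatives=99, false_positives=1))
```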
“This algorithm can accurately and efficiently narrow down the number of medical charts researchers need to review to identify a patient cohort for subsequent clinical studies,” said Phillips. He added that although further studies may be needed to refine the algorithm to meet study-specific needs, it offers clinical researchers a potential new tool for improving outcomes for children with leukemia or lymphoma, diseases that together account for about 40 percent of all U.S. pediatric cancers.
The National Institutes of Health provided funding for this study (grant HD060550), as did the Patient-Centered Outcomes Research Institute (PCORI). In addition to their CHOP positions, Phillips and several co-authors are faculty members of the Perelman School of Medicine at the University of Pennsylvania.