The Michael J. Fox Foundation (MJFF) collected multifaceted data sets from patients with Parkinson’s disease. Research programs, treatment clinics, and physicians’ offices vary in the types of data and medical test results they collect on these patients. The BioFIND observational clinical study, designed to discover and verify biomarkers of Parkinson’s disease, contains results from nearly 2,500 biomarker and other medical tests, but none of its approximately 200 participants has results for all of the tests. Despite this extreme variability in patient data, the client wanted to determine which tests would offer the greatest value in disease prediction, and whether the importance of a key test would change as data from further medical tests became available.
A proprietary clustering method was used to identify twelve clusters of patients based on the test results available for each patient. Prior to model selection, we used algorithms to produce two lists of medical tests for each cluster: a full list of all tests the algorithms found useful for disease prediction, and a minimal list representing the smallest set of tests needed for accurate disease prediction. Target shuffling, which randomly permutes the outcome labels and refits the model to show how well a model can appear to perform by chance alone, was used to further validate the statistical accuracy of our model results. By shuffling the target more than 300 times, we built a chance baseline against which we could confidently judge the performance of our models.
For 10 of the 12 patient clusters, our models achieved better accuracy and specificity than the models built on shuffled targets nearly 100% of the time.
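To make the target-shuffling check concrete, the following is a minimal sketch in Python, not the models or code used in the study. Everything in it is an assumption made for illustration: the synthetic single-test data, the simple nearest-centroid classifier, and the function names `make_demo_data`, `fit_centroid`, `evaluate`, and `target_shuffle`. It fits a model on the real labels, refits it on randomly permuted labels many times, and reports how often the real model's accuracy beats the shuffled ones.

```python
import random


def make_demo_data(n_per_class=100, seed=1):
    """Hypothetical data: one test value per patient, with labels 0 or 1."""
    rng = random.Random(seed)
    rows = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    rows += [(rng.gauss(3.0, 1.0), 1) for _ in range(n_per_class)]
    rng.shuffle(rows)
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    return xs, ys


def fit_centroid(xs, ys):
    """Stand-in classifier (an assumption): predict the class with the nearer mean."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    if not pos or not neg:
        # Only one class present in training: always predict that class.
        constant = 1 if pos else 0
        return lambda x: constant
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return lambda x: 1 if abs(x - mu_pos) < abs(x - mu_neg) else 0


def evaluate(xs, ys):
    """Fit on the first half of the data and return accuracy on the second half."""
    half = len(xs) // 2
    predict = fit_centroid(xs[:half], ys[:half])
    hits = sum(predict(x) == y for x, y in zip(xs[half:], ys[half:]))
    return hits / (len(xs) - half)


def target_shuffle(xs, ys, n_shuffles=300, seed=0):
    """Return the real accuracy and the share of shuffles that the real model beats."""
    rng = random.Random(seed)
    real_acc = evaluate(xs, ys)
    wins = 0
    for _ in range(n_shuffles):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # break the link between test values and labels
        if real_acc > evaluate(xs, shuffled):
            wins += 1
    return real_acc, wins / n_shuffles
```

Calling `target_shuffle(*make_demo_data())` returns the real model's accuracy together with the fraction of shuffles it outperformed. A fraction close to 1.0 suggests the result is unlikely to be due to chance. The same loop could track specificity or any other metric by swapping the scoring step in `evaluate`.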
Our models identified several biomarkers that are important in predicting Parkinson’s disease. Deployed in a clinic, our solution will help clinicians diagnose Parkinson’s based on available tests and recommend the fewest additional (or next best) tests to improve disease prediction. The analytical processes used can be applied to many other disease classification targets, such as predicting the occurrence of other diseases, the speed of disease progression, and the effectiveness of treatments, or identifying dominant or worsening symptoms. These applications can help doctors improve treatment decisions.