Degree Type


Date of Award


Degree Name

Master of Science


Theses & dissertations (College of Business)


Information Systems

First Advisor

Sree Nilakanta

Second Advisor

Baskar Ganapathysubramanian


Healthcare generates a tremendous amount of data from clinical records, laboratory tests, imaging, prescriptions and medications. Big data analytics on these data aims at early detection of disease, which helps in developing preventive measures and improving patient care. Parkinson’s disease is the second-most common neurodegenerative disorder in the United States. To find a cure for Parkinson’s disease, biological, clinical and behavioral data from different cohorts are collected, managed and distributed through the Parkinson’s Progression Markers Initiative (PPMI). Applying big data technology to these data will lead to the identification of potential biomarkers of Parkinson’s disease. Data collected in human clinical studies are imbalanced, heterogeneous, incongruent and sparse. This study focuses on ways to overcome the challenges posed by PPMI data, which are wide and incongruent. The work builds on initial discoveries made through descriptive studies of various attributes; this exploration of the data led to the identification of the significant attributes. The research project covers data munging (data wrangling), creating structural metadata, curating the data, imputing missing values, applying emerging big data analysis methods for dimensionality reduction, supervised machine learning on the reduced-dimension dataset, and finally an interactive visualization. The simple interactive visualization platform will shield domain experts from the sophisticated mathematics and enable a democratization of the exploration process. The visualization, built on D3.js, is interactive and enables manual exploration of traits that correlate with disease severity.


Copyright Owner

Mahalakshmi Senthilarumugam Veilukandammal



File Format


File Size

52 pages