Proteomics-based, multivariate, random forest method for prediction of protein separation behavior during downstream purification

Date
2013-01-01
Authors
Swanson, Ryan
Advisor
Charles E. Glatz
Department
Chemical and Biological Engineering
Abstract

The downstream purification process (DSP) remains a significant bottleneck when using biological expression hosts for the production of recombinant biologics. This issue persists in part because of a lack of knowledge of the separation behavior of the host cell proteins (HCP), which are the most problematic class of impurity to remove due to similarities in separation behavior with the target. Selection of both the DSP method(s) and the host cell can benefit from an accurate prediction of HCP separation behavior. Therefore, to reduce the effort required for DSP development, this work aimed to characterize the separation behavior of a complex mixture of proteins during four commonly used chromatographic and non-chromatographic methods: cation-exchange chromatography (CEX), anion-exchange chromatography (AEX), hydrophobic interaction chromatography (HIC) and ammonium sulfate precipitation (ASP). An additional goal was to evaluate a statistical methodology, applied to the characterization data, as a tool for predicting separation behavior. Partitioning in aqueous two-phase systems (ATPS) followed by two-dimensional electrophoresis (2DE) provided data, for each HCP, on the three physicochemical properties most commonly exploited during DSP: isoelectric point (pI), molecular weight and surface hydrophobicity. The separation behaviors of two biological expression host extracts (corn germ and E. coli) were characterized for multiple purification methods, creating a database of characterized HCP for each purification method-expression host combination (e.g. CEX-corn germ, AEX-E. coli, AEX-corn germ, ASP-E. coli). A multivariate random forest (MVRF) statistical methodology was then applied to the chromatography-based databases of characterized proteins, creating an accurate tool for predicting the separation behavior of a mixture of proteins.
The accuracy of the MVRF method was determined by calculating a root mean squared error (RMSE) value for each database. The RMSE never exceeded 0.045, expressed as the fraction of protein populating each of the multiple separation fractions for a given mode of chromatography. In addition, analyzing the empirical AEX results (i.e. chromatograms) for both expression hosts together with the MVRF-predicted elution profiles of a set of model proteins will allow an upstream decision on which of the two expression hosts would result in a simpler downstream purification process, using product purity and yield as a guide. Overall, the current study aimed to establish the framework for designing a successful downstream process with minimal resources or time spent in the lab.
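
As a rough illustration of the accuracy measure described above, the sketch below computes an RMSE between predicted and observed fraction-of-protein values. The abstract does not state how errors were pooled, so pooling over every protein and every separation fraction is an assumption here, and the function name and all numbers are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error pooled over all proteins and fractions.

    Each argument is a list of rows; one row per protein, one column per
    separation fraction, each value being the fraction of that protein
    found in that separation fraction.
    """
    squared_errors = [
        (p - o) ** 2
        for pred_row, obs_row in zip(predicted, observed)
        for p, o in zip(pred_row, obs_row)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: two proteins, three elution fractions (not from the thesis).
observed = [
    [0.10, 0.60, 0.30],
    [0.00, 0.25, 0.75],
]
predicted = [
    [0.12, 0.58, 0.30],
    [0.03, 0.25, 0.72],
]

print(f"RMSE = {rmse(predicted, observed):.4f}")
```

Because each value is a fraction between 0 and 1, an RMSE on this scale reads directly as a typical error in the predicted share of a protein per separation fraction.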

Copyright
2013-01-01