Learning in the presence of sudden concept drift and measurement drift
Abstract
The current availability of vast data storage, together with the computational power to run algorithms that interpret that data in real time, makes real-time adaptive systems possible. Because change is nearly inevitable, companies must strive to make their manufacturing or service systems more adaptable. To accomplish this, methods for determining the correct change point and for correcting the system must be studied.
The motivation of this thesis is to advance the learning of prediction and classification models on data streams that contain change, a problem known as concept drift. Further motivation comes from a study of a system with these properties at an active manufacturing facility. A review of articles related to the specific problem in the study revealed a similarity between the study and research in the area of advanced process control.
The underlying cause of the change in the manufacturing system is identified as measurement drift. The thesis explains how measurement drift is identified and discusses the mathematical model that represents it.
Existing concept drift algorithms are adapted to fit the needs of the measurement drift problem. Their performance is assessed on the data from the study and on synthetic data sets that mimic varying levels of drift magnitude and frequency. The results are compared to a popular advanced process control method, exponential weighted moving average adapting intercept (EWMA-I).
The advanced process control literature inspired the development of two new methods for learning in the presence of concept drift: ADMEAN (ADaptive MEAN) and CD-EWMA (Concept Drift Exponential Weighted Moving Average). Both methods make changes to the incoming stream of independent variables. Their performance on the measurement drift data sets and on synthetic concept drift data sets is reported.
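To make the idea of an EWMA-adapted intercept concrete, the following minimal sketch shows the textbook form of the update used in run-to-run process control: a linear predictor with a fixed slope whose intercept is re-estimated after every observation. It is not the thesis's EWMA-I configuration, ADMEAN, or CD-EWMA. The function name `ewma_intercept_stream`, the smoothing weight `lam`, and the synthetic stream with a single sudden offset are all assumptions chosen for illustration.

```python
def ewma_intercept_stream(xs, ys, slope, lam=0.3, intercept0=0.0):
    """Predict each y from x with a fixed slope and an EWMA-updated intercept.

    Hypothetical helper for illustration; `lam` in (0, 1] is the EWMA weight.
    """
    intercept = intercept0
    predictions = []
    for x, y in zip(xs, ys):
        # Predict first, then learn from the revealed response.
        predictions.append(slope * x + intercept)
        # Blend the intercept implied by this observation into the estimate.
        intercept = lam * (y - slope * x) + (1.0 - lam) * intercept
    return predictions, intercept


# Assumed synthetic stream: y = 2x + offset, with the offset jumping
# from 0 to 5 at t = 50 to mimic a sudden measurement drift.
xs = [float(t % 10) for t in range(100)]
ys = [2.0 * x + (0.0 if t < 50 else 5.0) for t, x in enumerate(xs)]

preds, final_intercept = ewma_intercept_stream(xs, ys, slope=2.0)
errors = [abs(p - y) for p, y in zip(preds, ys)]
print("error at the drift point:", errors[50])
print("error at the end of the stream:", errors[-1])
```

In this sketch the prediction error equals the full offset at the moment of drift and then shrinks geometrically by a factor of `1 - lam` per observation. A larger `lam` recovers faster after a sudden change but passes more observation noise into the intercept.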