Files in this item

File:ZWILLING-DISSERTATION-2017.pdf (1MB), restricted to U of Illinois
Description:(no description provided)
Format:PDF (application/pdf)

Description

Title:New approaches for outlier detection
Author(s):Zwilling, Christopher Eric
Director of Research:Wang, Michelle Y.
Doctoral Committee Chair(s):Wang, Michelle Y.
Doctoral Committee Member(s):Anderson, Carolyn; Kohn, Hans-Friedrich; Marden, John; Drasgow, Fritz
Department / Program:Psychology
Discipline:Psychology
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):Outliers; Covariance; Time series
Abstract:Outlier detection is relevant in many modern-day contexts, including health care, engineering, data processing and analysis, credit card fraud, monitoring of computer and internet intrusions, and wearable personal health sensors. Outlier detection once represented a single pre-processing step, completed prior to the analysis of the data proper. Today it matters at all stages of the data analysis pipeline, from initial processing to defining data points of interest, such as when a sensor detects an anomaly. Moreover, as data sets have grown to encompass millions and billions of observations and variables, it is imperative to have outlier detection methods capable of effectively and automatically winnowing through large amounts of data with few or no inputs from a data analyst. Many existing outlier detection methods are constrained in ways that might limit their utility and efficacy. For instance, it is not uncommon for outlier detection methods to require some knowledge about the data under study or to require the analyst to specify information about the number of outliers in the data. Another possible constraint of many outlier detection methods is their reliance on the raw data. Sometimes outliers can readily be detected in the raw data; when they cannot, one can achieve greater sensitivity and accuracy from features derived from the data. This study uses feature extraction on multivariate time series data and demonstrates the efficacy of a set of features and their potential for aggregation through the use of Voronoi diagrams. Voronoi diagrams are constructed from the data to create tessellations which satisfy certain geometric properties. A covariance-based outlier detection method is proposed and demonstrated to address both of these challenges. It utilizes covariance information in the data, and its efficacy lies in its ability to take a set of features constructed from the data and determine which feature is best at detecting outliers.
The method is shown to work effectively on time series data, but it is general and can be applied or extended to other types of data objects and data sets.
Issue Date:2016-09-15
Type:Thesis
URI:http://hdl.handle.net/2142/97645
Rights Information:Copyright 2017 Christopher Eric Zwilling
Date Available in IDEALS:2017-08-10
Date Deposited:2017-05

