## Files in this item

| Files | Description | Format |
| --- | --- | --- |
| Levine_Geoffrey.pdf (3MB) | (no description provided) | PDF (application/pdf) |

## Description

Title: Toward automatic model adaptation for structured domains

Author(s): Levine, Geoffrey C.

Director of Research: DeJong, Gerald F.

Doctoral Committee Chair(s): DeJong, Gerald F.

Doctoral Committee Member(s): Roth, Dan; Forsyth, David A.; Kuter, Ugur

Department / Program: Computer Science

Discipline: Computer Science

Degree Granting Institution: University of Illinois at Urbana-Champaign

Degree: Ph.D.

Genre: Dissertation

Subject(s): Artificial Intelligence; Machine Learning; Statistics; Explanation-Based Learning; Natural Language Processing

Abstract: In order for a machine learning effort to succeed, an appropriate model must be chosen. This is a difficult task in which one must balance flexibility, so that the model can capture the complexities of the domain, and simplicity, so that the model does not overfit to irrelevant characteristics of the training data. The optimal model is a function not only of the task to which it is applied, but also of the amount of training data available. Copious training data can justify a complex model that includes many of the "true" domain interactions. But when training data is limited, additional simplifications are necessary. Traditional model selection techniques, which require fitting each of a number of hypothesized models to the training data before selecting one, apply in theory but are not feasible when the number of possible models is large.

In this thesis, we describe steps in a new direction for automatically adapting model flexibility. Our approach leverages prior knowledge of two forms: 1) qualitative knowledge statements, which describe positive and negative relationships between domain variables, and 2) structural metadata, which provide categorical assignments for each training instance. In our approach, this prior knowledge is used to implicitly construct a large space of alternative well-formed models. A model adaptation procedure then utilizes the training data to conduct a directed search through the space of possible models. The search requires that relatively few models be fit to the data. Thus, the search is efficient and the risk of overfitting in the model selection process is minimized. We demonstrate our approaches on a variety of machine learning tasks, including military airspace safety prediction, planning operator construction, sports prediction, and document sentiment analysis.

Issue Date: 2012-02-06

URI: http://hdl.handle.net/2142/29711

Rights Information: Copyright 2011 Geoffrey Levine

Date Available in IDEALS: 2012-02-06

Date Deposited: 2011-12