Title: Automated methods for text correction
Author(s): Rozovskaya, Alla
Director of Research: Roth, Dan
Doctoral Committee Chair(s): Cole, Jennifer S.
Doctoral Committee Member(s): Roth, Dan; Hockenmaier, Julia C.; Hirst, Graeme
Department / Program: Linguistics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): text correction; grammatical error correction; English as a second language (ESL) error correction; automated methods for text correction
Abstract: Development of automatic text correction systems has a long history in natural language processing research. This thesis considers the problem of correcting writing mistakes made by non-native English speakers. We address several types of errors commonly exhibited by non-native English writers – misuse of articles, prepositions, noun number, and verb properties – and build a robust, state-of-the-art system that combines machine learning methods and linguistic knowledge. The proposed approach is distinguished from other related work in several respects. First, several machine learning methods are compared to determine which methods are most effective for this problem. Because earlier evaluations were based on incomparable data sets, their conclusions are questionable. Our results reverse these conclusions and pave the way for the next contribution. Using the important observation that mistakes made by non-native writers are systematic, we develop models that utilize knowledge about error regularities with minimal annotation costs. Our approach differs from earlier ones that either built models that had no knowledge about error regularities or required a lot of annotated data. Next, we develop special strategies for correcting errors on open-class words. These errors, while very prevalent among non-native English speakers, are the least studied and are not well understood linguistically. The challenges that these mistakes present are addressed in a linguistically informed approach. Finally, a novel global approach to error correction is proposed that considers grammatical dependencies among error types and addresses these via joint learning and joint inference. The systems and techniques described in this thesis are evaluated empirically and competitively in the context of several shared tasks, where they have demonstrated superior performance.
In particular, our system ranked first in the most prestigious competition in the natural language processing field, the CoNLL-2013 shared task on text correction. Based on the analysis of this system, four design principles that are crucial for building a state-of-the-art error correction system are identified.
Issue Date: 2014-01-16
Rights Information: Copyright 2013 Alla Rozovskaya
Date Available in IDEALS: 2014-01-16
Date Deposited: 2013-12