Files in this item:UILU-ENG-07-2201_DC-226 assembled.pdf (application/pdf, 505 kB)

Title:Classification via Minimum Incremental Coding Length (MICL)
Author(s):Wright, John; Ma, Yi; Tao, Yangyu; Lin, Zhouchen; Shum, Heung-Yeung
Subject(s):Lossy compression
Abstract:We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We rigorously prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. The theoretical results provide new insights into the relationships among a variety of popular classifiers such as MAP, RDA, k-NN, and SVM. Our formulation has several beneficial effects on the resulting classifier. First, minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small sample setting. Second, compression provides a uniform means of handling classes of varying dimension. The new criterion and its kernel and local versions perform competitively on synthetic examples, as well as on real imagery data such as handwritten digits and face images. On these problems, the performance of our simple classifier approaches the best reported results, without using domain-specific information. All MATLAB code and classification results will be made publicly available for peer evaluation.
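The criterion described in the abstract can be illustrated with a minimal sketch. The snippet below assumes the Gaussian lossy coding-length function used in the authors' related work on lossy coding of mixed data; the report's exact formulation (e.g. how the class label is coded and how the distortion ε is chosen) may differ, and the function names are illustrative only.

```python
import numpy as np

def coding_length(X, eps=1.0):
    """Approximate number of bits needed to code the columns of X (n x m)
    up to mean squared distortion eps^2, using a Gaussian lossy
    coding-length function (a sketch; not the report's exact formula)."""
    n, m = X.shape
    second_moment = (X @ X.T) / m
    _, logdet = np.linalg.slogdet(np.eye(n) + (n / eps**2) * second_moment)
    # Convert the natural-log determinant to bits.
    return (m + n) / 2.0 * logdet / np.log(2.0)

def micl_classify(x, class_data, eps=1.0):
    """Assign x to the class whose training samples require the fewest
    additional bits to also code x (the minimum-incremental-coding-length
    idea from the abstract, sketched under the assumptions above)."""
    total = sum(Xj.shape[1] for Xj in class_data.values())
    best_label, best_cost = None, np.inf
    for label, Xj in class_data.items():
        # Incremental bits to code x with class `label`.
        delta = coding_length(np.column_stack([Xj, x]), eps) - coding_length(Xj, eps)
        # Plus the bits needed to code the class label itself (-log2 of the prior).
        delta += -np.log2(Xj.shape[1] / total)
        if delta < best_cost:
            best_label, best_cost = label, delta
    return best_label
```

For example, with two well-separated Gaussian clusters as training data, a test point near a cluster's center incurs a small incremental coding length for that class and is assigned to it.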
Issue Date:2007-01
Publisher:Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Series/Report:Coordinated Science Laboratory Report no. UILU-ENG-07-2201, DC-226
Genre:Technical Report
Sponsor:National Science Foundation / CAREER IIS-0347456, CRS-EHS-0509151, and CCF-TF-0514955
ONR / YIP N00014-05-1-0633
Date Available in IDEALS:2018-04-03
