## Files in this item

| File | Description | Format |
| --- | --- | --- |
| Li_Yun.pdf (367 kB) | (no description provided) | PDF (application/pdf) |

## Description

- Title: Binary classification with training under both classes
- Author(s): Li, Yun
- Advisor(s): Veeravalli, Venugopal V.
- Department / Program: Electrical & Computer Engineering
- Discipline: Electrical & Computer Engineering
- Degree Granting Institution: University of Illinois at Urbana-Champaign
- Degree: M.S.
- Genre: Thesis
- Subject(s): binary hypothesis testing; countably infinite alphabet; binary classification; Stein's lemma; finite sample performance

Abstract: This thesis focuses on the binary classification problem with training data under both classes. We first review binary hypothesis testing, in which the goal is to decide which of two underlying probabilistic processes generated the observations, and present a new result for the case of a countably infinite alphabet. Asymptotic optimality in binary hypothesis testing can be achieved with knowledge of only one of the processes, but finite-sample performance can improve greatly with additional knowledge of the alternate process. Most previous work focuses on the case of a finite alphabet; this thesis extends the existing results to a countably infinite alphabet. It is proved that, without knowledge of the alternate process, the worst-case performance of any test is arbitrarily bad, even when the alternate process is restricted to be "far" in the sense of relative entropy.

Binary classification problems arise in applications where a full probabilistic model of either process is absent but pre-classified samples from both processes are available. It is known that asymptotic optimality can be achieved with knowledge of only one pre-classified training sequence. We propose a classification function that depends on both training sequences, and prove Stein's lemma for classification using this new classification function: the maximal error exponent under one class is given by the relative entropy between the conditional distributions of the two classes.
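For context, the classical Stein's lemma for binary hypothesis testing, which the classification version described above parallels, can be stated as follows. The notation here ($P_0$, $P_1$, $\beta_n$, $\alpha$) is assumed for illustration and is not taken from the thesis.

```latex
% Classical Stein's lemma for testing P_0 against P_1.
% With n i.i.d. observations and the type-I error fixed at \alpha \in (0,1),
% the best achievable type-II error \beta_n(\alpha) decays exponentially,
% with exponent equal to the relative entropy between the two distributions:
\[
  \lim_{n \to \infty} -\frac{1}{n} \log \beta_n(\alpha)
  \;=\; D(P_0 \,\|\, P_1),
  \qquad
  D(P_0 \,\|\, P_1) \;=\; \sum_{x} P_0(x) \log \frac{P_0(x)}{P_1(x)}.
\]
```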
Our results also shed light on how the classification errors depend on the relative sizes of the training and test data. Simulation results show that our classification method outperforms the asymptotically optimal one when the test samples are of limited size.

- Issue Date: 2012-09-18
- URI: http://hdl.handle.net/2142/34344
- Rights Information: Copyright 2012 Yun Li
- Date Available in IDEALS: 2012-09-18
- Date Deposited: 2012-08
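To make the setting concrete, here is a minimal sketch of a generic plug-in classifier that uses empirical distributions computed from both pre-classified training sequences, deciding by which training type is closer to the test type in relative entropy. This is an illustration of the general idea only, not the specific classification function proposed in the thesis; the function names and the additive smoothing (needed so the relative entropy stays finite on symbols unseen in training) are my assumptions.

```python
from collections import Counter
import math

def empirical_pmf(seq, alphabet, smoothing=1e-9):
    """Smoothed empirical distribution of seq over a shared alphabet."""
    counts = Counter(seq)
    total = len(seq) + smoothing * len(alphabet)
    return {x: (counts[x] + smoothing) / total for x in alphabet}

def kl(p, q):
    """Relative entropy D(p || q) in nats, over a shared support."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)

def classify(test, train0, train1):
    """Assign the test sequence to the class whose empirical training
    distribution is closer, in relative entropy, to the test type."""
    alphabet = set(test) | set(train0) | set(train1)
    p_test = empirical_pmf(test, alphabet)
    p0 = empirical_pmf(train0, alphabet)
    p1 = empirical_pmf(train1, alphabet)
    return 0 if kl(p_test, p0) <= kl(p_test, p1) else 1
```

With short test sequences the empirical test type is noisy, which is exactly the finite-sample regime the thesis studies; rules of this plug-in form differ in how they weight the training and test types against each other.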