|Title:||The impact of measurement scale on classification performance of inductive learning and statistical approaches|
|Doctoral Committee Chair(s):||Chandler, John S.|
|Department / Program:||Accountancy|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Subject(s):||Business Administration, Accounting|
|Abstract:||This thesis is a comparative study of inductive learning and statistical methods. The study investigates the impact of the measurement scale of explanatory variables on the relative performance of a statistical method (probit) and an inductive learning method (ID3). In addition, the impact of correlation structure on the classification behavior of the probit method and the ID3 method is examined.
A comparative analysis of the ID3 method and the probit method indicates that their differences in distributional assumptions, in the assumed relationship between independent and dependent variables, and in modeling basis originate, to a large extent, from the different assumptions the two methods make about the measurement scale of the independent variables. The theoretical discussion leads to the hypotheses that the ID3 method performs relatively better with nominal variables and that the probit method performs relatively better with numeric variables.
In the empirical test, simulated data are used to provide generalizable background results. The equality of covariance matrices and the magnitude of correlations are manipulated in addition to the measurement scale. Next, accounting data (bankruptcy prediction) are tested to obtain results more applicable to the accounting domain. ANOVA and regression analysis are used to investigate the statistical significance of the impact of the measurement scale, the equality of covariance matrices, and the magnitude of correlations on the classification performance of ID3 and probit.
The main hypothesis, that the classification accuracy of the ID3 method relative to the probit method increases as the proportion of binary variables in the classification model increases, is confirmed by the results from both simulated data and bankruptcy data. The empirical results also show that the classification accuracy of the ID3 method relative to the probit method is higher when the variances are unequal among populations than when they are equal.|
|Rights Information:||Copyright 1990 Han, Ingoo|
|Date Available in IDEALS:||2011-05-07|
|Identifier in Online Catalog:||AAI9114256|