Title: Outliers in a Linear Regression Model
Author(s): Miyashita, Hiroshi
Department / Program: Economics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Abstract: Over the last several decades the linear regression model has become one of the most widely used tools in the social and physical sciences. Given the data, the least squares method supplies the information needed for statistical inference. However, researchers frequently feel that regression results are not trustworthy because of possible problems with the data, and these problems have sometimes been ignored in practice. It makes little sense to include all the data without question if some observations are in error or come from a different regime. Such observations are called outliers and should be excluded from the sample or at least treated carefully.
Several test statistics for detecting outliers have been developed. However, tests based on those statistics usually require the assumption of normally distributed errors. If the underlying error distribution deviates from the normal, the tests are not trustworthy; a simulation study confirms this. Even if the error distribution is normal, locating more than one outlier correctly is computationally burdensome and sometimes impossible. Outliers therefore cannot be detected reliably if the error distribution is non-normal or if more than one outlier may be present. One solution to this problem is a Bayesian approach.
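To make this kind of test statistic concrete, here is a minimal sketch of externally studentized (deletion) residuals for a one-regressor model. This is a standard textbook diagnostic and is not claimed to be one of the statistics examined in the thesis; the function name `studentized_residuals` and the toy data are hypothetical. Under normal errors each value follows a t distribution with n - 3 degrees of freedom, which is where the normality assumption enters.

```python
import math

def studentized_residuals(x, y):
    """Externally studentized residuals for the least squares fit y = a + b*x.

    Illustrative helper (hypothetical name), not taken from the thesis.
    """
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    out = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx   # leverage of this observation
        r = e / math.sqrt(s2 * (1.0 - h))       # internally studentized residual
        # Convert to the deletion form, which excludes the observation
        # itself from the variance estimate.
        out.append(r * math.sqrt((n - p - 1) / (n - p - r * r)))
    return out

# Made-up data: roughly y = 2 + 0.5*x, with the 7th observation shifted upward.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 9.0, 5.9, 6.6, 6.9]
t = studentized_residuals(x, y)
suspect = max(range(len(t)), key=lambda i: abs(t[i]))  # index of the largest |t|
```

With a single shifted observation the largest absolute value picks it out easily. The difficulties described above arise when several outliers mask one another or when the errors are not normal, so that the t reference distribution no longer applies.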
The test of significance has little relevance in the context of a Bayesian approach. We accommodate outliers rather than detect and drop them. Furthermore, we do not have to know how many outliers exist in the sample. By constructing an appropriate prior distribution for the presence of outliers, we can derive a posterior distribution of the regression parameters. The underlying error distribution is not restricted to the normal. By introducing a class of symmetric exponential power distributions, which includes the normal as a special case, we can handle situations in which the error distribution is assumed to be non-normal. Hypothesis testing can be done by constructing a Bayesian confidence interval, which lets us test a null hypothesis in the possible presence of outliers.
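As an illustration of how a symmetric exponential power family contains the normal as a special case, the sketch below uses one common parameterization (often called the generalized normal), with location `mu`, scale `alpha`, and shape `beta`. The parameterization used in the thesis itself may differ, and the function names are hypothetical.

```python
import math

def exp_power_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Density beta / (2*alpha*Gamma(1/beta)) * exp(-(|x - mu| / alpha)**beta).

    Assumed parameterization, for illustration only.
    """
    z = abs(x - mu) / alpha
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-(z ** beta))

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Ordinary normal density, used only for comparison."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# beta = 2 with alpha = sigma * sqrt(2) reproduces the normal density.
gap = abs(exp_power_pdf(0.7, 0.0, math.sqrt(2.0), 2.0) - normal_pdf(0.7))

# beta = 1 gives the double exponential (Laplace) density, whose tails are
# heavier than the normal's, so large errors are less surprising.
laplace_tail = exp_power_pdf(4.0, 0.0, 1.0, 1.0)
normal_tail = normal_pdf(4.0)
```

Shape values below 2 put more mass in the tails than the normal does, which is one way a model can accommodate extreme observations instead of discarding them.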
Issue Date: 1982
Description: 228 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1982.
Other Identifier(s): (UMI)AAI8218524
Date Available in IDEALS: 2014-12-16
Date Deposited: 1982
