Files in this item

File: 8218524.pdf (7MB), restricted to U of Illinois
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Outliers in a Linear Regression Model
Author(s): Miyashita, Hiroshi
Department / Program: Economics
Discipline: Economics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: Ph.D.
Genre: Dissertation
Subject(s): Statistics
Abstract: Over the last several decades the linear regression model has become one of the most widely used tools in the social and physical sciences. Given the data, the least squares method provides the basis for statistical inference. However, researchers frequently feel that regression results are not trustworthy because of possible problems with the data, and these problems have sometimes been ignored in practice. It is absurd to include all data without question if some observations are in error or come from a different regime. Such observations are called outliers and should be excluded from the sample or at least treated carefully.
Several test statistics for detecting outliers have been developed. However, tests based on those statistics usually require the assumption of normally distributed errors. If the underlying error distribution deviates from the normal, the tests are not trustworthy; this is confirmed by a simulation study. Even if the error distribution is normal, it is computationally burdensome and sometimes impossible to locate more than one outlier correctly. Therefore, outliers cannot be reliably detected if the error distribution is non-normal or if more than one outlier may be present. One solution to this problem is a Bayesian approach.
The test of significance has little relevance in the context of a Bayesian approach. We accommodate outliers rather than detect and drop them. Furthermore, we do not have to know how many outliers exist in the sample. By constructing an appropriate prior distribution for the presence of outliers, we can derive a posterior distribution of the regression parameters. The underlying error distribution is not restricted to the normal: by introducing a class of symmetric exponential power distributions, which includes the normal as a special case, we can handle situations in which the error distribution is assumed to be non-normal. Hypothesis testing can be done by constructing a Bayesian confidence interval, which allows a null hypothesis to be tested in the possible presence of outliers.
Issue Date: 1982
Type: Text
Description: 228 p.
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1982.
URI: http://hdl.handle.net/2142/70725
Other Identifier(s): (UMI)AAI8218524
Date Available in IDEALS: 2014-12-16
Date Deposited: 1982
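
The abstract states that the class of symmetric exponential power distributions includes the normal as a special case. The following is a minimal sketch of that relationship, not the dissertation's own code or notation. It assumes one common parameterization, f(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta), which may differ from the one used in the thesis; the function names `exp_power_pdf`, `normal_pdf`, and `laplace_pdf` are hypothetical. Under this assumption, beta = 2 with alpha = sigma * sqrt(2) gives the normal density, and beta = 1 gives the heavier-tailed Laplace density.

```python
import math


def exp_power_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Symmetric exponential power density (assumed parameterization).

    mu is the location, alpha > 0 the scale, beta > 0 the shape.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # Normalizing constant so that the density integrates to one.
    norm_const = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm_const * math.exp(-((abs(x - mu) / alpha) ** beta))


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, used only as a reference for the beta = 2 case."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, mu=0.0, b=1.0):
    """Laplace density, used only as a reference for the beta = 1 case."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


if __name__ == "__main__":
    sigma = 1.5  # arbitrary illustrative value
    for x in (-2.0, 0.0, 0.7, 3.0):
        ep = exp_power_pdf(x, alpha=sigma * math.sqrt(2.0), beta=2.0)
        # The two printed densities should agree up to rounding error.
        print(x, ep, normal_pdf(x, sigma=sigma))
```

Varying `beta` away from 2 changes the tail weight while keeping the density symmetric about `mu`, which is the sense in which such a family can represent non-normal errors.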

