Dissertations and Theses - Statistics
http://hdl.handle.net/2142/17362
Tue, 06 Oct 2015 14:45:02 GMT
http://hdl.handle.net/2142/88025
Weak signal identification and inference in penalized model selection
Weak signal identification and inference are very important in penalized model selection, yet they remain under-developed and not well studied. Existing inference procedures for penalized estimators focus mainly on strong signals. This thesis proposes an identification procedure for weak signals in finite samples and provides a transition phase between noise and strong signal strengths. A new two-step inferential method is introduced to construct better confidence intervals for the identified weak signals. Both theory and numerical studies indicate that the proposed method leads to better confidence coverage for weak signals than asymptotic inference does. In addition, the proposed method outperforms the perturbation and bootstrap resampling approaches. The method is illustrated on HIV antiretroviral drug susceptibility data to identify genetic mutations associated with HIV drug resistance.
We also provide an inference method for signals based on the exact distribution of the penalized estimator. The finite-sample distribution is quite different from its asymptotic counterpart: it can be highly non-normal, with a point mass at zero. Numerical studies indicate that the density-based approach works well when the true parameter is moderately large. However, it cannot provide accurate inference when the signal is weak.
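The point mass at zero can be illustrated with a small simulation. This is only a sketch under stated assumptions: the abstract does not say which penalty or design the thesis uses, so the example assumes a lasso-type soft-thresholding estimator for a single coefficient under an orthonormal design with normal noise. The names `soft_threshold`, `prob_zero`, and `simulate_zero_fraction` are hypothetical and do not come from the thesis.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding of z at level lam (assumed lasso solution, orthonormal design)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def prob_zero(beta, lam, sigma=1.0):
    """Probability that the estimate is exactly zero when z ~ N(beta, sigma^2)."""
    def normal_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return normal_cdf((lam - beta) / sigma) - normal_cdf((-lam - beta) / sigma)


def simulate_zero_fraction(beta, lam, sigma=1.0, n_rep=20000, seed=0):
    """Monte Carlo fraction of replications in which the estimate is exactly zero."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n_rep):
        z = rng.gauss(beta, sigma)  # unpenalized estimate (assumed normal)
        if soft_threshold(z, lam) == 0.0:
            zeros += 1
    return zeros / n_rep
```

For a weak signal (a small `beta` relative to `lam`), both `prob_zero` and the simulated fraction are large, so the sampling distribution of the estimate is far from normal in this assumed setting.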
model selection; weak signal; inference
Thu, 16 Jul 2015 00:00:00 GMT
http://hdl.handle.net/2142/87422
Building a Nonparametric Model After Dimension Reduction
Building a regression model effectively with a large number of covariates is no easy task. We consider using dimension reduction before building a parametric or spline model. The dimension reduction procedure is based on a canonical correlation analysis of the predictor variables and a spline basis generated for the response variable. One important question in dimension reduction is how to decide on the number of effective dimensions needed. We study four tests of dimensionality: a chi-square test, a Wald-type test on eigenvalues, a modified Wald-type test, and a matrix rank test. These tests are motivated by different aspects of the problem and have their own strengths and weaknesses. We discuss and compare these tests both theoretically and through Monte Carlo simulations, based on which specific recommendations for determining dimensionality are made. Additive regression splines are first fitted to the data in the space of reduced dimensionality. A Tukey-type test of additivity is proposed and compared with Rao's score test. When the hypothesis of additivity is rejected, tensor product splines can be used for model building.
Statistics
Sat, 01 Jan 2000 00:00:00 GMT
http://hdl.handle.net/2142/87423
Contributions to Estimation in Item Response Theory
In the logistic item response theory models, the number of parameters tends to infinity together with the sample size. Thus, there has been a longstanding question of whether the joint maximum likelihood estimates for these models are consistent. The main contribution of this paper is the study of the asymptotic properties and computation of the joint maximum likelihood estimates, as well as an alternative estimation procedure, one-step estimation. The one-step estimates are much easier to compute, yet are consistent and first-order equivalent to the joint maximum likelihood estimates under certain conditions on the sample sizes if the marginal distribution of the ability parameter is correctly specified. The one-step estimates are also highly robust against modest misspecifications of the ability distribution. We also study the accuracy of variance estimates for the one-step estimates. Finally, we study tests of the goodness of fit for the models. We show that Rao's score test is superior to the existing chi-square tests.
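To make the growing parameter count concrete, the setup can be written out for one common member of the logistic family. The two-parameter logistic form below is an assumption for illustration, since the abstract does not say which logistic model is studied, and the symbols $\theta_i$, $a_j$, $b_j$, $n$, and $m$ are introduced here rather than taken from the thesis.

```latex
P_{ij} = P(X_{ij} = 1 \mid \theta_i, a_j, b_j)
       = \frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}},
\qquad i = 1, \dots, n, \quad j = 1, \dots, m,

\ell(\theta, a, b) = \sum_{i=1}^{n} \sum_{j=1}^{m}
  \bigl[ x_{ij} \log P_{ij} + (1 - x_{ij}) \log (1 - P_{ij}) \bigr].
```

Under this assumed form, joint maximum likelihood maximizes $\ell$ over all $n + 2m$ parameters at once, so every additional examinee adds one ability parameter $\theta_i$ to be estimated.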
Statistics
Sat, 01 Jan 2000 00:00:00 GMT
http://hdl.handle.net/2142/87418
Rank-Based Procedures for Some Multivariate Problems
The remainder of my thesis studies multivariate symmetry. Several well-studied parametric and nonparametric tests for univariate symmetry are extended to multivariate settings. We study the asymptotic distributions of these multivariate tests, and we propose a new criterion to compare their asymptotic relative efficiency. Monte Carlo simulation studies shed some light on the performance of these test statistics.
Statistics
Fri, 01 Jan 1999 00:00:00 GMT
http://hdl.handle.net/2142/87417
Some Issues in Weak Local Independence in Item Response Theory
A method of Pashley and Reese (in press) for simulating locally dependent data, whose item pair dependencies relate directly to local dependence assessment procedures, is discussed. This method is extended to allow for the simulation of local dependence that varies in magnitude across examinee ability levels. The parameter that Yen's (1984) Q3 statistic seeks to estimate is examined. The notion of the 'best fit' latent trait is defined, and it is shown that an exam with only positive pairwise item dependencies violates the assumptions of a large class of exam models. Finally, theoretical probability is used to rigorously define the basic concepts of IRT, and is then applied to construct conditions under which the converse to a theorem of Reckase and Stout (1995) holds.
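As a reminder of what the Q3 statistic computes, the sketch below takes the correlation, across examinees, of the residuals of two items after subtracting model-predicted probabilities. It is a minimal illustration, not the thesis's procedure: the two-parameter logistic response function and the names `p_2pl`, `pearson`, and `q3` are assumptions made here, and ability estimates and item parameters are taken as given.

```python
import math


def p_2pl(theta, a, b):
    """Assumed 2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


def q3(responses_i, responses_j, thetas, item_i, item_j):
    """Q3 for one item pair: correlation of residuals u - P(theta_hat).

    item_i and item_j are (a, b) tuples for the assumed 2PL model.
    """
    res_i = [u - p_2pl(t, *item_i) for u, t in zip(responses_i, thetas)]
    res_j = [u - p_2pl(t, *item_j) for u, t in zip(responses_j, thetas)]
    return pearson(res_i, res_j)
```

Values of `q3` near zero are consistent with local independence for the pair, while large positive or negative values suggest local dependence.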
Education, Educational Psychology
Thu, 01 Jan 1998 00:00:00 GMT
http://hdl.handle.net/2142/87420
Assessing Unidimensionality of Test Items and Some Asymptotics of Parametric Item Response Theory
Using results from He & Shao (2000), a proof of the consistency and asymptotic normality of item parameter estimates obtained from the Marginal Maximum Likelihood Estimation (Bock & Lieberman, 1970) procedure as both the number of examinees and the number of items tend to infinity is presented. The proof depends upon fairly general regularity conditions on the model and on the growth of the number of items relative to the number of examinees.
Psychology, Psychometrics
Sat, 01 Jan 2000 00:00:00 GMT
http://hdl.handle.net/2142/87421
Censored Regression Models With Applications to Infrastructure Degradation Studies
Motivated by consulting in infrastructure studies, we consider estimation and inference for regression models where the response variable is bounded or censored. Under these conditions, least squares methods are not appropriate, although they are widely used. This dissertation develops a generalization of the Tobit censored regression model using Student's t distribution instead of the normal. Additional flexibility is thus achieved by varying the degrees of freedom. The variance function can be estimated by using additional regression steps. Nonparametric methods are extended to apply to bounded or censored data. For correlated measurements, random effect models and generalized estimating equation methods for censored data are developed. A general argument about the theoretical bias of the GEE method, in both the censored and uncensored cases, provides a criterion for deciding when marginal analysis is appropriate. We apply the methods to a cross-sectional study of factors influencing roof condition as a function of age and to a mixed cross-sectional and longitudinal study of road conditions.
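The Student's t generalization of the Tobit model can be written out as a log-likelihood. The display below is a sketch under assumptions not stated in the abstract: left-censoring at a known point $c$, a linear mean $x_i^\top \beta$, and a constant scale $\sigma$. Here $f_\nu$ and $F_\nu$ denote the density and distribution function of the standard t distribution with $\nu$ degrees of freedom, and all symbols are introduced for illustration.

```latex
y_i^{*} = x_i^{\top}\beta + \sigma \varepsilon_i, \qquad
\varepsilon_i \sim t_{\nu}, \qquad
y_i = \max(c, \, y_i^{*}),

\ell(\beta, \sigma, \nu) =
  \sum_{i:\, y_i > c} \log\!\left[ \frac{1}{\sigma}\,
      f_{\nu}\!\left( \frac{y_i - x_i^{\top}\beta}{\sigma} \right) \right]
  + \sum_{i:\, y_i = c} \log F_{\nu}\!\left( \frac{c - x_i^{\top}\beta}{\sigma} \right).
```

As $\nu \to \infty$, $f_\nu$ and $F_\nu$ approach the standard normal density and distribution function, so this expression reduces to the usual normal Tobit log-likelihood.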
Statistics
Sat, 01 Jan 2000 00:00:00 GMT
http://hdl.handle.net/2142/87419
Unified Ordinal Regression: Model Assessment and Semiparametric Analysis
In the last part of the thesis we discuss the methods for handling censored data for both categorical and continuous responses. We develop EM algorithms for censored data in general, and also develop a weighted least squares algorithm for cumulative link models for ordinal responses.
Economics, General
Sat, 01 Jan 2000 00:00:00 GMT
http://hdl.handle.net/2142/87416
DIMTEST Enhancements and Some Parametric IRT Asymptotics
The joint consistency of item and ability parameter estimation remains a challenging problem in parametric IRT modeling. Although many simulation studies have been conducted on the item and ability parameter estimates obtained by joint maximum likelihood estimation, as implemented in the LOGIST procedure (Wingersky, Barton, & Lord, 1982), there are no analytical results on the asymptotic properties of these estimates in the literature. A preliminary effort is made to estimate the item and ability parameters jointly and consistently, using a new approach under some regularity conditions. It is shown that when uniformly consistent ability parameter estimates are available and used as the true ability values, consistent item parameter estimates exist in a subsequence of their maximum likelihood estimates, assuming the ability parameters are known.
Statistics
Wed, 01 Jan 1997 00:00:00 GMT
http://hdl.handle.net/2142/87414
Dimensionality of Data Matrices With Applications to Gene Expression Profiles
Finally, a robust extension of the dimensionality tests is discussed, and a real example is used to demonstrate the merit of the robust alternative.
Statistics
Thu, 01 Jan 2009 00:00:00 GMT