Files in this item



CUMINGS-DISSERTATION-2019.pdf (application/pdf, 2MB). Restricted to U of Illinois.


Title: Three essays on nonparametric estimation
Author(s): Cumings, Ryan
Director of Research: Koenker, Roger
Doctoral Committee Chair(s): Koenker, Roger
Doctoral Committee Member(s): Bera, Anil; Shin, Minchul; Lee, Ji Hyung
Department / Program: Economics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Optimal Transport; Shape-Constrained Density; Regularized Wasserstein Metric; Modal Regression; Conjugate Gradient
Abstract: The first essay describes a shape-constrained density estimator, which, in terms of the assumptions on the functional form of the population density, can be viewed as a middle ground between fully nonparametric and fully parametric estimators. For example, typical constraints require the estimator to be log-concave or, more generally, $\rho$-concave (Koenker and Mizera, 2010). When the true population density satisfies the shape constraint, these density estimators often compare favorably to their fully nonparametric counterparts; see, for example, Cule et al. (2010) and Koenker and Mizera (2010). The particular shape-constrained density estimator proposed here is defined as the minimizer of the entropy-regularized Wasserstein metric provided by Cuturi (2013), which can be computed with nearly linear time complexity in the number of points in the mesh (Altschuler et al., 2017). This regularized metric is also the common thread that links all three essays. After providing results on consistency, limiting distribution, and rate of convergence for the estimator, the essay moves on to testing whether a population density satisfies a shape constraint. This is done by deriving the limiting distribution of the regularized Wasserstein metric at the estimator. In the interest of tractability, these results are initially described in terms of $\rho$-concave shape constraints. The final result provides the additional requirements that arbitrary shape constraints must satisfy for these results to hold. The generalization is then used to explore whether the California Department of Transportation's decision to award construction contracts through a first-price auction is cost minimizing. The shape constraint in this case is given by Myerson's (1981) regularity condition, which is a requirement for the first-price auction to be cost minimizing.
The second essay provides a novel nonparametric estimator of the mode of a random variable conditional on covariates, a problem also known as modal regression. The estimator is defined by first finding several paths through the data and then aggregating these paths with a regularized Wasserstein barycenter. Each initial path minimizes a combination of the distance between subsequent points and the curvature along the path. The essay provides a consistency result and shows that the estimator has a time complexity of $O(n^{2})$, where $n$ is the number of points in the sample. An approximation is also provided that has a time complexity of $O(n^{1+2\beta})$, where $\beta \in (0, 1/2)$. Simulations are also provided, and the estimator is then applied to find the mode of cumulative undergraduate GPA conditional on high school GPA and college entrance tests.
The estimators provided in the first two essays require minimizing the regularized Wasserstein metric. In their textbook treatment, Peyré and Cuturi (2019) state that second-order methods are not an “applicable” approach for optimization in this setting. This is because of the large scale of many applications of interest, and because the Hessian is dense, poorly scaled, and not expressible in closed form. The third essay provides functions that evaluate products with this Hessian and its inverse with nearly linear time complexity. It also provides recommendations that allow these methods to be used in standard second-order optimization implementations, and discusses how, in certain favorable cases, these approaches have a provably nearly linear time complexity. Numerical experiments are then carried out outside of these favorable cases. When the derivatives of the constraints are sufficiently sparse, the computational efficiency of the approach in this setting is also favorable.
Issue Date: 2019-07-11
Rights Information: Copyright 2019 Ryan Cumings
Date Available in IDEALS: 2019-11-26
Date Deposited: 2019-08
