Files in this item



YAO-DISSERTATION-2017.pdf (application/pdf, 1MB), Restricted to U of Illinois


Title:Dependence testing in high dimension
Author(s):Yao, Shun
Director of Research:Zhang, Xianyang
Doctoral Committee Chair(s):Shao, Xiaofeng
Doctoral Committee Member(s):Marden, John I; Simpson, Douglas G; Zhao, Sihai
Department / Program:Statistics
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Non-linear dependence
High dimensionality
Distance covariance
Abstract:The study of dependence for high dimensional data arises in many different areas of contemporary research. While much existing work focuses on measuring linear and monotone dependence for fixed dimensional data, comparatively little addresses more complex dependence structures, especially when the dimension is allowed to grow. In this thesis, we propose testing procedures for several independence and dependence related testing problems in high dimension. In the first part of the thesis, we introduce sum-of-squares type tests for testing mutual independence and banded dependence structure for high dimensional data. The test is constructed from the pairwise distance covariances and accounts for nonlinear and non-monotone dependencies among the data. Our test can be conveniently implemented in practice because the limiting null distribution of the test statistic is shown to be standard normal. It exhibits excellent finite sample performance in our simulation studies even when the sample size is small and the dimension is high, and it is shown to successfully identify nonlinear dependence in empirical data analysis. On the theory side, asymptotic normality of our test statistic is established under quite mild moment assumptions and with little restriction on the growth rate of the dimension as a function of the sample size. As a demonstration of the good power properties of our distance covariance based test, we further show that an infeasible version of our test statistic is rate optimal in the class of Gaussian distributions with equal correlation. In the second part, we study distance covariance and the related independence test in the high dimension, low sample size setting. We show that the sample distance covariance between two random vectors can be approximated, up to a constant factor, by the sum of squared component-wise sample cross-covariances. This demonstrates that the distance covariance can only capture linear dependence in high dimension. As a result, it is shown that the distance correlation based "joint" test for independence developed by Székely and Rizzo (2013a) has only trivial power when the two random vectors are nonlinearly dependent but component-wise uncorrelated. This phenomenon is further confirmed in our simulation study. As a remedy, we propose a distance covariance based "marginal" test and show that its power is superior to that of its "joint" counterpart.
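For readers unfamiliar with distance covariance, the following is a minimal sketch of the classical squared sample distance covariance, computed from double-centered pairwise distance matrices. It is an illustration only: the thesis may use a different (for example bias-corrected) estimator, and the function names `_centered_distances` and `sample_dcov2` are hypothetical, not taken from the dissertation. The toy data show a nonlinear, non-monotone relationship (y = x^2 on a symmetric grid) for which the Pearson correlation is essentially zero while the distance covariance is positive.

```python
import numpy as np


def _centered_distances(z):
    """Double-centered Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a 1-D array as n observations of a scalar
    # Pairwise Euclidean distances between observations (n x n matrix).
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def sample_dcov2(x, y):
    """Squared sample distance covariance (classical plug-in version).

    Assumption: this is the textbook double-centering formula, not
    necessarily the estimator used in the thesis.
    """
    a = _centered_distances(x)
    b = _centered_distances(y)
    return float((a * b).mean())


# Toy example: y depends on x deterministically but not linearly.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2

pearson = float(np.corrcoef(x, y)[0, 1])  # close to 0 by symmetry
dcov2_xy = sample_dcov2(x, y)             # strictly positive here
dcov2_const = sample_dcov2(x, np.ones_like(x))  # constant y gives 0
```

In this low-dimensional example the distance covariance detects the dependence that the correlation coefficient misses. The second part of the abstract concerns the opposite regime, where the dimension is high relative to the sample size and this advantage is shown to be lost for the "joint" test.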
Issue Date:2017-07-14
Rights Information:Copyright 2017 Shun Yao
Date Available in IDEALS:2018-03-02
Date Deposited:2017-08
