Files in this item



application/pdfJiaming_Xu.pdf (769kB)
(no description provided)PDF


Title:Statistical inference in networks: fundamental limits and efficient algorithms
Author(s):Xu, Jiaming
Director of Research:Hajek, Bruce
Doctoral Committee Chair(s):Hajek, Bruce
Doctoral Committee Member(s):Srikant, R.; Oh, Sewoong; Sanghavi, Sujay; Lelarge, Marc
Department / Program:Electrical & Computer Eng
Discipline:Electrical & Computer Engr
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Community detection
Statistical inference
Abstract:Today witnesses an explosion of data coming from various types of networks such as online social networks and biological networks. The goal of this thesis is to understand when and how we can efficiently extract useful information from such network data. In the first part, we are interested in finding tight-knit communities within a network. Assuming the network is generated according to a planted cluster model, we derive a computationally efficient semidefinite programming relaxation of the maximum likelihood estimation method and obtain a stronger performance guarantee than previously known. If the community sizes are linear in the total number of vertices, the guarantee matches up to a constant factor with the information limit which we also identify, and exactly matches without a constant gap when there is a single community or two equal-sized communities. However, if the community sizes are sublinear in the total number of vertices, the guarantee is far from the information limit. We conjecture that our algorithm achieves the computational limit below which no polynomial-time algorithm can succeed. To provide evidence, we show that finding a community in some regime below the conjectured computational limit but above the information limit is computationally intractable, assuming hardness of the well-known planted clique problem. The second part studies the problem of inferring the group preference for a set of items based on the partial rankings over different subsets of the items provided by a group of users. A question of particular interest is how to optimally construct the graph used for assigning items to users for ranking. Assuming the partial rankings are generated independently according to the Plackett-Luce model, we analyze computationally efficient estimators based on maximum likelihood and rank-breaking schemes that decompose partial rankings into pairwise comparisons. We provide upper and lower bounds on the estimation error. The lower bound depends on the degree sequence of the assignment graph, while the upper bound depends on the spectral gap of the assignment graph. When the graph is an expander, the lower and upper bounds match up to a logarithmic factor. The unifying theme for the two parts of the thesis is the spectral gap of the graph. In both cases, when the graph has a large spectral gap, accurate and efficient inference is possible via maximum likelihood estimation or its convex relaxation. However, when the spectral gap vanishes, accurate inference may be statistically impossible, or it is statistically possible but may be computationally intractable.
Issue Date:2015-01-21
Rights Information:Copyright 2014 Jiaming Xu
Date Available in IDEALS:2015-01-21
Date Deposited:2014-12

This item appears in the following Collection(s)

Item Statistics