Title: Statistical models and inference for dynamic networks
Author(s): Sewell, Daniel K
Director of Research: Chen, Yuguo
Doctoral Committee Chair(s): Chen, Yuguo
Doctoral Committee Member(s): Qu, Annie; Liang, Feng; Marden, John I.
Department / Program: Statistics
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): dynamic network; latent space; weighted network; social network; edge attraction; missing data; longitudinal data; community detection
Abstract: Dyadic data are ubiquitous and arise in the fields of biology, epidemiology, sociology, and many more. Such dyadic data are often best understood within the framework of networks. Network data can vary in many ways. For example, one might have binary or weighted networks, directed or undirected networks, and static or longitudinal networks. This last type of network, also called a dynamic network, is the focus of this work, with the goal of developing important tools and methodology for the analysis of dynamic networks.

A general framework is developed for modeling dynamic networks via a latent space approach. Using a latent space approach to model such networks allows the researcher to model both the local and global structure of the network, inherently accounts for transitivity, and yields rich and meaningful visualization which can easily be interpreted for qualitative inference on the network. A Markov chain Monte Carlo (MCMC) estimation method within a Bayesian setting is presented. Several useful tools for the researcher arise from this estimation method. First, a method of predicting future relations, or edges, is given. Second, missing data can easily be incorporated into the model, obtaining a posterior probability of each missing edge. Third, a novel concept called nodal influence is introduced which describes how one actor can influence the edges of another actor. Detection of such nodal influence is given via computationally efficient posterior estimation. This model is shown to outperform the existing method, as well as being able to handle richer and more complex data than the existing method. The MCMC algorithm is made scalable by utilizing a log likelihood approximation proposed in the literature, slightly adapted to allow for missing data.

Many of the dynamic networks that arise inherently have weighted edges. The latent space model is extended to handle a variety of types of weighted edges which arise. In particular, the model is extended to account for relational data that can be viewed as, conditioning on the latent actor positions, having come from an exponential family of distributions. An example is also given which demonstrates how, through data augmentation, a similar strategy can be employed when this is not the case. The log likelihood approximation method is then extended to make the MCMC algorithms scalable for weighted networks.

Of particular interest is Newcomb's fraternity data, a network which captures the evolution and formation of a network beginning in its most nascent form and ending at a stabilized form. The previous model is modified in two non-trivial ways; the first allows for the modeling of rank-order data, which does not fall into the broad categories of weighted network data given previously, and the second allows for the estimation of the evolution of the stability of the network. Next, it is shown how to use the uncertainties associated with the posterior estimation for subgroup detection and for determining the time at which these subgroups formed. Finally, the model parameters are used to find the association between individual stability and popularity.

A longitudinal mixture model is described which can be used to make hard or soft clustering assignments for p-dimensional real-valued data. This model accounts for temporal dependence of both the clustering assignment and the object to be clustered. Additionally, the model allows for covariates which may aid in explaining the clustering assignments. The solutions for implementing the generalized EM algorithm are presented. Recursive relationships are derived which allow the computational cost to grow linearly with time rather than exponentially.

The latent space framework and the longitudinal clustering model are combined to perform community detection within dynamic network data, where the communities' characteristics are fixed but the membership of each community can evolve over time. This method can handle directed or undirected weighted dynamic network data. For community detection within directed or undirected binary networks, a novel model is given along with an efficient variational Bayes estimation algorithm. Both methods are shown to have better performance than using community detection methodology which does not borrow information across time.
Issue Date: 2015-04-21
Rights Information: Copyright 2015 Daniel K Sewell
Date Available in IDEALS: 2015-07-22
Date Deposited: May 2015
