Degree Type

Dissertation

Date of Award

2014

Degree Name

Doctor of Philosophy

Department

Statistics

First Advisor

Mark S. Kaiser

Second Advisor

Daniel J. Nordman

Abstract

The statistical analysis of networks is a popular research topic with ever-widening applications. In this work, we introduce a new class of models for network analysis, called local structure graph models (LSGMs). The approach specifies a network model through local features and allows for an interpretable and controllable local dependence structure. In particular, LSGMs are formulated through a set of full conditional distributions, one for each network edge (e.g., the probability of edge presence/absence), which depend functionally on neighborhoods or subcollections of other network edges. Hence, LSGMs correspond to a type of Markov Random Field (MRF) model applied to graph edges. The modeling features and interpretation of LSGMs are demonstrated through several numerical studies and illustrated through a network data example involving tornado occurrences. LSGMs are also shown to provide an alternative specification of models belonging to another popular class of random graph models, exponential random graph models (ERGMs), which specify a model through a joint distribution on the entire collection of graph edges. An ERGM induces conditional distributions and neighborhoods, rather than explicitly defining them as in the LSGM approach. As one consequence of their conditional specification, LSGMs have the advantage of allowing direct control and separate interpretation of parameters influencing large-scale (e.g., marginal means) and small-scale (i.e., dependence) structures in a graph model. This is possible with LSGMs through so-called centered parameterizations of MRF models, which ERGMs are shown to lack. The centered parameterization and conditional specification of LSGMs further provide important advantages in graph modeling when incorporating covariate information from nodes, as illustrated with two further network data examples.
However, the centered parameterization was developed for MRFs under an assumption of pairwise-only dependence, meaning that dependence is modeled between pairs of dependent edges only. This particular dependence structure may be inappropriate for modeling network data that exhibit transitivity, or a prevalence of triangles within the network, which has been identified as an important feature of various networks. Consequently, the centered parameterization for MRFs is extended to account for triples of dependent edges in LSGMs. This extension then allows for the explicit modeling of transitivity in LSGMs, while retaining the same interpretable separation and control of large- and small-scale effects in a graph model and facilitating the use of covariate information. At the same time, the ability to model transitivity does not imply that this model feature should be commonly used or applied without cautious model diagnostics, which are currently lacking for graph models and for ERGMs in particular. By developing simulation-based model assessments for random graphs, we provide in-depth examinations and analyses of two commonly used example networks, demonstrating that real network data may not, in fact, support the inclusion of transitivity in a graph model.

Copyright Owner

Emily Taylor Casleton

Language

en

File Format

application/pdf

File Size

155 pages
