Degree Type

Dissertation

Date of Award

2021

Degree Name

Doctor of Philosophy

Department

Statistics

Major

Statistics

First Advisor

Zhengyuan Zhu

Abstract

This dissertation focuses on developing novel methods for studying model heterogeneity and efficient consensus learning algorithms for analyzing data over spatial networks. On the one hand, recent developments in remote sensing technology have made it possible to collect complex and massive spatial data sets. These complex data motivate us to explore model heterogeneity (i.e., to find model-based spatial clusters) over the spatial domain/network. In Chapter 2, we first focus on the linear regression model and study the scenario where the number of locations is fixed and the local model is identifiable. We propose an adaptive spanning tree-based fusion lasso approach that simultaneously estimates the model and clusters the data sets over the spatial network. We show that simplifying the complex network topology to a tree structure significantly improves both estimation and computational efficiency. In Chapter 3, we extend the study to a spatial partial linear model and consider the case where each location has only one observation. In our model, a nonparametric intercept is adopted to absorb the spatial random effect and is estimated by bivariate splines over a triangulation. For the coefficient clustering part, we propose a novel forest lasso penalty, in which an adaptive clustering tree structure is constructed by averaging a multitude of initial random spanning trees. We show that the proposed fusion penalty improves estimation accuracy at a limited computational cost.
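To make the tree-based fusion idea concrete, the display below is a generic sketch of a spanning-tree fused-lasso objective for location-wise linear models; the notation (responses y_i, designs X_i, coefficients beta_i, spanning tree T, adaptive weights w_ij, and tuning parameter lambda) is illustrative and not taken from the dissertation:

```latex
\min_{\beta_1,\dots,\beta_n}\;
\sum_{i=1}^{n} \left\| y_i - X_i \beta_i \right\|_2^2
\;+\; \lambda \sum_{(i,j)\in\mathcal{T}} w_{ij} \left\| \beta_i - \beta_j \right\|_2
```

Restricting the fusion terms to the edges of a spanning tree of the spatial network cuts their number from the full edge count down to n-1, which is one way a tree simplification can reduce computational cost; locations whose estimated coefficients are fused together form the spatial clusters.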

On the other hand, the massive volume of distributed data sets makes centralized analysis difficult. This motivates us to develop distributed learning methods that allow the computation to be carried out within the network. Although various network consensus learning algorithms have been proposed, existing methods remain unsatisfactory in terms of communication efficiency and data privacy. In Chapters 4 and 5, we focus on improving the communication performance of network consensus learning methods. Inspired by widely adopted compression methods, we propose two differential-coded decentralized gradient descent algorithms, in which sparsified or quantized messages are communicated among the computation nodes for efficient communication. In Chapter 6, to address privacy concerns, we design a privacy-preserving decentralized method under the framework of differential privacy: a sparse differential Gaussian-masking decentralized stochastic gradient descent algorithm. We show that combining the Gaussian mechanism with sparsification yields a stronger privacy guarantee. Thorough numerical experiments verify the performance of our algorithms.
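As a rough, self-contained sketch of the two communication ideas described above (difference coding with sparsification, plus Gaussian masking of the transmitted message), the Python snippet below runs decentralized gradient descent in which each node sends only a top-k sparsified difference between its state and the stale copy its neighbors already hold, optionally perturbed with Gaussian noise. All function names, the mixing-matrix setup, and the parameters are illustrative assumptions, not the dissertation's actual algorithms:

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zeroing the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def compressed_decentralized_gd(grads, W, x0, steps=300, lr=0.05,
                                gamma=0.5, k=2, sigma=0.0, seed=0):
    """Decentralized GD with difference-coded, sparsified messages (sketch).

    grads : list of callables; grads[i](x) returns node i's local gradient
    W     : symmetric, doubly stochastic mixing matrix of the network
    x0    : common initial point (1-D array)
    sigma : if > 0, Gaussian noise is added to each transmitted message,
            loosely mimicking a Gaussian-masking privacy mechanism
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    x = np.tile(x0, (n, 1)).astype(float)   # local iterates, one row per node
    x_hat = np.zeros_like(x)                # stale copies known network-wide
    for _ in range(steps):
        for i in range(n):                  # local gradient steps
            x[i] -= lr * grads[i](x[i])
        for i in range(n):                  # transmit compressed differences
            msg = top_k(x[i] - x_hat[i], k)
            if sigma > 0:                   # mask the transmitted coordinates
                msg += sigma * rng.normal(size=msg.shape) * (msg != 0)
            x_hat[i] += msg                 # all nodes update their copy of i
        x += gamma * (W @ x_hat - x_hat)    # gossip step using stale copies
    return x.mean(axis=0)

if __name__ == "__main__":
    # toy distributed least-squares problem over four nodes
    rng = np.random.default_rng(1)
    A = [rng.normal(size=(30, 5)) for _ in range(4)]
    beta = rng.normal(size=5)
    y = [Ai @ beta for Ai in A]
    grads = [lambda x, Ai=Ai, yi=yi: Ai.T @ (Ai @ x - yi) / len(yi)
             for Ai, yi in zip(A, y)]
    W = np.full((4, 4), 0.25)               # complete-graph averaging weights
    est = compressed_decentralized_gd(grads, W, np.zeros(5))
    print("max error:", np.abs(est - beta).max())
```

The design point the sketch illustrates: because each node transmits only the change since its last broadcast, the messages stay sparse once the iterates stabilize, and a noise term added to those few coordinates is one simple way to trade accuracy for privacy.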

DOI

https://doi.org/10.31274/etd-20210609-207

Copyright Owner

Xin Zhang

Language

en

File Format

application/pdf

Number of Pages

169 pages

Available for download on Wednesday, June 07, 2023
