Campus Units

Electrical and Computer Engineering, Mathematics

Document Type

Article

Publication Version

Submitted Manuscript

Publication Date

2019

Journal or Book Title

arXiv

Abstract

Distributed matrix computations (matrix-vector and matrix-matrix multiplications) are at the heart of several tasks within the machine learning pipeline. Distributed clusters, however, are well known to suffer from the problem of stragglers (slow or failed nodes). Prior work in this area has presented straggler mitigation strategies based on polynomial evaluation/interpolation, but such approaches suffer from numerical problems (blow-up of round-off errors) owing to the high condition numbers of the corresponding Vandermonde matrices. In this work, we introduce a novel solution approach that relies on embedding distributed matrix computations into the structure of a convolutional code. This simple innovation allows us to develop a provably numerically robust and efficient (fast) solution for distributed matrix-vector and matrix-matrix multiplication.

Comments

This is a pre-print of the article Das, Anindya B., Aditya Ramamoorthy, and Namrata Vaswani. "Random Convolutional Coding for Robust and Straggler Resilient Distributed Matrix Computation." arXiv preprint arXiv:1907.08064 (2019). Posted with permission.

Copyright Owner

The Authors

Language

en

File Format

application/pdf
