Computing SpMV on FPGAs

Townsend, Kevin

File

Townsend_iastate_0097E_15534.pdf (1.35 MB)

Date

2016-01-01

Authors

Townsend, Kevin

Advisor

Joseph Zambreno

Organizational Units

Electrical and Computer Engineering

The Department of Electrical and Computer Engineering (ECpE) contains two focuses. The focus on Electrical Engineering teaches students in the fields of control systems, electromagnetics and non-destructive evaluation, microelectronics, electric power & energy systems, and the like. The Computer Engineering focus teaches in the fields of software systems, embedded systems, networking, information security, computer architecture, etc.

History
The Department of Electrical Engineering was formed in 1909 from the division of the Department of Physics and Electrical Engineering. In 1985 its name changed to Department of Electrical Engineering and Computer Engineering. In 1995 it became the Department of Electrical and Computer Engineering.

Dates of Existence
1909-present

Historical Names

Department of Electrical Engineering (1909-1985)
Department of Electrical Engineering and Computer Engineering (1985-1995)

Related Units

College of Engineering (parent college)
Department of Physics and Electrical Engineering (predecessor)

Department

Electrical and Computer Engineering

Abstract

Hundreds of papers address accelerating sparse matrix-vector multiplication (SpMV); however, only a handful target FPGAs. Some claim that FPGAs inherently perform worse than CPUs and GPUs. FPGAs do perform worse for some applications, such as matrix-matrix multiplication and matrix-vector multiplication: CPUs and GPUs have too much memory bandwidth and too much floating-point computation power for FPGAs to compete. However, the low ratio of computations to memory operations and the irregular memory access of SpMV trip up both CPUs and GPUs. We see this as a leveling of the playing field for FPGAs.
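The "low ratio of computations to memory operations" can be made concrete with a rough back-of-the-envelope estimate. The sketch below is illustrative only and is not taken from the thesis: it assumes 8-byte values and 4-byte column indices, ignores traffic for the x and y vectors (so the SpMV figure is an upper bound), and compares against a dense n-by-n matrix-matrix product in which each matrix moves through memory exactly once. The function names are hypothetical.

```python
def spmv_flops_per_byte(value_bytes=8, index_bytes=4):
    """Upper bound on SpMV arithmetic intensity (assumed storage sizes).

    Each nonzero costs one multiply and one add (2 flops) and requires
    reading at least its value and its column index from memory.
    """
    return 2 / (value_bytes + index_bytes)


def dense_matmul_flops_per_byte(n, value_bytes=8):
    """Idealized intensity of a dense n x n matrix-matrix product.

    Assumes 2*n^3 flops and that A, B, and C each move through memory once.
    """
    return (2 * n**3) / (3 * n * n * value_bytes)


if __name__ == "__main__":
    print("SpMV flops/byte (upper bound):", spmv_flops_per_byte())
    print("Dense matmul flops/byte, n=1200:", dense_matmul_flops_per_byte(1200))
```

Under these assumptions SpMV performs well under one floating-point operation per byte fetched regardless of matrix size, while the dense product's ratio grows with n, which is consistent with the abstract's point that raw floating-point power matters less for SpMV.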

Our implementation focuses on three pillars: matrix traversal, multiply-accumulator design, and matrix compression. First, most SpMV implementations traverse the matrix in row-major order, but we mix column and row traversal. Second, to accommodate the new traversal, the multiply-accumulator stores many intermediate y values. Third, we compress the matrix to increase the transfer rate of the matrix from RAM to the FPGA. Together these pillars enable our SpMV implementation to perform competitively with CPUs and GPUs.
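To illustrate the first two pillars, here is a minimal software sketch contrasting plain row-major traversal with one possible mixed traversal. It is an assumption-laden illustration, not the thesis's actual hardware design: the banding scheme, the `band_height` parameter, and both function names are hypothetical. The matrix is split into horizontal bands of rows; bands are visited top to bottom, entries inside a band are visited column by column, and a small array holds one intermediate y value per row of the band.

```python
def spmv_row_major(entries, x, n_rows):
    """Baseline: visit nonzeros row by row, accumulating directly into y."""
    y = [0.0] * n_rows
    for r, c, v in sorted(entries):  # tuples sort by row, then column
        y[r] += v * x[c]
    return y


def spmv_banded(entries, x, n_rows, band_height):
    """Mixed traversal (hypothetical scheme): row bands, column order inside.

    Requires band_height intermediate y values to be held at once.
    """
    y = [0.0] * n_rows
    for band_start in range(0, n_rows, band_height):
        band_end = min(band_start + band_height, n_rows)
        band = [e for e in entries if band_start <= e[0] < band_end]
        band.sort(key=lambda e: (e[1], e[0]))  # column-major within the band
        partial = [0.0] * (band_end - band_start)  # intermediate y values
        for r, c, v in band:
            partial[r - band_start] += v * x[c]
        y[band_start:band_end] = partial
    return y


# Nonzeros of a small 4 x 4 example matrix as (row, column, value) tuples.
ENTRIES = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0),
           (2, 0, 4.0), (2, 3, 5.0), (3, 1, 6.0), (3, 3, 7.0)]
X = [1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    print(spmv_row_major(ENTRIES, X, 4))
    print(spmv_banded(ENTRIES, X, 4, band_height=2))
```

Both functions compute the same product; the banded version visits consecutive entries that share an x element, at the cost of keeping several partial sums live at the same time, which is the trade-off the second pillar describes.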

Copyright

2016-01-01

Collections

Theses and Dissertations
