Degree Type

Dissertation

Date of Award

2012

Degree Name

Doctor of Philosophy

Department

Electrical and Computer Engineering

First Advisor

Joseph Zambreno

Abstract

This thesis proposes an integrated hardware-software solution for improving Single-Instruction Multiple-Thread (SIMT) branching efficiency. Unlike current SIMT hardware branching architectures, this solution lets programmers fine-tune branching behavior for their application, or lets the compiler implement a generic software solution. To support a wide range of SIMT applications with different control flow properties, three branching methods are implemented in hardware with configurable software instructions. The three branching methods are the contemporary Immediate Post-Dominator Re-convergence that is currently implemented in SIMT processors, a proposed Hyper-threaded SIMT processor that maintains statically allocated thread warps, and proposed Dynamic Micro-Kernels that modify thread warps during run-time execution. Each of the implemented branching methods has its own strengths and weaknesses, and each yields different performance improvements depending on the application. SIMT hyper-threading turns a single SIMT processor core into multiple virtual processors. These virtual processors run divergent control flow paths in parallel using threads from the same warp. The creation of virtual processor cores is controlled by a per-warp stack that is managed through software instructions. Dynamic Micro-Kernels create new threads at run-time to execute divergent control flow paths instead of using branching instructions. A spawn instruction creates threads at run-time; once created, these threads are placed into new warps with similar threads following the same control flow path.
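
As a rough illustration of why control flow divergence is costly, and why grouping threads that follow the same path can help, the toy Python model below counts warp issue slots for one divergent branch. It is only a sketch under stated assumptions, not the hardware or instruction set described in the thesis: the function names, the 4-thread warp width, and the per-path instruction counts are all hypothetical, and the model ignores memory effects, scheduling, and the cost of creating or regrouping threads.

```python
from collections import defaultdict

WARP_SIZE = 4  # hypothetical toy width, chosen only to keep the example small


def serialized_issue_slots(warps, path_len):
    """Issue slots when each warp runs its divergent paths one after another.

    warps: list of warps; each warp is a list of path labels, one per thread.
    path_len: dict mapping a path label to the instruction count on that path.
    """
    slots = 0
    for warp in warps:
        # Every distinct path in the warp is issued once, with the threads
        # on the other paths masked off.
        for path in set(warp):
            slots += path_len[path]
    return slots


def regrouped_issue_slots(warps, path_len, warp_size=WARP_SIZE):
    """Issue slots when threads are first regrouped into warps by path."""
    threads_on_path = defaultdict(int)
    for warp in warps:
        for path in warp:
            threads_on_path[path] += 1
    slots = 0
    for path, n_threads in threads_on_path.items():
        n_warps = -(-n_threads // warp_size)  # ceiling division
        slots += n_warps * path_len[path]
    return slots


# Two warps whose threads alternate between paths "A" and "B".
warps = [["A", "B", "A", "B"], ["A", "B", "A", "B"]]
path_len = {"A": 10, "B": 10}

serialized = serialized_issue_slots(warps, path_len)  # 2 warps * (10 + 10) = 40
regrouped = regrouped_issue_slots(warps, path_len)    # 1 warp * 10 + 1 warp * 10 = 20
```

In this idealized case, regrouping halves the issue slots because every regrouped warp is fully occupied. With fewer threads per path, or with non-zero regrouping overhead, the gain would shrink, which is consistent with the statement above that each branching method has its own strengths and weaknesses.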

The integrated hardware-software branching architectures in this thesis are evaluated using multiple realistic benchmarks with varying control flow divergence. Synthetic benchmarks, designed to test specific branching conditions and isolate common branching behaviors, are also used for evaluation. Each of the hardware-implemented branching solutions is tested in isolation using different software algorithms. Results show improved performance for divergent applications and show that the choice of software algorithm affects performance.

DOI

https://doi.org/10.31274/etd-180810-2363

Copyright Owner

Michael Steffen

Language

en

File Format

application/pdf

File Size

130 pages
