Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications

Haicheng Wu, Gregory Diamos, Jin Wang, Si Li, and Sudhakar Yalamanchili. “Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications.” International Journal of High Performance Computing Applications (IJHPCA). February 2012.

Abstract

This paper identifies important classes of program control flow in applications targeted to commercially available graphics processing units (GPUs) and characterizes their presence in real workloads, such as those written in CUDA and OpenCL. Broadly, control flow can be characterized as structured or unstructured. It is shown that most existing techniques for handling divergent control in bulk synchronous GPU applications handle structured control flow efficiently, some are incapable of executing unstructured control flow directly, and none handles unstructured control flow efficiently. An approach to reduce the impact of this problem is provided.

An unstructured-to-structured control flow transformation for CUDA kernels is implemented, and its performance impact on a large class of GPU applications is assessed. The results quantify the importance of improving support for programs with unstructured control flow on GPUs. The transformation can also be used in a JIT compiler pass to execute programs with unstructured control flow on GPU devices that do not support it. This is an important capability for the execution portability of applications that use GPU accelerators.
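To make the distinction concrete, the following is a minimal sketch in plain C++ (not CUDA, and not the paper's implementation) of one common source of unstructured control flow: a short-circuit condition lowered to branches, so that a block is reachable along two different edges. The function names and the `work` counter are hypothetical, introduced only for illustration. The structured version removes the shared join by duplicating the body, which is one generic way to restructure such a graph; whether this matches the transformation evaluated in the paper is an assumption.

```cpp
#include <cassert>

// Unstructured form: `body` is entered from two different branch points,
// which is how `if (c1 || c2) { body }` typically looks after lowering.
// `work` stands in for the instructions executed along each path.
int unstructured(bool c1, bool c2) {
    int work = 0;
    if (c1) goto body;   // first edge into `body`
    work += 1;           // cost of evaluating the second condition
    if (c2) goto body;   // second edge into `body`
    goto done;
body:
    work += 10;          // the shared body
done:
    return work;
}

// Structured form: only nested if/else regions, each with a single entry
// and a single exit. The body is duplicated so no block has two unrelated
// predecessors. This trades code size for structure.
int structured(bool c1, bool c2) {
    int work = 0;
    if (c1) {
        work += 10;      // copy of the body for the c1 path
    } else {
        work += 1;       // cost of evaluating the second condition
        if (c2) {
            work += 10;  // copy of the body for the c2 path
        }
    }
    return work;
}

// Checks that both forms agree on every combination of inputs.
void check_equivalence() {
    for (int i = 0; i < 4; ++i) {
        bool c1 = (i & 1) != 0;
        bool c2 = (i & 2) != 0;
        assert(unstructured(c1, c2) == structured(c1, c2));
    }
}
```

Calling `check_equivalence()` exercises all four input combinations. The duplicated body hints at why such transformations can increase static code size, which is one of the costs a performance assessment would need to weigh.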

Download

Characterization and Transformation of Unstructured Control Flow in Bulk Synchronous GPU Applications [PDF]

Citation

@article{Wu:2012:CTU:2237840.2237843,
author = {Wu, Haicheng and Diamos, Gregory and Wang, Jin and Li, Si and Yalamanchili, Sudhakar},
title = {Characterization and transformation of unstructured control flow in bulk synchronous GPU applications},
journal = {Int. J. High Perform. Comput. Appl.},
issue_date = {May 2012},
volume = {26},
number = {2},
month = may,
year = {2012},
issn = {1094-3420},
pages = {170--185},
numpages = {16},
url = {http://dx.doi.org/10.1177/1094342011434814},
doi = {10.1177/1094342011434814},
acmid = {2237843},
publisher = {Sage Publications, Inc.},
address = {Thousand Oaks, CA, USA},
keywords = {GPU, branch divergence, unstructured control flow},
}