GPUOcelot
A class for a pass that removes all barriers from a PTX kernel.
#include <RemoveBarrierPass.h>
Public Member Functions
RemoveBarrierPass (unsigned int kernelId=0, const ir::ExternalFunctionSet *externals=0)
void initialize (const ir::Module &m)
    Initialize the pass using a specific module.
void runOnKernel (ir::IRKernel &k)
    Run the pass on a specific kernel in the module.
void finalize ()
    Finalize the pass.
Public Attributes
bool usesBarriers
This implementation identifies barriers and splits the basic block containing each barrier into two. The first block contains all of the code before the barrier, instructions that spill the live variables to a stack in local memory, and a tail call to resume this kernel. A local variable is allocated on the stack to indicate the program entry point.
The program entry point is augmented with a conditional branch to the second block of each split barrier, selected by the value of the program entry point variable. The second block is augmented with code to load the live variables from the local memory stack.
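The transformation described above can be illustrated with a small host-side model. This is a minimal sketch and not Ocelot's implementation: `ThreadContext`, `transformedKernel`, and `runAllThreads` are hypothetical names, the "tail call" is modeled as a plain return to a scheduler loop, and the local-memory stack is modeled as a `std::vector`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-thread state (illustrative names, not Ocelot's).
struct ThreadContext {
    unsigned int tid = 0;
    unsigned int entryPoint = 0;   // selects which split block to resume at
    std::vector<int> localStack;   // stands in for the local-memory spill stack
    bool finished = false;
    int result = 0;
};

// Conceptual original kernel:
//   v = tid * 10;  shared[tid] = v;  bar.sync;  result = v + shared[(tid + 1) % n];
// After the split, each call executes one barrier-free region and returns.
void transformedKernel(ThreadContext& t, std::vector<int>& shared) {
    const std::size_t n = shared.size();
    if (t.entryPoint == 0) {
        // First block: the code that preceded the barrier.
        int v = static_cast<int>(t.tid) * 10;
        shared[t.tid] = v;
        // Spill the live variable, record the resume point, and yield.
        t.localStack.push_back(v);
        t.entryPoint = 1;
        return;  // models the tail call that resumes this kernel later
    }
    // Second block: reload the live variable, then run the code after the barrier.
    int v = t.localStack.back();
    t.localStack.pop_back();
    t.result = v + shared[(t.tid + 1) % n];
    t.finished = true;
}

// Assumed scheduler: every thread reaches the former barrier before any resumes.
std::vector<int> runAllThreads(unsigned int n) {
    std::vector<int> shared(n, 0);
    std::vector<ThreadContext> threads(n);
    for (unsigned int i = 0; i < n; ++i) threads[i].tid = i;

    bool anyPending = true;
    while (anyPending) {
        anyPending = false;
        for (ThreadContext& t : threads) {
            if (t.finished) continue;
            transformedKernel(t, shared);
            if (!t.finished) anyPending = true;
        }
    }

    std::vector<int> results;
    for (const ThreadContext& t : threads) results.push_back(t.result);
    return results;
}
```

Calling `runAllThreads(4)` in this model runs every thread through the first block before any thread enters the second, which is the ordering the original barrier guaranteed.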
transforms::RemoveBarrierPass::RemoveBarrierPass (unsigned int kernelId=0, const ir::ExternalFunctionSet *externals=0)
void transforms::RemoveBarrierPass::finalize () [virtual]
Finalize the pass.
Implements transforms::KernelPass.
void transforms::RemoveBarrierPass::initialize (const ir::Module &m) [virtual]
Initialize the pass using a specific module.
Implements transforms::KernelPass.
void transforms::RemoveBarrierPass::runOnKernel (ir::IRKernel &k) [virtual]
Run the pass on a specific kernel in the module.
Implements transforms::KernelPass.
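The three member functions above suggest a simple lifecycle for a kernel pass. The following is a sketch only: the call order (initialize, then runOnKernel, then finalize) is an assumption inferred from the member descriptions, `applyKernelPass` is a hypothetical helper, and the `Mock*` types are stand-ins so the example compiles without the Ocelot headers. In a real build the pass would normally be driven by Ocelot's own infrastructure rather than by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: drives one pass through its three phases.
// Works with any types that provide initialize(module),
// runOnKernel(kernel), and finalize().
template <typename Pass, typename Module, typename Kernel>
void applyKernelPass(Pass& pass, const Module& module, Kernel& kernel) {
    pass.initialize(module);    // set up the pass for this module
    pass.runOnKernel(kernel);   // transform one kernel in place
    pass.finalize();            // release any per-module state
}

// Stand-ins for ir::Module and ir::IRKernel (assumptions, not Ocelot types).
struct MockModule {};
struct MockKernel {
    bool hasBarriers = true;
};

// Stand-in with the same method names as RemoveBarrierPass; it records
// the calls it receives instead of rewriting PTX.
struct MockRemoveBarrierPass {
    std::vector<std::string> calls;

    void initialize(const MockModule&) { calls.push_back("initialize"); }

    void runOnKernel(MockKernel& k) {
        calls.push_back("runOnKernel");
        k.hasBarriers = false;  // models "all barriers removed"
    }

    void finalize() { calls.push_back("finalize"); }
};
```

Because the helper is a template, the same three-call sequence could in principle be applied to the real `transforms::RemoveBarrierPass` with an `ir::Module` and an `ir::IRKernel`, provided those objects are obtained through Ocelot's own API.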