GPUOcelot
Split all kernels in a module into sub-kernels. The sub-kernels should be called as functions from the main kernel. The assumption is that all threads will execute a sub-kernel, hit a barrier, and enter the next sub-kernel.
#include <SubkernelFormationPass.h>
Classes
class ExtractKernelsPass
Public Types
typedef std::vector<ir::PTXKernel*> KernelVector
Public Member Functions
SubkernelFormationPass(unsigned int expectedRegionSize = 50)
void runOnModule(ir::Module &m)
    Run the pass on a specific module.
Split all kernels in a module into sub-kernels. The sub-kernels should be called as functions from the main kernel. The assumption is that all threads will execute a sub-kernel, hit a barrier, and enter the next sub-kernel.
This pass may optionally insert an explicit scheduler kernel that is responsible for doing fine-grained scheduling of the next function to execute and control transition between functions. This is necessary to support intelligent scheduling on architectures without runtime support (mainly GPUs).
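The execution model described above can be illustrated with a small toy sketch. It does not use Ocelot's IR: a kernel is modelled as a flat list of instruction strings, and `splitIntoSubkernels` and `runMainKernel` are hypothetical helpers invented for this example. It also assumes that `expectedRegionSize` acts as an upper bound on the number of instructions per sub-kernel, which this page does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a kernel: a flat list of instruction strings.
// This is NOT Ocelot's ir::PTXKernel.
using Instructions = std::vector<std::string>;

// Hypothetical helper: cut a kernel into consecutive regions holding at most
// expectedRegionSize instructions each (assumed meaning of the parameter).
std::vector<Instructions> splitIntoSubkernels(const Instructions& kernel,
                                              unsigned int expectedRegionSize = 50) {
    assert(expectedRegionSize > 0);
    std::vector<Instructions> subkernels;
    for (std::size_t begin = 0; begin < kernel.size(); begin += expectedRegionSize) {
        std::size_t end = std::min(kernel.size(), begin + expectedRegionSize);
        subkernels.emplace_back(kernel.begin() + begin, kernel.begin() + end);
    }
    return subkernels;
}

// Hypothetical "main kernel": every thread runs sub-kernel i, then all threads
// meet at a barrier before any of them enters sub-kernel i + 1.
std::vector<std::string> runMainKernel(const std::vector<Instructions>& subkernels,
                                       unsigned int threads) {
    std::vector<std::string> trace;
    for (std::size_t i = 0; i < subkernels.size(); ++i) {
        for (unsigned int t = 0; t < threads; ++t) {
            trace.push_back("thread " + std::to_string(t) +
                            " runs subkernel " + std::to_string(i));
        }
        trace.push_back("barrier");
    }
    return trace;
}
```

The real pass operates on a kernel's control flow rather than a linear instruction list, so this sketch only shows the "execute, barrier, next sub-kernel" ordering, not how regions are actually chosen.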
typedef std::vector<ir::PTXKernel*> transforms::SubkernelFormationPass::KernelVector
transforms::SubkernelFormationPass::SubkernelFormationPass(unsigned int expectedRegionSize = 50)
void transforms::SubkernelFormationPass::runOnModule(ir::Module &m) [virtual]
Run the pass on a specific module.
Implements transforms::ModulePass.
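As a rough sketch of the call pattern implied by the signatures above, the following uses stand-in declarations in a `sketch` namespace so that it compiles without Ocelot. Everything in that namespace is an assumption made for illustration: `sketch::Module`, the reduced `sketch::ModulePass`, the `expectedRegionSize()` accessor, the `runPass` driver, and the placeholder body of `runOnModule` are not taken from the library.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for ir::Module: only records kernel names.
struct Module {
    std::vector<std::string> kernels;
};

// Stand-in for transforms::ModulePass, reduced to the one method listed above.
class ModulePass {
public:
    virtual ~ModulePass() = default;
    virtual void runOnModule(Module& m) = 0;
};

// Stand-in mirroring the documented constructor and runOnModule signature.
class SubkernelFormationPass : public ModulePass {
public:
    explicit SubkernelFormationPass(unsigned int expectedRegionSize = 50)
        : _expectedRegionSize(expectedRegionSize) {}

    void runOnModule(Module& m) override {
        // Placeholder body: the real pass rewrites kernels; this only tags them.
        for (auto& name : m.kernels) name += "_split";
    }

    // Hypothetical accessor, added only so the default can be inspected.
    unsigned int expectedRegionSize() const { return _expectedRegionSize; }

private:
    unsigned int _expectedRegionSize;
};

// A driver that only knows the base-class interface.
inline void runPass(ModulePass& pass, Module& m) { pass.runOnModule(m); }

} // namespace sketch
```

Because `runOnModule` is virtual, a driver such as the hypothetical `runPass` can hold the pass through the base interface and still reach the derived implementation.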