GPUOcelot

executive::ExecutableKernel Class Reference

#include <ExecutableKernel.h>

[Inheritance diagram for executive::ExecutableKernel]
[Collaboration diagram for executive::ExecutableKernel]


Public Types

enum  CacheConfiguration { CacheConfigurationDefault, CachePreferShared, CachePreferL1, CacheConfiguration_invalid }
typedef std::vector< trace::TraceGenerator * > TraceGeneratorVector
typedef std::vector< const ir::Texture * > TextureVector

Public Member Functions

 ExecutableKernel (const ir::IRKernel &k, executive::Device *d=0)
 ExecutableKernel (executive::Device *d=0)
virtual ~ExecutableKernel ()
virtual bool executable () const
 Determines whether kernel is executable.
virtual void launchGrid (int width, int height, int depth)=0
 Launches the kernel on a grid of the given width, height, and depth.
virtual size_t mapArgumentOffsets ()
 compute argument offsets for argument data
virtual void setArgumentBlock (const unsigned char *argument, size_t argumentSize)
 given a block of argument memory, sets the values of each argument
virtual size_t getArgumentBlock (unsigned char *argument, size_t maxSize) const
 gets the values of each argument as a block of binary data
virtual void setKernelShape (int x, int y, int z)=0
 Sets the shape of a kernel.
virtual void setExternSharedMemorySize (unsigned int)=0
 Changes the amount of external shared memory.
virtual void setCacheConfiguration (CacheConfiguration config)
 Sets the cache configuration of the kernel.
virtual CacheConfiguration getCacheConfiguration () const
 Gets the cache configuration of the kernel.
virtual void setWorkerThreads (unsigned int workerThreadLimit)=0
 Sets the max number of pthreads this kernel can use.
virtual void updateArgumentMemory ()=0
 Indicate that the kernel's arguments have been updated.
virtual void updateMemory ()=0
 Indicate that other memory has been updated.
virtual TextureVector textureReferences () const =0
 Get a vector of all textures referenced by the kernel.
void traceEvent (const trace::TraceEvent &event) const
void tracePostEvent (const trace::TraceEvent &event) const
virtual void addTraceGenerator (trace::TraceGenerator *generator)
virtual void removeTraceGenerator (trace::TraceGenerator *generator)
virtual void setExternalFunctionSet (const ir::ExternalFunctionSet &s)=0
virtual void clearExternalFunctionSet ()=0
ir::ExternalFunctionSet::ExternalFunction * findExternalFunction (const std::string &name) const
unsigned int constMemorySize () const
unsigned int localMemorySize () const
unsigned int globalLocalMemorySize () const
unsigned int maxThreadsPerBlock () const
unsigned int registerCount () const
unsigned int sharedMemorySize () const
unsigned int externSharedMemorySize () const
unsigned int totalSharedMemorySize () const
unsigned int argumentMemorySize () const
unsigned int parameterMemorySize () const
const ir::Dim3 & blockDim () const
const ir::Dim3 & gridDim () const

Public Attributes

executive::Device * device

Protected Member Functions

void initializeTraceGenerators ()
void finalizeTraceGenerators ()

Protected Attributes

unsigned int _constMemorySize
 Total amount of allocated constant memory size.
unsigned int _localMemorySize
 Total amount of allocated local memory size.
unsigned int _globalLocalMemorySize
 Total amount of allocated global local memory per thread.
unsigned int _maxThreadsPerBlock
 Maximum number of threads launched per block.
unsigned int _registerCount
 Number of registers required by each thread.
unsigned int _sharedMemorySize
 The amount of allocated static shared memory.
unsigned int _externSharedMemorySize
 The amount of allocated dynamic shared memory.
unsigned int _argumentMemorySize
 Total amount of packed parameter memory.
unsigned int _parameterMemorySize
 Kernel stack parameter memory space.
ir::Dim3 _blockDim
 The block dimensions.
ir::Dim3 _gridDim
 Dimension of grid in blocks.
TraceGeneratorVector _generators
 Attached trace generators.
const ir::ExternalFunctionSet * _externals
 Registered external functions.
CacheConfiguration _cacheConfiguration
 configuration of cache

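Member Typedef Documentation

typedef std::vector< const ir::Texture * > executive::ExecutableKernel::TextureVector
typedef std::vector< trace::TraceGenerator * > executive::ExecutableKernel::TraceGeneratorVector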

Member Enumeration Documentation

enum executive::ExecutableKernel::CacheConfiguration

Enumerator:
CacheConfigurationDefault 
CachePreferShared 
CachePreferL1 
CacheConfiguration_invalid 
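
As an illustration of how calling code might handle the four enumerators, the sketch below redeclares the enumeration locally so that it compiles without the Ocelot headers. The `toString` helper and the names it returns are hypothetical and not part of GPUOcelot.

```cpp
#include <cassert>
#include <string>

// Local copy of the enumeration as listed in this reference, so the sketch
// compiles without the Ocelot headers.
enum CacheConfiguration {
    CacheConfigurationDefault,
    CachePreferShared,
    CachePreferL1,
    CacheConfiguration_invalid
};

// Hypothetical helper (not part of Ocelot): printable name for a value.
std::string toString(CacheConfiguration config) {
    switch (config) {
        case CacheConfigurationDefault:  return "default";
        case CachePreferShared:          return "prefer-shared";
        case CachePreferL1:              return "prefer-L1";
        case CacheConfiguration_invalid: break;
    }
    return "invalid";
}

// Hypothetical validity check: anything at or past the sentinel is rejected.
bool isValid(CacheConfiguration config) {
    return config >= CacheConfigurationDefault
        && config < CacheConfiguration_invalid;
}
```

Treating `CacheConfiguration_invalid` as a sentinel that closes the range is an assumption suggested by its name and position.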

Constructor & Destructor Documentation

executive::ExecutableKernel::ExecutableKernel ( const ir::IRKernel &  k,
executive::Device *  d = 0 
)
executive::ExecutableKernel::ExecutableKernel ( executive::Device *  d = 0)
executive::ExecutableKernel::~ExecutableKernel ( ) [virtual]

Member Function Documentation

void executive::ExecutableKernel::addTraceGenerator ( trace::TraceGenerator *  generator) [virtual]
unsigned int executive::ExecutableKernel::argumentMemorySize ( ) const
const ir::Dim3 & executive::ExecutableKernel::blockDim ( ) const
virtual void executive::ExecutableKernel::clearExternalFunctionSet ( ) [pure virtual]

clear the external function table for the emulated kernel

Implemented in executive::ATIExecutableKernel, executive::EmulatedKernel, executive::LLVMExecutableKernel, and executive::NVIDIAExecutableKernel.

unsigned int executive::ExecutableKernel::constMemorySize ( ) const

attribute accessors - things every executable kernel should know

bool executive::ExecutableKernel::executable ( ) const [virtual]

Determines whether kernel is executable.

Reimplemented from ir::IRKernel.

Reimplemented in executive::EmulatedKernel.

unsigned int executive::ExecutableKernel::externSharedMemorySize ( ) const
void executive::ExecutableKernel::finalizeTraceGenerators ( ) [protected]
ir::ExternalFunctionSet::ExternalFunction * executive::ExecutableKernel::findExternalFunction ( const std::string &  name) const

Find an external function

size_t executive::ExecutableKernel::getArgumentBlock ( unsigned char *  argument,
size_t  maxSize 
) const [virtual]

gets the values of each argument as a block of binary data

Parameters:
argument  pointer to argument memory
maxSize  maximum number of bytes to write to argument memory
Returns:
actual number of bytes required by argument memory
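
Because the return value is the number of bytes required rather than the number written, a caller can query the size first and then copy. The following is a minimal sketch of that pattern using a hypothetical stand-in class, `ArgumentStore`, which mirrors only the two argument-block signatures from this page. The truncation behavior when `maxSize` is too small is an assumption, not documented Ocelot behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in that mirrors only the two argument-block signatures
// from this reference; it is not the Ocelot implementation.
class ArgumentStore {
public:
    // Copies argumentSize bytes from argument into the store.
    void setArgumentBlock(const unsigned char* argument,
                          std::size_t argumentSize) {
        _block.assign(argument, argument + argumentSize);
    }

    // Writes at most maxSize bytes and returns the size actually required
    // (assumed behavior, based on the documented return value).
    std::size_t getArgumentBlock(unsigned char* argument,
                                 std::size_t maxSize) const {
        std::size_t n = std::min(maxSize, _block.size());
        if (n > 0) std::memcpy(argument, _block.data(), n);
        return _block.size();
    }

private:
    std::vector<unsigned char> _block;
};

// Reads the whole block back: first query the size, then copy.
std::vector<unsigned char> readAll(const ArgumentStore& store) {
    std::size_t required = store.getArgumentBlock(nullptr, 0);
    std::vector<unsigned char> out(required);
    std::size_t reported = store.getArgumentBlock(out.data(), out.size());
    assert(reported == required);
    (void)reported;  // silence unused-variable warnings when NDEBUG is set
    return out;
}
```

Comparing the returned size against the buffer size is what lets a caller detect a buffer that was too small.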
ExecutableKernel::CacheConfiguration executive::ExecutableKernel::getCacheConfiguration ( ) const [virtual]

Gets the cache configuration of the kernel.

unsigned int executive::ExecutableKernel::globalLocalMemorySize ( ) const
const ir::Dim3 & executive::ExecutableKernel::gridDim ( ) const
void executive::ExecutableKernel::initializeTraceGenerators ( ) [protected]
virtual void executive::ExecutableKernel::launchGrid ( int  width,
int  height,
int  depth 
) [pure virtual]
unsigned int executive::ExecutableKernel::localMemorySize ( ) const
size_t executive::ExecutableKernel::mapArgumentOffsets ( ) [virtual]

compute argument offsets for argument data

compute parameter offsets for parameter data

Returns:
number of bytes required for argument memory
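
To make the idea of mapping offsets concrete, here is a minimal sketch of one common packing scheme: each argument is placed at the next offset that satisfies its alignment, and the final cursor position is the number of bytes required. The `ArgumentInfo` struct and `mapOffsets` function are hypothetical, and the alignment rule is an assumption; this page does not state which rule Ocelot applies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one kernel argument (not an Ocelot type).
struct ArgumentInfo {
    std::size_t size;       // size of the argument in bytes
    std::size_t alignment;  // required alignment in bytes (power of two)
    std::size_t offset;     // filled in by mapOffsets()
};

// Assigns each argument the next offset that satisfies its alignment and
// returns the total number of bytes the packed block needs.
std::size_t mapOffsets(std::vector<ArgumentInfo>& arguments) {
    std::size_t cursor = 0;
    for (ArgumentInfo& a : arguments) {
        // The bit trick below is only valid for power-of-two alignments.
        assert(a.alignment != 0 && (a.alignment & (a.alignment - 1)) == 0);
        cursor = (cursor + a.alignment - 1) & ~(a.alignment - 1);  // round up
        a.offset = cursor;
        cursor += a.size;
    }
    return cursor;
}
```

For example, a 4-byte argument followed by an 8-byte argument leaves 4 bytes of padding so that the second argument starts at offset 8.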
unsigned int executive::ExecutableKernel::maxThreadsPerBlock ( ) const
unsigned int executive::ExecutableKernel::parameterMemorySize ( ) const
unsigned int executive::ExecutableKernel::registerCount ( ) const
void executive::ExecutableKernel::removeTraceGenerator ( trace::TraceGenerator *  generator) [virtual]
void executive::ExecutableKernel::setArgumentBlock ( const unsigned char *  argument,
size_t  argumentSize 
) [virtual]

given a block of argument memory, sets the values of each argument

Parameters:
argument  pointer to argument memory
argumentSize  number of bytes to write to argument memory
void executive::ExecutableKernel::setCacheConfiguration ( ExecutableKernel::CacheConfiguration  config) [virtual]

Sets the cache configuration of the kernel.

virtual void executive::ExecutableKernel::setExternalFunctionSet ( const ir::ExternalFunctionSet &  s) [pure virtual]

sets an external function table for the emulated kernel

Implemented in executive::ATIExecutableKernel, executive::EmulatedKernel, executive::LLVMExecutableKernel, and executive::NVIDIAExecutableKernel.

virtual void executive::ExecutableKernel::setExternSharedMemorySize ( unsigned  int) [pure virtual]
virtual void executive::ExecutableKernel::setKernelShape ( int  x,
int  y,
int  z 
) [pure virtual]
virtual void executive::ExecutableKernel::setWorkerThreads ( unsigned int  workerThreadLimit) [pure virtual]

Sets the max number of pthreads this kernel can use.

Implemented in executive::ATIExecutableKernel, executive::EmulatedKernel, executive::LLVMExecutableKernel, and executive::NVIDIAExecutableKernel.

unsigned int executive::ExecutableKernel::sharedMemorySize ( ) const
virtual TextureVector executive::ExecutableKernel::textureReferences ( ) const [pure virtual]

Get a vector of all textures referenced by the kernel.

Implemented in executive::ATIExecutableKernel, executive::EmulatedKernel, executive::LLVMExecutableKernel, and executive::NVIDIAExecutableKernel.

unsigned int executive::ExecutableKernel::totalSharedMemorySize ( ) const
void executive::ExecutableKernel::traceEvent ( const trace::TraceEvent &  event) const

Notifies all attached TraceGenerators of an event

void executive::ExecutableKernel::tracePostEvent ( const trace::TraceEvent &  event) const

Notifies all attached TraceGenerators of completion of an event
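
The add, remove, and notify members described on this page follow an observer pattern. The sketch below shows that pattern with hypothetical stand-ins (`Event`, `Generator`, `Notifier`) in place of `trace::TraceEvent`, `trace::TraceGenerator`, and `ExecutableKernel`. The callback names `event` and `postEvent` are assumptions for illustration, not taken from this page.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for trace::TraceEvent.
struct Event { int id; };

// Hypothetical stand-in for trace::TraceGenerator; callback names are assumed.
struct Generator {
    virtual ~Generator() {}
    virtual void event(const Event&) {}
    virtual void postEvent(const Event&) {}
};

// Hypothetical stand-in for the tracing part of ExecutableKernel.
class Notifier {
public:
    void addTraceGenerator(Generator* generator) {
        assert(generator != nullptr);
        _generators.push_back(generator);
    }
    void removeTraceGenerator(Generator* generator) {
        _generators.erase(
            std::remove(_generators.begin(), _generators.end(), generator),
            _generators.end());
    }
    // Forwards the event to every attached generator.
    void traceEvent(const Event& e) const {
        for (Generator* g : _generators) g->event(e);
    }
    // Forwards the completion of the event to every attached generator.
    void tracePostEvent(const Event& e) const {
        for (Generator* g : _generators) g->postEvent(e);
    }

private:
    std::vector<Generator*> _generators;
};

// Example generator that only counts the notifications it receives.
struct CountingGenerator : Generator {
    int events = 0;
    int postEvents = 0;
    void event(const Event&) override { ++events; }
    void postEvent(const Event&) override { ++postEvents; }
};
```

In this sketch the notifier does not own the generators, so the caller must keep each one alive for as long as it is attached.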

virtual void executive::ExecutableKernel::updateArgumentMemory ( ) [pure virtual]

Indicate that the kernel's arguments have been updated.

Implemented in executive::ATIExecutableKernel, executive::EmulatedKernel, executive::LLVMExecutableKernel, and executive::NVIDIAExecutableKernel.

virtual void executive::ExecutableKernel::updateMemory ( ) [pure virtual]
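
The pure virtual members documented above form the interface that each backend implements. As a rough sketch of how a caller might drive that interface, the code below declares a hypothetical `KernelInterface` that mirrors four of those signatures, plus a toy `RecordingKernel` that only logs calls. The call order in `configureAndLaunch` and the concrete sizes are assumptions for illustration, not documented Ocelot behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface mirroring a few pure virtual members from this page.
class KernelInterface {
public:
    virtual ~KernelInterface() {}
    virtual void setKernelShape(int x, int y, int z) = 0;
    virtual void setExternSharedMemorySize(unsigned int bytes) = 0;
    virtual void updateArgumentMemory() = 0;
    virtual void launchGrid(int width, int height, int depth) = 0;
};

// Toy backend that only records which calls it received.
class RecordingKernel : public KernelInterface {
public:
    std::vector<std::string> calls;

    void setKernelShape(int, int, int) override {
        calls.push_back("setKernelShape");
    }
    void setExternSharedMemorySize(unsigned int) override {
        calls.push_back("setExternSharedMemorySize");
    }
    void updateArgumentMemory() override {
        calls.push_back("updateArgumentMemory");
    }
    void launchGrid(int width, int height, int depth) override {
        assert(width > 0 && height > 0 && depth > 0);
        (void)width; (void)height; (void)depth;  // unused when NDEBUG is set
        calls.push_back("launchGrid");
    }
};

// One plausible call order (an assumption): configure, sync arguments, launch.
void configureAndLaunch(KernelInterface& kernel) {
    kernel.setKernelShape(128, 1, 1);     // threads per block
    kernel.setExternSharedMemorySize(0);  // no dynamic shared memory
    kernel.updateArgumentMemory();        // arguments were changed by the caller
    kernel.launchGrid(4, 1, 1);           // grid dimensions
}
```

Programming against the abstract interface lets the same calling code target any backend that implements it.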

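Member Data Documentation

unsigned int executive::ExecutableKernel::_argumentMemorySize [protected]

Total amount of packed parameter memory.

ir::Dim3 executive::ExecutableKernel::_blockDim [protected]

The block dimensions.

CacheConfiguration executive::ExecutableKernel::_cacheConfiguration [protected]

Configuration of cache.

unsigned int executive::ExecutableKernel::_constMemorySize [protected]

Total amount of allocated constant memory size.

const ir::ExternalFunctionSet * executive::ExecutableKernel::_externals [protected]

Registered external functions.

unsigned int executive::ExecutableKernel::_externSharedMemorySize [protected]

The amount of allocated dynamic shared memory.

TraceGeneratorVector executive::ExecutableKernel::_generators [protected]

Attached trace generators.

unsigned int executive::ExecutableKernel::_globalLocalMemorySize [protected]

Total amount of allocated global local memory per thread.

ir::Dim3 executive::ExecutableKernel::_gridDim [protected]

Dimension of grid in blocks.

unsigned int executive::ExecutableKernel::_localMemorySize [protected]

Total amount of allocated local memory size.

unsigned int executive::ExecutableKernel::_maxThreadsPerBlock [protected]

Maximum number of threads launched per block.

unsigned int executive::ExecutableKernel::_parameterMemorySize [protected]

Kernel stack parameter memory space.

unsigned int executive::ExecutableKernel::_registerCount [protected]

Number of registers required by each thread.

unsigned int executive::ExecutableKernel::_sharedMemorySize [protected]

The amount of allocated static shared memory.

executive::Device * executive::ExecutableKernel::device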


The documentation for this class was generated from the following files: