GPUOcelot

Execution Control

CUDA Driver API

Modules

 Execution Control [DEPRECATED]

Functions

CUresult CUDAAPI cuFuncSetBlockShape (CUfunction hfunc, int x, int y, int z)
 Sets the block-dimensions for the function.
CUresult CUDAAPI cuFuncSetSharedSize (CUfunction hfunc, unsigned int bytes)
 Sets the dynamic shared-memory size for the function.
CUresult CUDAAPI cuFuncGetAttribute (int *pi, CUfunction_attribute attrib, CUfunction hfunc)
 Returns information about a function.
CUresult CUDAAPI cuFuncSetCacheConfig (CUfunction hfunc, CUfunc_cache config)
 Sets the preferred cache configuration for a device function.
CUresult CUDAAPI cuParamSetSize (CUfunction hfunc, unsigned int numbytes)
 Sets the parameter size for the function.
CUresult CUDAAPI cuParamSeti (CUfunction hfunc, int offset, unsigned int value)
 Adds an integer parameter to the function's argument list.
CUresult CUDAAPI cuParamSetf (CUfunction hfunc, int offset, float value)
 Adds a floating-point parameter to the function's argument list.
CUresult CUDAAPI cuParamSetv (CUfunction hfunc, int offset, void *ptr, unsigned int numbytes)
 Adds arbitrary data to the function's argument list.
CUresult CUDAAPI cuLaunch (CUfunction f)
 Launches a CUDA function.
CUresult CUDAAPI cuLaunchGrid (CUfunction f, int grid_width, int grid_height)
 Launches a CUDA function.
CUresult CUDAAPI cuLaunchGridAsync (CUfunction f, int grid_width, int grid_height, CUstream hStream)
 Launches a CUDA function.

Detailed Description

This section describes the execution control functions of the low-level CUDA driver application programming interface.


Function Documentation

CUresult CUDAAPI cuFuncGetAttribute ( int *  pi,
CUfunction_attribute  attrib,
CUfunction  hfunc 
)

Returns information about a function.

Returns in *pi the integer value of the attribute attrib on the kernel given by hfunc. The supported attributes are:

  • CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK: The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
  • CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES: The size in bytes of statically-allocated shared memory per block required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
  • CU_FUNC_ATTRIBUTE_CONST_SIZE_BYTES: The size in bytes of user-allocated constant memory required by this function.
  • CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES: The size in bytes of local memory used by each thread of this function.
  • CU_FUNC_ATTRIBUTE_NUM_REGS: The number of registers used by each thread of this function.
  • CU_FUNC_ATTRIBUTE_PTX_VERSION: The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13. Note that this may return the undefined value of 0 for cubins compiled prior to CUDA 3.0.
  • CU_FUNC_ATTRIBUTE_BINARY_VERSION: The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13. Note that this will return a value of 10 for legacy cubins that do not have a properly-encoded binary architecture version.
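
The major * 10 + minor encoding used by CU_FUNC_ATTRIBUTE_PTX_VERSION and CU_FUNC_ATTRIBUTE_BINARY_VERSION can be unpacked with integer division and remainder. The following is a minimal sketch; arch_version and decode_arch_version are hypothetical helper names, not part of the driver API, and the input is assumed to be a value previously returned in *pi.

```c
/* Hypothetical helper type (not part of the driver API): the decoded form of
 * a CU_FUNC_ATTRIBUTE_PTX_VERSION or CU_FUNC_ATTRIBUTE_BINARY_VERSION value. */
typedef struct {
    int major;
    int minor;
} arch_version;

/* Splits an attribute value encoded as major * 10 + minor. */
static arch_version decode_arch_version(int attr_value)
{
    arch_version v;
    v.major = attr_value / 10;
    v.minor = attr_value % 10;
    return v;
}
```

A returned value of 13 decodes to major 1 and minor 3. The undefined PTX value of 0 decodes to 0.0, which a caller may want to treat as "unknown".
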
Parameters:
pi- Returned attribute value
attrib- Attribute requested
hfunc- Function to query attribute of
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncSetCacheConfig, cuParamSetSize, cuParamSeti, cuParamSetf, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuFuncSetBlockShape ( CUfunction  hfunc,
int  x,
int  y,
int  z 
)

Sets the block-dimensions for the function.

Specifies the x, y, and z dimensions of the thread blocks that are created when the kernel given by hfunc is launched.
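
Because a launch fails when the block holds more threads than CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK allows, it can be useful to validate a block shape on the host before setting it. The sketch below is one way to do that; block_shape_fits is a hypothetical helper, max_threads is assumed to be the value obtained from cuFuncGetAttribute(), and the requirement that every dimension be positive is an assumption of this sketch.

```c
/* Returns 1 if every dimension is positive and x * y * z does not exceed
 * max_threads; returns 0 otherwise. max_threads is assumed to come from
 * CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK. */
static int block_shape_fits(int x, int y, int z, int max_threads)
{
    long long threads;

    if (x <= 0 || y <= 0 || z <= 0)
        return 0;

    threads = (long long)x * y;   /* product of two ints fits in long long */
    if (threads > max_threads)
        return 0;                 /* early exit keeps the next product small */

    threads *= z;
    return threads <= max_threads;
}
```

With a limit of 512 threads, a 16 x 16 x 1 block (256 threads) passes, while a 32 x 32 x 1 block (1024 threads) does not.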

Parameters:
hfunc- Kernel to specify dimensions of
x- X dimension
y- Y dimension
z- Z dimension
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetSharedSize, cuFuncSetCacheConfig, cuFuncGetAttribute, cuParamSetSize, cuParamSeti, cuParamSetf, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuFuncSetCacheConfig ( CUfunction  hfunc,
CUfunc_cache  config 
)

Sets the preferred cache configuration for a device function.

On devices where the L1 cache and shared memory use the same hardware resources, this sets through config the preferred cache configuration for the device function hfunc. This is only a preference. The driver will use the requested configuration if possible, but it is free to choose a different configuration if required to execute hfunc. Any context-wide preference set via cuCtxSetCacheConfig() will be overridden by this per-function setting unless the per-function setting is CU_FUNC_CACHE_PREFER_NONE. In that case, the current context-wide setting will be used.

This setting has no effect on devices where the sizes of the L1 cache and shared memory are fixed.

Launching a kernel with a different preference than the most recent preference setting may insert a device-side synchronization point.
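
The precedence between the context-wide and per-function preferences described above can be modelled as a small pure function. This is only a sketch of the stated rule: the cache_pref enum is a stand-in whose names mirror the driver's CU_FUNC_CACHE_PREFER_* values so that the code compiles without cuda.h, and effective_cache_pref is a hypothetical helper. It yields the preference that is requested, not the configuration the driver finally chooses.

```c
/* Stand-in for the driver's CUfunc_cache values so the sketch compiles
 * without cuda.h; the names mirror CU_FUNC_CACHE_PREFER_*. */
typedef enum {
    PREFER_NONE,
    PREFER_SHARED,
    PREFER_L1
} cache_pref;

/* The per-function setting wins unless it is PREFER_NONE, in which case the
 * context-wide setting applies. */
static cache_pref effective_cache_pref(cache_pref ctx_pref, cache_pref func_pref)
{
    return (func_pref == PREFER_NONE) ? ctx_pref : func_pref;
}
```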

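The supported cache configurations are:

  • CU_FUNC_CACHE_PREFER_NONE: No preference for shared memory or L1 (default).
  • CU_FUNC_CACHE_PREFER_SHARED: Prefer larger shared memory and smaller L1 cache.
  • CU_FUNC_CACHE_PREFER_L1: Prefer larger L1 cache and smaller shared memory.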

Parameters:
hfunc- Kernel to configure cache for
config- Requested cache configuration
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT
See also:
cuCtxGetCacheConfig, cuCtxSetCacheConfig, cuFuncSetBlockShape, cuFuncGetAttribute, cuParamSetSize, cuParamSeti, cuParamSetf, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuFuncSetSharedSize ( CUfunction  hfunc,
unsigned int  bytes 
)

Sets the dynamic shared-memory size for the function.

Sets through bytes the amount of dynamic shared memory that will be available to each thread block when the kernel given by hfunc is launched.
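
Since bytes is a per-block quantity, it is typically derived from the block shape. The sketch below assumes a kernel in which every thread of an x * y * z block needs elems_per_thread elements of elem_size bytes; that layout and the name dynamic_shared_bytes are assumptions made for illustration, not something this API prescribes.

```c
#include <stddef.h>

/* Bytes of dynamic shared memory for one thread block, assuming each of the
 * x * y * z threads needs elems_per_thread elements of elem_size bytes. */
static size_t dynamic_shared_bytes(unsigned x, unsigned y, unsigned z,
                                   size_t elems_per_thread, size_t elem_size)
{
    size_t threads = (size_t)x * y * z;
    return threads * elems_per_thread * elem_size;
}
```

For a 16 x 16 x 1 block with one float per thread this gives 256 * sizeof(float) bytes. The result must fit in the unsigned int bytes argument before it is passed on.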

Parameters:
hfunc- Kernel to specify dynamic shared-memory size for
bytes- Dynamic shared-memory size per thread block in bytes
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetCacheConfig, cuFuncGetAttribute, cuParamSetSize, cuParamSeti, cuParamSetf, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
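CUresult CUDAAPI cuLaunch ( CUfunction  f)

Launches a CUDA function.

Invokes the kernel f on a 1 x 1 x 1 grid of blocks. The block contains the number of threads specified by a previous call to cuFuncSetBlockShape().

Parameters:
f- Kernel to launch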
CUresult CUDAAPI cuLaunchGrid ( CUfunction  f,
int  grid_width,
int  grid_height 
)

Launches a CUDA function.

Invokes the kernel f on a grid_width x grid_height grid of blocks. Each block contains the number of threads specified by a previous call to cuFuncSetBlockShape().
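
The grid dimensions are usually chosen so that the blocks cover the whole problem, which means rounding up when the problem size is not a multiple of the block size. A minimal sketch, assuming both arguments are positive; blocks_to_cover is a hypothetical helper name:

```c
/* Number of blocks of block_dim threads needed to cover n elements along
 * one axis, rounding up. Both arguments are assumed to be positive. */
static int blocks_to_cover(int n, int block_dim)
{
    return n / block_dim + (n % block_dim != 0);
}
```

For a 1000 x 600 problem and 16 x 16 blocks this yields grid_width = 63 and grid_height = 38. Threads in the last row and column of blocks then fall outside the problem, so the kernel is assumed to bounds-check its indices.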

Parameters:
f- Kernel to launch
grid_width- Width of grid in blocks
grid_height- Height of grid in blocks
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_LAUNCH_FAILED, CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES, CUDA_ERROR_LAUNCH_TIMEOUT, CUDA_ERROR_LAUNCH_INCOMPATIBLE_TEXTURING, CUDA_ERROR_SHARED_OBJECT_INIT_FAILED
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetSize, cuParamSetf, cuParamSeti, cuParamSetv, cuLaunch, cuLaunchGridAsync
CUresult CUDAAPI cuLaunchGridAsync ( CUfunction  f,
int  grid_width,
int  grid_height,
CUstream  hStream 
)

Launches a CUDA function.

Invokes the kernel f on a grid_width x grid_height grid of blocks. Each block contains the number of threads specified by a previous call to cuFuncSetBlockShape().

cuLaunchGridAsync() can optionally be associated with a stream by passing a non-zero hStream argument.

Parameters:
f- Kernel to launch
grid_width- Width of grid in blocks
grid_height- Height of grid in blocks
hStream- Stream identifier
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_LAUNCH_FAILED, CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES, CUDA_ERROR_LAUNCH_TIMEOUT, CUDA_ERROR_LAUNCH_INCOMPATIBLE_TEXTURING, CUDA_ERROR_SHARED_OBJECT_INIT_FAILED
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetSize, cuParamSetf, cuParamSeti, cuParamSetv, cuLaunch, cuLaunchGrid
CUresult CUDAAPI cuParamSetf ( CUfunction  hfunc,
int  offset,
float  value 
)

Adds a floating-point parameter to the function's argument list.

Sets a floating-point parameter that will be passed the next time the kernel corresponding to hfunc is invoked. offset is a byte offset.
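
Because offset is a byte offset, the caller has to track where each parameter lands. The sketch below assumes that every parameter must start at a multiple of its own alignment, a requirement that is not stated on this page; align_up is a hypothetical helper and works only for power-of-two alignments.

```c
/* Rounds offset up to the next multiple of alignment.
 * alignment must be a power of two. */
static int align_up(int offset, int alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

Under that assumption, a 4-byte float placed after a 4-byte integer goes at align_up(4, 4), which is 4, whereas an 8-byte pointer placed after the same integer goes at align_up(4, 8), which is 8.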

Parameters:
hfunc- Kernel to add parameter to
offset- Byte offset in the argument list at which to add the parameter
value- Value of parameter
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetSize, cuParamSeti, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuParamSeti ( CUfunction  hfunc,
int  offset,
unsigned int  value 
)

Adds an integer parameter to the function's argument list.

Sets an integer parameter that will be passed the next time the kernel corresponding to hfunc is invoked. offset is a byte offset.

Parameters:
hfunc- Kernel to add parameter to
offset- Byte offset in the argument list at which to add the parameter
value- Value of parameter
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetSize, cuParamSetf, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuParamSetSize ( CUfunction  hfunc,
unsigned int  numbytes 
)

Sets the parameter size for the function.

Sets through numbytes the total size in bytes needed by the function parameters of the kernel corresponding to hfunc.
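
One way to obtain numbytes is to walk the parameters in declaration order and take the byte offset just past the last one. The sketch below does this under the assumption that each parameter is placed at the next offset that satisfies its alignment; param_desc and param_list_size are hypothetical names introduced for illustration.

```c
#include <stddef.h>

/* Hypothetical descriptor for one kernel parameter. */
typedef struct {
    unsigned size;   /* size of the parameter in bytes */
    unsigned align;  /* required alignment in bytes, a power of two */
} param_desc;

/* Walks the parameters in order, placing each at the next suitably aligned
 * byte offset, and returns the offset just past the last one. */
static unsigned param_list_size(const param_desc *params, size_t count)
{
    unsigned offset = 0;
    size_t i;

    for (i = 0; i < count; ++i) {
        offset = (offset + params[i].align - 1) & ~(params[i].align - 1);
        offset += params[i].size;
    }
    return offset;
}
```

For a 4-byte integer, an 8-byte pointer, and a 4-byte float, in that order, the parameters land at offsets 0, 8, and 16, and the total size is 20 bytes.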

Parameters:
hfunc- Kernel to set parameter size for
numbytes- Size of parameter list in bytes
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetf, cuParamSeti, cuParamSetv, cuLaunch, cuLaunchGrid, cuLaunchGridAsync
CUresult CUDAAPI cuParamSetv ( CUfunction  hfunc,
int  offset,
void *  ptr,
unsigned int  numbytes 
)

Adds arbitrary data to the function's argument list.

Copies an arbitrary amount of data (specified in numbytes) from ptr into the parameter space of the kernel corresponding to hfunc. offset is a byte offset.
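
The copy described above can be pictured with a host-side model of the parameter space. This is an illustration only and says nothing about how the driver stores parameters internally: param_space, param_space_setv, and the 256-byte capacity are all hypothetical choices made for the sketch.

```c
#include <stddef.h>
#include <string.h>

#define PARAM_SPACE_BYTES 256  /* arbitrary capacity chosen for this sketch */

/* Hypothetical host-side model of a kernel's parameter space. */
typedef struct {
    unsigned char bytes[PARAM_SPACE_BYTES];
} param_space;

/* Copies numbytes from ptr to the given byte offset, mirroring the
 * (offset, ptr, numbytes) arguments of cuParamSetv. Returns 0 on success and
 * -1 if ptr is NULL or the range falls outside the buffer. */
static int param_space_setv(param_space *ps, int offset,
                            const void *ptr, unsigned numbytes)
{
    if (ptr == NULL || offset < 0 || (unsigned)offset > PARAM_SPACE_BYTES ||
        numbytes > PARAM_SPACE_BYTES - (unsigned)offset)
        return -1;

    memcpy(ps->bytes + offset, ptr, numbytes);
    return 0;
}
```

Copying by value in this way is what allows structures and pointers of any size to be passed, as opposed to the single 4-byte scalars handled by cuParamSeti() and cuParamSetf().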

Parameters:
hfunc- Kernel to add data to
offset- Byte offset in the argument list at which to add the data
ptr- Pointer to arbitrary data
numbytes- Size of data to copy in bytes
Returns:
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE
See also:
cuFuncSetBlockShape, cuFuncSetSharedSize, cuFuncGetAttribute, cuParamSetSize, cuParamSetf, cuParamSeti, cuLaunch, cuLaunchGrid, cuLaunchGridAsync