cuda- FFI binding to the CUDA interface for programming NVIDIA GPUs

Copyright: [2009..2018] Trevor L. McDonell
Safe Haskell: None




Kernel execution control for C-for-CUDA runtime interface


Kernel Execution

type Fun = FunPtr ()

A global device function.

Note that the use of a string naming a function was deprecated in CUDA 4.1 and removed in CUDA 5.0.

data FunAttributes




data FunParam where

Kernel function parameters. Doubles will be converted to an internal float representation on devices that do not support doubles natively.


IArg :: !Int -> FunParam 
FArg :: !Float -> FunParam 
DArg :: !Double -> FunParam 
VArg :: Storable a => !a -> FunParam 
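
The note about doubles matters for numerical code: narrowing a Double to single precision loses bits. The following is a minimal, GPU-free sketch of that effect using only `base`. The helper names `narrow` and `roundTrip` are hypothetical, and the conversion the library performs internally is not shown on this page, so this illustrates the precision loss of any double-to-single narrowing rather than the library's exact behaviour.

```haskell
-- Pure illustration of double-to-single narrowing; no GPU or cuda package needed.
-- 'narrow' and 'roundTrip' are hypothetical helper names, not part of the library.

-- | Narrow a Double to a Float, as a device without native doubles would require.
narrow :: Double -> Float
narrow = realToFrac

-- | Narrow to Float and widen back, to expose any loss of precision.
roundTrip :: Double -> Double
roundTrip = realToFrac . narrow

-- | True when the value survives narrowing unchanged.
survivesNarrowing :: Double -> Bool
survivesNarrowing x = roundTrip x == x
```

Values such as 0.5 that are exactly representable in single precision survive the round trip, while 0.1 does not. If a kernel needs a full-width double on such a device, passing it as `DArg` will not preserve it.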

attributes :: Fun -> IO FunAttributes

Obtain the attributes of the given global device function. These itemise the requirements to successfully launch the kernel.

setConfig
  :: (Int, Int)         -- grid dimensions
  -> (Int, Int, Int)    -- block dimensions
  -> Int64              -- shared memory per block (bytes)
  -> Maybe Stream       -- associated processing stream
  -> IO ()

Specify the grid and block dimensions for a device call. Used in conjunction with setParams, this pushes data onto the execution stack that will be popped when a function is launched.

setParams :: [FunParam] -> IO ()

Set the argument parameters that will be passed to the next kernel invocation. This is used in conjunction with setConfig to control kernel execution.

setCacheConfig :: Fun -> CacheConfig -> IO ()

On devices where the L1 cache and shared memory use the same hardware resources, this sets the preferred cache configuration for the given device function. This is only a preference; the driver is free to choose a different configuration as required to execute the function.

Switching between configuration modes may insert a device-side synchronisation point for streamed kernel launches.

launch :: Fun -> IO ()

Invoke the global kernel function on the device. This must be preceded by a call to setConfig and (if appropriate) setParams.
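
The three-step sequence described above (configure, push parameters, launch) might be wrapped as in the sketch below. It assumes this page documents the module `Foreign.CUDA.Runtime.Exec` and that the installed version of the package still exports `setConfig`, `setParams` and `launch`. The wrapper name `launchLegacy` is hypothetical, and obtaining the `Fun` value (normally the address of a kernel symbol linked into the program) is left to the caller.

```haskell
import Foreign.CUDA.Runtime.Exec (Fun, FunParam (..), launch, setConfig, setParams)

-- | Hypothetical wrapper: launch a kernel over a one-dimensional grid using
-- the configure / push-parameters / launch sequence.
launchLegacy
  :: Fun        -- kernel to run (supplied by the caller)
  -> Int        -- number of blocks in the x dimension
  -> Int        -- number of threads per block in the x dimension
  -> [FunParam] -- kernel arguments, in declaration order
  -> IO ()
launchLegacy kernel blocks threads params = do
  -- Push the launch configuration: no dynamic shared memory, default stream.
  setConfig (blocks, 1) (threads, 1, 1) 0 Nothing
  -- Push the arguments for the next invocation.
  setParams params
  -- Pop the configuration and arguments and run the kernel.
  launch kernel
```

Because the configuration and parameters are pushed as separate steps before the launch, the order of the three calls matters.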

launchKernel
  :: Fun                -- Device function symbol
  -> (Int, Int)         -- grid dimensions
  -> (Int, Int, Int)    -- thread block shape
  -> Int64              -- shared memory per block (bytes)
  -> Maybe Stream       -- (optional) execution stream
  -> [FunParam]
  -> IO ()

Invoke a kernel on a (gx * gy) grid of blocks, where each block contains (tx * ty * tz) threads and has access to a given number of bytes of shared memory. The launch may also be associated with a specific Stream.
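
As a rough sketch of how the grid and block arguments relate to a problem size, the helpers below compute a one-dimensional launch geometry and pass it to `launchKernel`. The module name `Foreign.CUDA.Runtime.Exec` is assumed, the names `blocksFor`, `totalThreads` and `launch1D` are hypothetical, and the block size of 256 threads is an arbitrary illustrative choice rather than a recommendation from the library.

```haskell
import Foreign.CUDA.Runtime.Exec (Fun, FunParam (..), launchKernel)

-- | Number of blocks needed to cover @n@ elements with @threads@ threads per
-- block (ceiling division). Hypothetical helper.
blocksFor :: Int -> Int -> Int
blocksFor threads n = (n + threads - 1) `div` threads

-- | Total number of threads in a launch: (gx * gy) blocks of (tx * ty * tz)
-- threads each. Hypothetical helper.
totalThreads :: (Int, Int) -> (Int, Int, Int) -> Int
totalThreads (gx, gy) (tx, ty, tz) = gx * gy * tx * ty * tz

-- | Hypothetical wrapper: launch a kernel with at least one thread per element,
-- no dynamic shared memory, on the default stream.
launch1D :: Fun -> Int -> [FunParam] -> IO ()
launch1D kernel n params =
  launchKernel kernel (blocksFor threads n, 1) (threads, 1, 1) 0 Nothing params
  where
    threads = 256 -- assumed block size, chosen only for illustration
```

For 1000 elements this gives `blocksFor 256 1000 == 4` blocks, so `totalThreads (4, 1) (256, 1, 1) == 1024` threads are started. The surplus threads in the last block are why kernels launched this way usually need a bounds check on the element index.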