Copyright | [2009..2014] Trevor L. McDonell |
---|---|
License | BSD |
Safe Haskell | None |
Language | Haskell98 |
Kernel execution control for the C-for-CUDA runtime interface.
- type Fun = FunPtr ()
- data FunAttributes = FunAttributes {
  - constSizeBytes :: !Int64
  - localSizeBytes :: !Int64
  - sharedSizeBytes :: !Int64
  - maxKernelThreadsPerBlock :: !Int
  - numRegs :: !Int

  }
- data FunParam where
- data CacheConfig
- attributes :: Fun -> IO FunAttributes
- setConfig :: (Int, Int) -> (Int, Int, Int) -> Int64 -> Maybe Stream -> IO ()
- setParams :: [FunParam] -> IO ()
- setCacheConfig :: Fun -> CacheConfig -> IO ()
- launch :: Fun -> IO ()
- launchKernel :: Fun -> (Int, Int) -> (Int, Int, Int) -> Int64 -> Maybe Stream -> [FunParam] -> IO ()
Kernel Execution
type Fun = FunPtr ()
A __global__ device function.
Note that the use of a string naming a function was deprecated in CUDA 4.1 and removed in CUDA 5.0.
data FunAttributes
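FunAttributes
- constSizeBytes :: !Int64
- localSizeBytes :: !Int64
- sharedSizeBytes :: !Int64
- maxKernelThreadsPerBlock :: !Int
- numRegs :: !Int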
data FunParam where
Kernel function parameters. Doubles will be converted to an internal float representation on devices that do not support doubles natively.
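The exact internal representation used for that conversion is not described here. As a rough illustration of the kind of rounding a single-precision representation introduces, the sketch below narrows a Double to a Float and widens it back. The helper names roundTripViaFloat and narrowingError are hypothetical and not part of this module, and using realToFrac to model the conversion is an assumption.

```haskell
-- Illustrative only: these helpers are NOT part of the library.
-- Modelling the device-side conversion with 'realToFrac' is an assumption.

-- | Narrow a Double to single precision and widen it back again.
roundTripViaFloat :: Double -> Double
roundTripViaFloat x = realToFrac (realToFrac x :: Float)

-- | Absolute error introduced by the round trip through Float.
narrowingError :: Double -> Double
narrowingError x = abs (x - roundTripViaFloat x)
```

Values that are exactly representable in single precision (such as 0.5) survive the round trip unchanged, while values such as 0.1 pick up a small error.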
data CacheConfig
Cache configuration preference
attributes :: Fun -> IO FunAttributes
Obtain the attributes of the named __global__ device function. This itemises the requirements to successfully launch the given kernel.
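One way such attributes might be used is to validate a thread block shape before launching. The following is a minimal sketch using only base. The helpers blockThreads and fitsThreadLimit are hypothetical names, not part of this module, and the limit argument is assumed to come from the maxKernelThreadsPerBlock field.

```haskell
-- Hypothetical helpers, not part of the library.

-- | Total number of threads in a block of shape (tx, ty, tz).
blockThreads :: (Int, Int, Int) -> Int
blockThreads (tx, ty, tz) = tx * ty * tz

-- | Check a block shape against a per-block thread limit, such as the
-- value reported in 'maxKernelThreadsPerBlock'. Shapes with a
-- non-positive extent are rejected.
fitsThreadLimit :: Int -> (Int, Int, Int) -> Bool
fitsThreadLimit limit shape@(tx, ty, tz) =
  all (> 0) [tx, ty, tz] && blockThreads shape <= limit
```

Given attrs obtained from attributes, the check would read fitsThreadLimit (maxKernelThreadsPerBlock attrs) (16, 16, 1).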
setParams :: [FunParam] -> IO ()
Set the argument parameters that will be passed to the next kernel invocation. This is used in conjunction with setConfig to control kernel execution.
setCacheConfig :: Fun -> CacheConfig -> IO ()
On devices where the L1 cache and shared memory use the same hardware resources, this sets the preferred cache configuration for the given device function. This is only a preference; the driver is free to choose a different configuration as required to execute the function.
Switching between configuration modes may insert a device-side synchronisation point for streamed kernel launches.
launchKernel
:: Fun | Device function symbol |
-> (Int, Int) | grid dimensions |
-> (Int, Int, Int) | thread block shape |
-> Int64 | shared memory per block (bytes) |
-> Maybe Stream | (optional) execution stream |
-> [FunParam] | |
-> IO () |
Invoke a kernel on a (gx * gy) grid of blocks, where each block contains (tx * ty * tz) threads and has access to a given number of bytes of shared memory. The launch may also be associated with a specific Stream.
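The launch geometry described above can be computed ahead of the call. The sketch below, using only base, derives grid dimensions that cover a 2D problem, counts the threads launched, and sizes a shared memory request. All helper names (ceilDiv, gridFor, totalThreads, sharedBytesPerBlock) are hypothetical and not part of this module, and the one-Float-per-thread shared memory layout is an assumption made for illustration.

```haskell
import Data.Int (Int64)
import Foreign.Storable (sizeOf)

-- Hypothetical helpers, not part of the library.

-- | Ceiling division, for a positive divisor.
ceilDiv :: Int -> Int -> Int
ceilDiv n d = (n + d - 1) `div` d

-- | Grid dimensions (gx, gy) needed to cover a (width, height) problem
-- with blocks of shape (tx, ty, tz). The z extent does not affect a 2D grid.
gridFor :: (Int, Int) -> (Int, Int, Int) -> (Int, Int)
gridFor (w, h) (tx, ty, _) = (w `ceilDiv` tx, h `ceilDiv` ty)

-- | Total threads launched: (gx * gy) blocks of (tx * ty * tz) threads.
totalThreads :: (Int, Int) -> (Int, Int, Int) -> Int
totalThreads (gx, gy) (tx, ty, tz) = gx * gy * tx * ty * tz

-- | Shared memory per block in bytes, assuming one Float per thread.
sharedBytesPerBlock :: (Int, Int, Int) -> Int64
sharedBytesPerBlock (tx, ty, tz) =
  fromIntegral (tx * ty * tz * sizeOf (undefined :: Float))
```

For a 1000 x 600 problem and a (16, 16, 1) block, gridFor yields the grid dimensions to pass as the second argument of launchKernel, and sharedBytesPerBlock yields a candidate for the Int64 argument. Because the grid is rounded up, some threads fall outside the problem bounds, so the kernel itself would need a bounds check.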