accelerate-cuda-0.15.0.0: Accelerate backend for NVIDIA GPUs

Copyright[2013..2014] Manuel M T Chakravarty, Gabriele Keller, Trevor L. McDonell, Robert Clifton-Everest
LicenseBSD3
MaintainerRobert Clifton-Everest <robertce@cse.unsw.edu.au>
Stabilityexperimental
Portabilitynon-portable (GHC extensions)
Safe HaskellNone
LanguageHaskell98

Data.Array.Accelerate.CUDA.Foreign

Contents

Description

 

Synopsis

Functions callable from foreign code

accelerateCreate :: Device -> ForeignContext -> IO AccHandle Source

Create an Accelerate handle given a device and a cuda context.

AccHandle accelerateCreate(int device, CUcontext ctx);

accelerateDestroy :: AccHandle -> IO () Source

Releases all resources used by the accelerate library.

void accelerateDestroy(AccHandle hndl);

freeOutput :: Ptr OutputArray -> IO () Source

Function callable from foreign code to free a OutputArray returned after executing an Accelerate computation.

Once freed, the device pointers associated with an array are no longer valid.

void freeOutput(OutputArray arr);

freeProgram :: StablePtr a -> IO () Source

Free a compiled accelerate program.

void freeProgram(Program prg);

Exporting

exportAfun :: Name -> String -> Q [Dec] Source

Given the Name of an Accelerate function (a function of type ''Acc a -> Acc b'') generate a a function callable from foreign code with the second argument specifying it's name.

buildExported :: forall a b. (Arrays a, Arrays b) => AccHandle -> (Acc a -> Acc b) -> IO (StablePtr Afun) Source

Given a handle and an Accelerate function, generate an exportable version.

Types

type InputArray = (ShapeBuffer, DevicePtrBuffer) Source

The input required from foreign code.

type OutputArray = (ShapeBuffer, DevicePtrBuffer, StablePtr EArray) Source

A result array from an accelerate program.

type ShapeBuffer = Ptr CInt Source

A foreign buffer that represents a shape as an array of ints.

type DevicePtrBuffer = Ptr WordPtr Source

A buffer of device pointers

Backend representation

data CUDAForeignAcc as bs where Source

CUDA foreign Acc functions are just CIO functions.

Constructors

CUDAForeignAcc :: String -> (as -> CIO bs) -> CUDAForeignAcc as bs 

Instances

Foreign CUDAForeignAcc 
Typeable (* -> * -> *) CUDAForeignAcc 

canExecuteAcc :: (Foreign f, Typeable as, Typeable bs) => f as bs -> Maybe (as -> CIO bs) Source

Gives the executable form of a foreign function if it can be executed by the CUDA backend.

data CUDAForeignExp x y where Source

CUDA foreign Exp functions consist of a list of C header files necessary to call the function and the name of the function to call.

Constructors

CUDAForeignExp :: IsScalar y => [String] -> String -> CUDAForeignExp x y 

Instances

Foreign CUDAForeignExp 
Typeable (* -> * -> *) CUDAForeignExp 

canExecuteExp :: forall f x y. (Foreign f, Typeable y, Typeable x) => f x y -> Maybe ([String], String) Source

Gives the foreign function name as a string if it is a foreign Exp function for the CUDA backend.

Manipulating arrays

type family DevicePtrs e :: * Source

Instances

type DevicePtrs Bool = DevicePtr Word8 
type DevicePtrs Char = DevicePtr Char 
type DevicePtrs Double = DevicePtr Double 
type DevicePtrs Float = DevicePtr Float 
type DevicePtrs Int = DevicePtr Int 
type DevicePtrs Int8 = DevicePtr Int8 
type DevicePtrs Int16 = DevicePtr Int16 
type DevicePtrs Int32 = DevicePtr Int32 
type DevicePtrs Int64 = DevicePtr Int64 
type DevicePtrs Word = DevicePtr Word 
type DevicePtrs Word8 = DevicePtr Word8 
type DevicePtrs Word16 = DevicePtr Word16 
type DevicePtrs Word32 = DevicePtr Word32 
type DevicePtrs Word64 = DevicePtr Word64 
type DevicePtrs () = () 
type DevicePtrs CChar = DevicePtr HTYPE_CCHAR 
type DevicePtrs CSChar = DevicePtr Int8 
type DevicePtrs CUChar = DevicePtr Word8 
type DevicePtrs CShort = DevicePtr Int16 
type DevicePtrs CUShort = DevicePtr Word16 
type DevicePtrs CInt = DevicePtr Int32 
type DevicePtrs CUInt = DevicePtr Word32 
type DevicePtrs CLong = DevicePtr HTYPE_LONG 
type DevicePtrs CULong = DevicePtr HTYPE_UNSIGNED_LONG 
type DevicePtrs CLLong = DevicePtr Int64 
type DevicePtrs CULLong = DevicePtr Word64 
type DevicePtrs CFloat = DevicePtr Float 
type DevicePtrs CDouble = DevicePtr Double 
type DevicePtrs (a, b) = (DevicePtrs a, DevicePtrs b) 

devicePtrsOfArray :: Array sh e -> CIO (DevicePtrs (EltRepr e)) Source

Get the raw CUDA device pointers associated with an array

indexArray :: (Shape dim, Elt e) => Array dim e -> Int -> CIO e Source

Read a single element from an array at the given row-major index. This is a synchronous operation.

useArray :: (Shape dim, Elt e) => Array dim e -> CIO () Source

Upload an existing array to the device

useArrayAsync :: (Shape dim, Elt e) => Array dim e -> Maybe Stream -> CIO () Source

peekArray :: (Shape dim, Elt e) => Array dim e -> CIO () Source

peekArrayAsync :: (Shape dim, Elt e) => Array dim e -> Maybe Stream -> CIO () Source

pokeArray :: (Shape dim, Elt e) => Array dim e -> CIO () Source

pokeArrayAsync :: (Shape dim, Elt e) => Array dim e -> Maybe Stream -> CIO () Source

copyArray :: (Shape dim, Elt e) => Array dim e -> Array dim e -> CIO () Source

Copy data between two device arrays. The operation is asynchronous with respect to the host, but will never overlap kernel execution.

copyArrayAsync :: (Shape dim, Elt e) => Array dim e -> Array dim e -> Maybe Stream -> CIO () Source

allocateArray :: (Shape dim, Elt e) => dim -> CIO (Array dim e) Source

newArray :: (Shape sh, Elt e) => sh -> (sh -> e) -> CIO (Array sh e) Source

Running IO actions in an Accelerate context

data CIO a Source

Instances

liftIO :: MonadIO m => forall a. IO a -> m a

Lift a computation from the IO monad.

inContext :: Context -> IO a -> IO a Source

Run an IO action within the given Acclerate context

inDefaultContext :: IO a -> IO a Source

Run an IO action in the default Acclerate context