accelerate: An embedded language for accelerated array processing

[ accelerate, bsd3, compilers-interpreters, concurrency, data, library, parallelism ]

Data.Array.Accelerate defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be compiled at runtime and executed on a range of architectures.

A simple example

As a simple example, consider the computation of a dot product of two vectors of floating point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be compiled at runtime for performance. For example, using Data.Array.Accelerate.LLVM.PTX it may be off-loaded to the GPU on the fly.
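
For comparison, the list version mentioned above can be sketched as follows. The name dotpList is hypothetical and chosen only for this illustration; the code uses nothing beyond the Prelude.

```haskell
-- Plain-list counterpart of dotp, written with Prelude functions only.
-- Apart from the type signature, the body mirrors the Accelerate version.
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = foldl (+) 0 (zipWith (*) xs ys)

main :: IO ()
main = print (dotpList [1, 2, 3] [4, 5, 6])
-- prints 32.0
```

The Accelerate version differs in that its arguments and result are wrapped in Acc, so it describes a computation to be compiled and run by a backend rather than one evaluated directly.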

Additional components

The following supported add-ons are available as separate packages. Install them from Hackage with cabal install <package>

Examples and documentation

Haddock documentation is included in the package.

The accelerate-examples package demonstrates a range of computational kernels and several complete applications, including:

lulesh-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is highly simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Mailing list and contacts
Versions 0.4.0, ...
Change log
Dependencies async (>=2.0), base (>=4.7 && <4.11), base-orphans (>=0.3), containers (>=0.3), deepseq (>=1.3), directory (>=1.0), ekg (>=0.1), ekg-core (>=0.1), exceptions (>=0.6), fclabels (>=2.0), filepath (>=1.0), ghc-prim, hashable (>=1.1), hashtables (>=1.0), mtl (>=2.0), pretty (>=1.0), template-haskell, text (>=1.0), time (>=1.4), transformers (>=0.3), unique, unix, unordered-containers (>=0.2), Win32
License BSD-3-Clause
Author Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried
Maintainer Trevor L. McDonell
Revised Revision 1 made by TrevorMcDonell at 2017-05-24T04:41:54Z
Category Compilers/Interpreters, Concurrency, Data, Parallelism
Home page
Bug tracker
Source repo head: git clone git://
this: git clone git://
Uploaded by TrevorMcDonell at 2017-03-31T09:04:04Z
Distributions NixOS:
Downloads 27317 total (435 in the last 30 days)
Rating 2.5 (votes: 5) [estimated by Bayesian average]
Status Hackage Matrix CI
Docs available [build log]
Last success reported on 2017-03-31 [all 1 reports]



  • Data
    • Array
      • Data.Array.Accelerate
        • Data.Array.Accelerate.AST
        • Analysis
          • Data.Array.Accelerate.Analysis.Match
          • Data.Array.Accelerate.Analysis.Shape
          • Data.Array.Accelerate.Analysis.Stencil
          • Data.Array.Accelerate.Analysis.Type
        • Array
          • Data.Array.Accelerate.Array.Data
          • Data.Array.Accelerate.Array.Remote
            • Data.Array.Accelerate.Array.Remote.Class
            • Data.Array.Accelerate.Array.Remote.LRU
            • Data.Array.Accelerate.Array.Remote.Table
          • Data.Array.Accelerate.Array.Representation
          • Data.Array.Accelerate.Array.Sugar
          • Data.Array.Accelerate.Array.Unique
        • Data.Array.Accelerate.Async
        • Data
        • Data.Array.Accelerate.Debug
        • Data.Array.Accelerate.Error
        • Data.Array.Accelerate.FullList
        • Data.Array.Accelerate.Interpreter
        • Data.Array.Accelerate.Lifetime
        • Data.Array.Accelerate.Pretty
        • Data.Array.Accelerate.Product
        • Data.Array.Accelerate.Smart
        • Data.Array.Accelerate.Trafo
        • Data.Array.Accelerate.Type



Enable debug tracing messages. The following options are read from the environment variable ACCELERATE_FLAGS and from the command line as:

./program +ACC ... -ACC

Note that a backend may not implement (or be applicable to) all options.

The following flags control phases of the compiler. They are enabled with -f<flag> and can be reversed with -fno-<flag>:

  • acc-sharing: Enable sharing recovery of array expressions (True).

  • exp-sharing: Enable sharing recovery of scalar expressions (True).

  • fusion: Enable array fusion (True).

  • simplify: Enable program simplification phase (True).

  • flush-cache: Clear any persistent caches on program startup (False).

  • fast-math: Allow algebraically equivalent transformations which may change floating point results (e.g., reassociate) (True).

The following options control debug message output, and are enabled with -d<flag>.

  • verbose: Be extra chatty.

  • dump-phases: Print timing information about each phase of the compiler. Enable GC stats (+RTS -t or otherwise) for memory usage information.

  • dump-sharing: Print information related to sharing recovery.

  • dump-simpl-stats: Print statistics related to fusion & simplification.

  • dump-simpl-iterations: Print a summary after each simplifier iteration.

  • dump-vectorisation: Print information related to the vectoriser.

  • dump-dot: Generate a representation of the program graph in Graphviz DOT format.

  • dump-simpl-dot: Generate a more compact representation of the program graph in Graphviz DOT format. In particular, scalar expressions are elided.

  • dump-gc: Print information related to the Accelerate garbage collector.

  • dump-gc-stats: Print aggregate garbage collection information at the end of program execution.

  • debug-cc: Include debug symbols in the generated and compiled kernels.

  • dump-cc: Print information related to kernel code generation/compilation. Print the generated code if verbose.

  • dump-ld: Print information related to runtime linking.

  • dump-asm: Print information related to kernel assembly. Print the assembled code if verbose.

  • dump-exec: Print information related to program execution.

  • dump-sched: Print information related to execution scheduling.
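
As an illustration of the option syntax described above, the following sketch assembles a +ACC ... -ACC argument list from a small description of the desired flags. The AccFlag type and the render and accArgs functions are hypothetical helpers written for this example; they are not part of the accelerate API.

```haskell
-- Hypothetical description of one runtime option (not part of accelerate).
data AccFlag
  = Phase Bool String   -- compiler phase flag, switched on (True) or off (False)
  | Dump String         -- debug message flag

-- Render a flag using the -f<flag>, -fno-<flag> and -d<flag> conventions.
render :: AccFlag -> String
render (Phase True  name) = "-f" ++ name
render (Phase False name) = "-fno-" ++ name
render (Dump name)        = "-d" ++ name

-- Wrap the rendered flags in the +ACC ... -ACC delimiters.
accArgs :: [AccFlag] -> [String]
accArgs flags = ["+ACC"] ++ map render flags ++ ["-ACC"]

main :: IO ()
main = putStrLn (unwords (accArgs [Phase False "fusion", Dump "dump-phases", Dump "verbose"]))
-- prints: +ACC -fno-fusion -ddump-phases -dverbose -ACC
```

The printed string is what would be appended to the program invocation, for example to disable fusion while printing phase timings.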


Enable hooks for monitoring the running application using EKG. Implies debug mode. In order to view the metrics, your application will need to initialise the EKG server like so:

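{-# LANGUAGE OverloadedStrings #-}  -- forkServerWith takes the host name as a ByteString

import Data.Array.Accelerate.Debug

import System.Metrics
import System.Remote.Monitoring

main :: IO ()
main = do
  store  <- initAccMetrics     -- create the Accelerate metrics store
  registerGcMetrics store      -- optional: also record GHC GC metrics

  _server <- forkServerWith store "localhost" 8000  -- start the EKG server on port 8000

  -- ... the rest of the application goes here
  return ()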


Note that, as with any program utilising EKG, in order to collect Haskell GC statistics, you must either run the program with:

+RTS -T -RTS


or compile it with:

-with-rtsopts=-T


Enable bounds checking


Enable bounds checking in unsafe operations


Enable internal consistency checks


Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.


Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.
