-- Hoogle documentation, generated by Haddock -- See Hoogle, http://www.haskell.org/hoogle/ -- | An implementation of the Scan Vector Machine instruction set in Haskell -- -- An implementation of the Scan Vector Machine instruction set in -- Haskell @package scan-vector-machine @version 0.2.7 module Control.Parallel.ScanVectorMachine.ScanVectorMachine -- | Scalar operations which may be performed on the elements of a vector, -- either elementwise or in prefix-scan form. data Op And :: Op Or :: Op Min :: Op Max :: Op Plus :: Op Times :: Op -- | An instance of ScanVectorMachine provides a scalar type -- s, vectors of type v s over that scalar of type, and -- the full suite of Scan Vector Machine (SVM) operations (Blelloch'90, -- page 60) on those vectors. The SVM instruction set is sometimes -- referred to as VCODE (CMU tech report CMU-CS-90-146-R). -- -- Only two changes have been made: (1) booleans are encoded as scalars -- (zero is false, nonzero is true) and (2) Belloch's elementwise -- subtraction has been replaced with a unary neg operation; -- this way the set of elementwise and scan operations are the same -- (subtraction is not associative). -- -- Many of the names below overlap with those in the Prelude; we -- recommend import qualified ScanVectorMachine as SVM so that -- these may be referred to as, for example, SVM.length. -- -- Notice that there is no map :: (s -> s) -> v s -> v -- s; this is essential to keeping closures and -- uncontained recursion out of the parallel context. See Blelloch -- 10.6.2 for the definition of contained recursion. -- -- Also notice that only three operations involve communication between -- different parts of the paralell context: distribute, -- scan, and permute. The distribute operation -- performs broadcast communication from the serial context to the -- parallel context. The scan operation performs prefix scans, -- which have very efficient communication patterns (do a local scan, -- then a global tree reduction, then a local distribution, then an -- elementwise operation). Only the permute operation involves -- complicated communication patterns. This is mitigated to some extent -- by the requirement that permute must be a permutation -- of the vector; it is an error to send two elements to the same -- destination index, or to have a destination index to which no element -- is sent. class ScanVectorMachine v s neg :: ScanVectorMachine v s => v s -> v s leq :: ScanVectorMachine v s => v s -> v s -> v s op :: ScanVectorMachine v s => Op -> v s -> v s -> v s scan :: ScanVectorMachine v s => Op -> v s -> v s select :: ScanVectorMachine v s => v s -> v s -> v s -> v s permute :: ScanVectorMachine v s => v s -> v s -> v s insert :: ScanVectorMachine v s => v s -> s -> s -> v s extract :: ScanVectorMachine v s => v s -> s -> s distribute :: ScanVectorMachine v s => s -> s -> v s length :: ScanVectorMachine v s => v s -> s -- | A crude implementation of the ScanVectorMachine class using -- Data.Array.IArray; no parallelism. Warning: outrageously -- inefficient code ahead! module Control.Parallel.ScanVectorMachine.SerialScanVectorMachine data SSVM e instance (Enum e, Ix e, Ord e, Num e) => ScanVectorMachine SSVM e instance (Ix e, Show e) => Show (SSVM e) -- | An instance of SegmentedScanVectorMachine provides a scalar -- type s, a vector type v, and a segmented vector -- (vector-of-vectors) type v' such that v implements -- the SVM operations over s and v' implements -- the SVM operations over v s. -- -- This file contains a default instance for ScanVectorMachine V' (V -- S), given an instance ScanVectorMachine V S. In other -- words, given an implementation of vectors-of-scalars, this will -- produce an implementation of vectors-of-vectors-of-scalars. -- -- This new type V' provides SVM operations over -- vectors-of-vectors-of-scalars; from the perspective of V', -- the vectors-of-scalars are called segments. Notice that -- V' uses vectors-of-scalars wherever ordinary scalars were -- previously used. For example, when the length operation is -- applied to a vector-of-vectors the result is not a scalar, but rather -- a vector-of-scalars giving the lengths of each of the segments. This -- phenomenon is crucial to the replication theorem and flattening -- transformation. -- -- It turns out that V' is basically (,) -- but this is -- not exposed to the user. Blelloch outlines three encodings (figure -- 4.2): head-flags, length, and head-pointer. The implementation below -- uses the length style since it can represent zero-length -- vectors efficiently. -- -- It is sometimes advantageous for hardware/platform providers to -- implement vectors-of-vectors-of-scalars directly (see -- NestedVectors.hs for the reasoning). To do this, implement -- the class SegmentedScanVectorMachine below. module Control.Parallel.ScanVectorMachine.SegmentedScanVectorMachine class (ScanVectorMachine v s, ScanVectorMachine v' (v' (v s))) => SegmentedScanVectorMachine v' v s instance ScanVectorMachine v s => ScanVectorMachine SegVec (v s) -- | Given an instance of ScanVectorMachine V' (V S), we can -- produce a type V'' and instance ScanVectorMachine V'' (V' -- (V S)). In other words, given an implementation of vectors with -- some nonzero nesting depth, this will produce an implementation with -- nesting depth one level deeper. -- -- This is different from SegmentedVectors, which uses flat -- vectors (0-deep nesting) to emulate segmented vectors (1-deep nesting) -- by cutting the size of the scalars in half. Here, there is no need to -- assume that the flat-vector scalars are twice as wide (in terms of -- bits) as the segmented scalars, so arbitrarily deep nesting may be -- achieved without sacrificing any additional bit-width. In addition, -- NestedVectors introduces less overhead than -- SegmentedVectors. For this reason, many hardware/platform -- providers choose to implement ScanVectorMachine V' (V S) -- instead of ScanVectorMachine (V S); this requires more work -- (more methods to implement), but eliminates the overhead of -- SegmentedVectors. module Control.Parallel.ScanVectorMachine.NestedVectors data VecPair v VecPair :: v -> v -> VecPair v check_eq :: t -> t1 -> t instance (ScanVectorMachine v s, ScanVectorMachine v' (v s)) => ScanVectorMachine VecPair (v' (v s)) -- | An instance -- --
-- instance Num s => SVM.ScanVectorMachine ([::]) s ---- -- ... demonstrating that the parallel arrays [:s:] of Data -- Parallel Haskell support the SVM operations. In truth this is a bit -- backward: DPH is a high-level nested data parallel language which -- ought to compile down to something like SVM. Unfortunately -- DPH's mapP allows closures and uncontained recursion into the -- parallel context, so this isn't possible. module Control.Parallel.ScanVectorMachine.DataParallelHaskellSVM -- | An instance -- --
-- instance Accelerate.IsScalar s => -- SVM.ScanVectorMachine (Accelerate.Array Accelerate.DIM1) s ---- -- demonstrating that the Data.Array.Accelerate library for GPU -- computation is able to support the SVM operations module Control.Parallel.ScanVectorMachine.AccelerateSVM instance IsScalar s => ScanVectorMachine (Array DIM1) s