{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}
module Control.Parallel.ScanVectorMachine.ScanVectorMachine where

-- | Scalar operations which may be performed on the elements of a
--   vector, either elementwise or in prefix-scan form.
data Op = And | Or | Min | Max | Plus | Times

--
-- | An instance of @ScanVectorMachine@ provides a scalar type @s@,
--   vectors of type @v s@ over that scalar of type, and the full
--   suite of Scan Vector Machine (SVM) operations (Blelloch'90,
--   page 60) on those vectors.  The SVM instruction set is sometimes
--   referred to as /VCODE/ (CMU tech report CMU-CS-90-146-R).
--
-- Only two changes have been made: (1) booleans are encoded as
-- scalars (zero is false, nonzero is true) and (2) Belloch's
-- elementwise subtraction has been replaced with a unary @neg@
-- operation; this way the set of elementwise and scan operations are
-- the same (subtraction is not associative).
--
-- Many of the names below overlap with those in the Prelude; we
-- recommend @import qualified ScanVectorMachine as SVM@ so that these
-- may be referred to as, for example, @SVM.length@.
--
-- Notice that there is no @map :: (s -> s) -> v s -> v s@; this is
-- essential to keeping /closures/ and /uncontained recursion/ out of the
-- parallel context.  See Blelloch 10.6.2 for the definition of
-- contained recursion.
--
-- Also notice that only three operations involve communication
-- between different parts of the paralell context: @distribute@,
-- @scan@, and @permute@.  The @distribute@ operation performs
-- broadcast communication from the serial context to the parallel
-- context.  The @scan@ operation performs prefix scans, which have
-- very efficient communication patterns (do a local scan, then a
-- global tree reduction, then a local distribution, then an
-- elementwise operation).  Only the @permute@ operation involves
-- complicated communication patterns.  This is mitigated to some
-- extent by the requirement that @permute@ must be a /permutation/ of
-- the vector; it is an error to send two elements to the same
-- destination index, or to have a destination index to which no
-- element is sent.
--
class ScanVectorMachine v s where

  -- | Scalar negation all of the elements of the vector.
  neg         ::       v s -> v s

  -- | Elementwise less-than-or-equal-to comparison.  Both vectors must be the same length.
  leq         ::       v s -> v s  -> v s       

  -- | Elementwise operations (see @Op@).  Both vectors must be the same length.
  op          :: Op -> v s -> v s -> v s        

  -- | Prefix scan operations (see @Op@).
  scan        :: Op -> v s -> v s               

  -- | If-then-else; @select b x y@ returns a vector whose @i@^th element is @if b[i] then x[i] else y[i]@.
  --   All three vectors must be the same length.
  select      ::       v s -> v s -> v s -> v s 

  -- | Permutation: @permute v1 v2@ returns a vector @v3@ where @v3[v2[i]] = v1[i]@ for all @i@.  Both vectors
  --   must be the same length and the elements of @v2@ must all be distinct, non-negative, and
  --   less than the lengths of the vectors.
  permute     ::       v s -> v s -> v s        

  -- | Replaces an element of a vector; @insert v s i e@ sets @i@^th element of the vector to @s@.  The scalar @i@ must be
  --   nonnegative and less than the length of the vector.  This instruction implements unicast communication from the
  --   serial context to the parallel context.
  insert      ::       v s -> s -> s -> v s     

  -- | Extracts an element of a vector; @extract v i@ yields @v[i]@.  The scalar @i@ must be nonnegative and less than
  --   the length of the vector.  This instruction implements communication from the parallel context to the serial context.
  extract     ::       v s -> s -> s            

  -- | Creates a new vector; @distribute s n@ creates a vector of length @n@ whose elements are all @s@.
  --   This instruction implements communication from the parallel context to the serial context.
  distribute  ::         s -> s -> v s          

  -- | Returns the length of a parallel vector.  These can be cached in the serial context since the length of a vector
  --   never depends on data from the paralell context; as a result @length@ does not actually involve communication.
  length      ::       v s -> s