Safe Haskell: None

 reduceL :: Foldl i a b -> (b -> a -> b) -> StatefulWork i a b
 reduceLeftM :: Foldl i a b -> (b -> a -> IO b) -> StatefulWork i a b
 reduceR :: Foldr i a b -> (a -> b -> b) -> StatefulWork i a b
 reduceRightM :: Foldr i a b -> (a -> b -> IO b) -> StatefulWork i a b
 mutate :: Fill i a -> (s -> a -> IO ()) -> StatefulWork i a s
 imutate :: Fill i a -> (s -> i -> a -> IO ()) -> StatefulWork i a s
 work :: (USource r l sh a, PreferredWorkIndex l sh i) => StatefulWork i a s -> IO s -> UArray r l sh a -> IO s
 iwork :: USource r l sh a => StatefulWork sh a s -> IO s -> UArray r l sh a -> IO s
 rangeWork :: USource r l sh a => StatefulWork sh a s -> IO s -> UArray r l sh a -> sh -> sh -> IO s
 workP :: (USource r l sh a, PreferredWorkIndex l sh i) => Threads -> StatefulWork i a s -> IO s -> (s -> s -> IO s) -> UArray r l sh a -> IO s
 iworkP :: USource r l sh a => Threads -> StatefulWork sh a s -> IO s -> (s -> s -> IO s) -> UArray r l sh a -> IO s
 rangeWorkP :: USource r l sh a => Threads -> StatefulWork sh a s -> IO s -> (s -> s -> IO s) -> UArray r l sh a -> sh -> sh -> IO s
 workOnSlicesSeparate :: (UVecSource r slr l sh v e, PreferredWorkIndex l sh i) => StatefulWork i e s -> IO s -> UArray r l sh (v e) -> IO (VecList (Dim v) s)
 iworkOnSlicesSeparate :: UVecSource r slr l sh v e => StatefulWork sh e s -> IO s -> UArray r l sh (v e) -> IO (VecList (Dim v) s)
 rangeWorkOnSlicesSeparate :: UVecSource r slr l sh v e => StatefulWork sh e s -> IO s -> UArray r l sh (v e) -> sh -> sh -> IO (VecList (Dim v) s)
 workOnSlicesSeparateP :: (UVecSource r slr l sh v e, PreferredWorkIndex l sh i) => Threads -> StatefulWork i e s -> IO s -> (s -> s -> IO s) -> UArray r l sh (v e) -> IO (VecList (Dim v) s)
 iworkOnSlicesSeparateP :: UVecSource r slr l sh v e => Threads -> StatefulWork sh e s -> IO s -> (s -> s -> IO s) -> UArray r l sh (v e) -> IO (VecList (Dim v) s)
 rangeWorkOnSlicesSeparateP :: UVecSource r slr l sh v e => Threads -> StatefulWork sh e s -> IO s -> (s -> s -> IO s) -> UArray r l sh (v e) -> sh -> sh -> IO (VecList (Dim v) s)
 type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s
 type Foldl sh a b = (b -> sh -> a -> IO b) -> StatefulWork sh a b
 type Foldr sh a b = (sh -> a -> b -> IO b) -> StatefulWork sh a b
Fold combinators
To build similar combinators of your own, see the source of these four functions.
reduceL
:: Foldl i a b  
-> (b -> a -> b)  Pure left reduce 
-> StatefulWork i a b  Resulting stateful work, to be passed to work runners 
O(0)
reduceLeftM
:: Foldl i a b  
-> (b -> a -> IO b)  Monadic left reduce 
-> StatefulWork i a b  Resulting stateful work, to be passed to work runners 
O(0)
reduceR
:: Foldr i a b  
-> (a -> b -> b)  Pure right reduce 
-> StatefulWork i a b  Resulting stateful work, to be passed to work runners 
O(0)
reduceRightM
:: Foldr i a b  
-> (a -> b -> IO b)  Monadic right reduce 
-> StatefulWork i a b  Resulting stateful work, to be passed to work runners 
O(0)
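As a rough illustration of how these combinators adapt a plain reducer to the generalized fold, here is a minimal sketch that does not use the library itself. The `Work` alias, the one-dimensional `Int` index, and the names `foldlInt`, `reduceLSketch` and `reduceLeftMSketch` are assumptions made for this example; the real definitions may differ.

```haskell
-- Simplified stand-ins for the library's aliases (assumed shapes, 1-D Int index).
type Work sh s = sh -> sh -> IO s                         -- run over [start, end)
type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s
type Foldl sh a b = (b -> sh -> a -> IO b) -> StatefulWork sh a b

-- Hypothetical strict left fold over an Int index range.
foldlInt :: Foldl Int a b
foldlInt step mkZero index start end = mkZero >>= go start
  where
    go i acc
      | i >= end  = return acc
      | otherwise = do
          x    <- index i
          acc' <- step acc i x
          acc' `seq` go (i + 1) acc'

-- Adapt a pure, non-indexed reducer: ignore the index, wrap the result in IO.
reduceLSketch :: Foldl i a b -> (b -> a -> b) -> StatefulWork i a b
reduceLSketch fold f = fold (\acc _ x -> return (f acc x))

-- Adapt a monadic, non-indexed reducer: only the index is dropped.
reduceLeftMSketch :: Foldl i a b -> (b -> a -> IO b) -> StatefulWork i a b
reduceLeftMSketch fold f = fold (\acc _ x -> f acc x)
```

Both adapters only wrap a function, which is consistent with the O(0) note: no array element is touched until a runner applies the resulting work.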
Combinators to work with mutable state
Added specially to improve performance of tasks like histogram filling.
Unfortunately, GHC does not recognize that the folding state is not changed as an ADT in such cases, and so does not lift its evaluation out of the folding routine.
mutate
:: Fill i a  
-> (s -> a -> IO ())  (state -> array element -> (state has changed))  State mutating function 
-> StatefulWork i a s  Resulting stateful work, to be passed to work runners 
O(0)
imutate
:: Fill i a  
-> (s -> i -> a -> IO ())  Indexed state mutating function 
-> StatefulWork i a s  Resulting stateful work, to be passed to work runners 
O(0) Version of mutate that accepts a mutating function which additionally takes the array index.
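The idea behind the mutable-state combinators can be sketched as follows: the state is created once, the element loop only performs side effects on it, and the same state reference is returned at the end. This is a standalone model, not the library code; the `Work` and `Fill` aliases, the `Int` index, and the names `fillInt`, `mutateSketch`, `imutateSketch` and `addInto` are assumptions for illustration.

```haskell
import Data.IORef (IORef, modifyIORef')

-- Simplified stand-ins for the library's aliases (assumed shapes, 1-D Int index).
type Work sh s = sh -> sh -> IO s
type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s
type Fill sh a = (sh -> IO a) -> (sh -> a -> IO ()) -> Work sh ()

-- Hypothetical fill loop: read each element and hand it to a writer.
fillInt :: Fill Int a
fillInt index write start end = go start
  where
    go i
      | i >= end  = return ()
      | otherwise = index i >>= write i >> go (i + 1)

-- Indexed version: create the state once, mutate it per element, return it.
imutateSketch :: Fill i a -> (s -> i -> a -> IO ()) -> StatefulWork i a s
imutateSketch fill f mkState index start end = do
  s <- mkState
  fill index (f s) start end
  return s

-- Non-indexed version: the same, with the index ignored.
mutateSketch :: Fill i a -> (s -> a -> IO ()) -> StatefulWork i a s
mutateSketch fill f = imutateSketch fill (\s _ x -> f s x)

-- Example mutating function: accumulate elements into a mutable counter.
addInto :: IORef Int -> Int -> IO ()
addInto ref x = modifyIORef' ref (+ x)
```

Because the loop never rebinds the state, there is no accumulator value threaded through each iteration, which is the situation the histogram-filling remark above describes.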
Work runners
work
:: (USource r l sh a, PreferredWorkIndex l sh i)  
=> StatefulWork i a s  Stateful working function 
-> IO s  Monadic initial state (fold zero). Wrap pure state in return. 
-> UArray r l sh a  Source array 
-> IO s  Final state (fold result) 
rangeWork
:: USource r l sh a  
=> StatefulWork sh a s  Stateful working function 
-> IO s  Monadic initial state (fold zero). Wrap pure state in return. 
-> UArray r l sh a  Source array 
-> sh  Top-left 
-> sh  and bottom-right corners of range to work in 
-> IO s  Final state (fold result) 
O(n) Run stateful work in specified range of indices.
workP
:: (USource r l sh a, PreferredWorkIndex l sh i)  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork i a s  Associative stateful working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh a  Source array 
-> IO s  Gathered state (fold result) 
O(n) Run associative non-indexed stateful work in parallel.
Example: associative image histogram filling in the test https://github.com/leventov/yarr/blob/master/tests/lumequalization.hs
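To make the roles of the zero state and the joining function concrete, here is a minimal standalone sketch of one way a parallel runner could be organized: split the range into chunks, run the work on each chunk from a fresh zero state, then join the per-chunk states in order. It is not the library's implementation. The thread count is a plain `Int` here, the index is one-dimensional, and `workPSketch` and `sumWork` are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (foldM)

-- Simplified stand-ins for the library's aliases (assumed shapes, 1-D Int index).
type Work sh s = sh -> sh -> IO s
type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s

-- Hypothetical runner: one chunk per thread, each starting from the zero state.
-- A worker that throws leaves its MVar empty; error handling is omitted.
workPSketch
  :: Int -> StatefulWork Int a s -> IO s -> (s -> s -> IO s)
  -> (Int -> IO a) -> Int -> Int -> IO s
workPSketch threads work mkZero joinStates index start end = do
  vars    <- mapM spawn [0 .. n - 1]
  results <- mapM takeMVar vars
  case results of
    (r : rs) -> foldM joinStates r rs   -- join in chunk order
    []       -> mkZero                  -- not reached, since n >= 1
  where
    n       = max 1 threads
    len     = max 0 (end - start)
    bound k = start + (len * k) `div` n
    spawn k = do
      var <- newEmptyMVar
      _   <- forkIO (work mkZero index (bound k) (bound (k + 1)) >>= putMVar var)
      return var

-- Example stateful work: a strict sum over the chunk.
sumWork :: StatefulWork Int Int Int
sumWork mkZero index start end = mkZero >>= go start
  where
    go i acc
      | i >= end  = return acc
      | otherwise = do
          x <- index i
          let acc' = acc + x
          acc' `seq` go (i + 1) acc'
```

The result matches the sequential one only when the work and the joining function are associative and the zero state is neutral for the join, since every chunk starts from its own copy of the zero state.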
iworkP
:: USource r l sh a  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork sh a s  Associative stateful working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh a  Source array 
-> IO s  Gathered state (fold result) 
O(n) Run associative indexed stateful work in parallel.
rangeWorkP
:: USource r l sh a  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork sh a s  Associative stateful working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh a  Source array 
-> sh  Top-left 
-> sh  and bottom-right corners of range to work in 
-> IO s  Gathered state (fold result) 
O(n) Run associative stateful work in specified range in parallel.
workOnSlicesSeparate
:: (UVecSource r slr l sh v e, PreferredWorkIndex l sh i)  
=> StatefulWork i e s  Stateful slice-wise working function 
-> IO s  Monadic initial state (fold zero). Wrap pure state in return. 
-> UArray r l sh (v e)  Source array of vectors 
-> IO (VecList (Dim v) s)  Vector of final states (fold results) 
O(n) Run non-indexed stateful work over each slice of array of vectors.
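The slice-wise runners can be pictured as running the same stateful work once per vector component, each run starting from its own initial state, and collecting one final state per component. The sketch below models this without the library: plain lists stand in for both the fixed-size vectors and `VecList`, the index is a one-dimensional `Int`, and `workOnSlicesSketch` and `sumWork` are hypothetical names.

```haskell
-- Simplified stand-ins for the library's aliases (assumed shapes, 1-D Int index).
type Work sh s = sh -> sh -> IO s
type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s

-- Hypothetical runner: for each component c, view the array of vectors as an
-- array of that component only (a "slice") and run the work over it.
-- Every vector is assumed to have at least `dim` components.
workOnSlicesSketch
  :: Int -> StatefulWork Int e s -> IO s -> (Int -> IO [e]) -> Int -> Int -> IO [s]
workOnSlicesSketch dim work mkZero index start end =
  mapM runSlice [0 .. dim - 1]
  where
    runSlice c = work mkZero (\i -> fmap (!! c) (index i)) start end

-- Example stateful work: a strict sum over one slice.
sumWork :: StatefulWork Int Int Int
sumWork mkZero index start end = mkZero >>= go start
  where
    go i acc
      | i >= end  = return acc
      | otherwise = do
          x <- index i
          let acc' = acc + x
          acc' `seq` go (i + 1) acc'
```

For an image whose elements are colour triples, for instance, this shape yields one result per channel rather than a single combined result.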
iworkOnSlicesSeparate
:: UVecSource r slr l sh v e  
=> StatefulWork sh e s  Stateful slice-wise working function 
-> IO s  Monadic initial state (fold zero). Wrap pure state in return. 
-> UArray r l sh (v e)  Source array of vectors 
-> IO (VecList (Dim v) s)  Vector of final states (fold results) 
O(n) Run indexed stateful work over each slice of array of vectors.
rangeWorkOnSlicesSeparate
:: UVecSource r slr l sh v e  
=> StatefulWork sh e s  Stateful slice-wise working function 
-> IO s  Monadic initial state (fold zero). Wrap pure state in return. 
-> UArray r l sh (v e)  Source array of vectors 
-> sh  Top-left 
-> sh  and bottom-right corners of range to work in 
-> IO (VecList (Dim v) s)  Vector of final states (fold results) 
O(n) Run stateful work in specified range over each slice of array of vectors.
workOnSlicesSeparateP
:: (UVecSource r slr l sh v e, PreferredWorkIndex l sh i)  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork i e s  Stateful slice-wise working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh (v e)  Source array of vectors 
-> IO (VecList (Dim v) s)  Vector of gathered per-slice results 
O(n) Run associative non-indexed stateful work over slices of array of vectors in parallel.
iworkOnSlicesSeparateP
:: UVecSource r slr l sh v e  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork sh e s  Stateful slice-wise working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh (v e)  Source array of vectors 
-> IO (VecList (Dim v) s)  Vector of gathered per-slice results 
O(n) Run associative indexed stateful work over slices of array of vectors in parallel.
rangeWorkOnSlicesSeparateP
:: UVecSource r slr l sh v e  
=> Threads  Number of threads to parallelize work on 
-> StatefulWork sh e s  Stateful slice-wise working function 
-> IO s  Monadic zero state. Wrap pure state in return. 
-> (s -> s -> IO s)  Associative monadic state joining function 
-> UArray r l sh (v e)  Source array of vectors 
-> sh  Top-left 
-> sh  and bottom-right corners of range to work in 
-> IO (VecList (Dim v) s)  Vector of gathered per-slice results 
O(n) Run associative stateful work in specified range over slices of array of vectors in parallel.
Aliases for work types
type StatefulWork sh a s
= IO s  Initial state 
-> (sh -> IO a)  Indexing function 
-> Work sh s  Curried result function: the worker, which emits the final state 
Generalizes both partially applied left and right folds, as well as work on mutable state.
To be passed to fold runners from the Data.Yarr.Work module.
type Foldl sh a b
= (b -> sh -> a -> IO b)  Generalized left reduce 
-> StatefulWork sh a b  Curried result: stateful work 
Generalizes left-to-right folds.
To be passed to fold combinators from the Data.Yarr.Work module.
type Foldr sh a b
= (sh -> a -> b -> IO b)  Generalized right reduce 
-> StatefulWork sh a b  Curried result: stateful work 
Generalizes right-to-left folds.
To be passed to fold combinators from the Data.Yarr.Work module.
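The difference between the two fold aliases is the traversal direction and the argument order of the reducing function. The following standalone sketch models both over a one-dimensional `Int` range; the `Work` alias and the names `foldlInt` and `foldrInt` are assumptions for this example and are not taken from the library.

```haskell
-- Simplified stand-ins for the aliases above (assumed Work shape, 1-D Int index).
type Work sh s = sh -> sh -> IO s                         -- run over [start, end)
type StatefulWork sh a s = IO s -> (sh -> IO a) -> Work sh s
type Foldl sh a b = (b -> sh -> a -> IO b) -> StatefulWork sh a b
type Foldr sh a b = (sh -> a -> b -> IO b) -> StatefulWork sh a b

-- Left fold: visits indices start, start + 1, ..., end - 1; state comes first.
foldlInt :: Foldl Int a b
foldlInt step mkZero index start end = mkZero >>= go start
  where
    go i acc
      | i >= end  = return acc
      | otherwise = do
          x    <- index i
          acc' <- step acc i x
          go (i + 1) acc'

-- Right fold: visits indices end - 1, ..., start; state comes last.
foldrInt :: Foldr Int a b
foldrInt step mkZero index start end = mkZero >>= go (end - 1)
  where
    go i acc
      | i < start = return acc
      | otherwise = do
          x    <- index i
          acc' <- step i x acc
          go (i - 1) acc'
```

Consing each element onto a list state shows the difference: the right fold yields the elements in index order, while the left fold yields them reversed.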