map-reduce-folds-0.1.0.5: foldl wrappers for map-reduce

Control.MapReduce.Engines

Description

Types and functions used by all the engines. Notes:

1. The provided grouping functions group elements into a Seq as this is a good default choice.
2. The Streamly engine is the fastest in my benchmarks. It's the engine used by default if you import Control.MapReduce.Simple.
3. All the engines take a grouping function as a parameter and default ones are provided. For simple map/reduce, the grouping step may be the bottleneck and I wanted to leave room for experimentation. I've tried (and failed!) to find anything faster than using Map or HashMap via toList . fromListWith (<>).
Synopsis

# Fold Types

type MapReduceFold y k c q x d = Unpack x y -> Assign k y c -> Reduce k c d -> Fold x (q d) Source #

Type-alias for a map-reduce-fold engine

type MapReduceFoldM m y k c q x d = UnpackM m x y -> AssignM m k y c -> ReduceM m k c d -> FoldM m x (q d) Source #

Type-alias for a monadic (effectful) map-reduce-fold engine

# Engine Helpers

reduceFunction :: (Foldable h, Functor h) => Reduce k x d -> k -> h x -> d Source #

Turn Reduce into a function we can apply

reduceFunctionM :: (Traversable h, Monad m) => ReduceM m k x d -> k -> h x -> m d Source #

Turn ReduceM into a function we can apply

# groupBy Helpers

fromListWithHT :: forall h k v s. (HashTable h, Eq k, Hashable k) => (v -> v -> v) -> [(k, v)] -> ST s (h s k v) Source #

an implementation of fromListWith for mutable hashtables from the hastables package. Basically a copy fromList from that package using mutate instead of insert to apply the given function if the was already in the map. Might not be the ideal implementation. Notes:

• This function is specific hashtable agnostic so you'll have to supply a specific implementation from the package via TypeApplication
• This function returns the hash-table in the ST monad. You can fold over it (using foldM from hashtables) and then use runST to get the grouped structure out.