This package defines the Fold, NonemptyFold, and EffectfulFold types and
provides an assortment of ways to construct, combine, and use them.
Every gambler knows that the secret to surviving
Is knowing what to throw away and knowing what to keep
You got to know when to hold 'em, know when to fold 'em
Know when to walk away, and know when to run
— The Gambler by Don Schlitz, popularized by Kenny Rogers
Intro to Fold
The foldl' function in the base package is used when we want a strictly
evaluated result from traversing a list.
foldl' :: Foldable t => (b -> a -> b) -> b -> t a -> b
For example, to sum a list of numbers:
λ> import qualified Data.List as List
λ> List.foldl' (+) 0 [1..100]
5050
What if we put the first two parameters to List.foldl' into a datatype?
data Fold a b = Fold
{ initial :: b
, step :: b -> a -> b }
Or, better yet, we can use a trick to turn the datatype into a Functor (which
will become important when we discuss the Applicative a bit later):
data Fold a b = forall x. Fold
{ initial :: x
, step :: x -> a -> x
, extract :: x -> b }
We can then express the concept of numeric summation as:
sum :: Num a => Fold a a
sum = Fold{ initial = 0, step = (+), extract = id }
This Fold can be used to sum lists and other Foldable collections, but it
can also be used to sum effectful streams. So even without any further
mechanism, just having this datatype gives us some useful expressive power.
There is no need for each streaming library to duplicate all the work of
defining its own copies of sum, product, all, any, and, or,
minimum, maximum, etc.; a library that provides some kind of Stream type
needs only define a function to apply a fold to a stream ...
foldStream :: Fold a b -> Stream m a -> m b
... and then users can make use of any library of folds that they may find or
concoct. gambler itself contains much of the functionality of the standard
Data.List module, but there are more things in heaven and earth than are
dreamt of in this package.
Intro to NonemptyFold
There are some kinds of folding that only work if the input it nonempty.
Suppose, for example, we want the greatest of all the items. If there are no
items, there is no greatest item. We express this sort of thing with a slight
modification to Fold:
data NonemptyFold a b = forall x. NonemptyFold
{ initial :: a -> x
, step :: x -> a -> x
, extract :: x -> b }
The only thing that's different is the type of the initial field has changed
from x to a -> x; it is now parameterized on the first item.
The notion of selecting greatest item can now be expressed as:
maximum = NonemptyFold{ initial = id, step = max, extract = id }
A NonemptyFold can be converted to a Fold using Fold.Pure.nonemptyFold.
The conversion changes the fold's return type from b to Maybe b to
accommodate the possibility of empty input.
Intro to EffectfulFold
There is a related function in base that does the same thing as foldl' but
in a monadic context:
foldM :: Foldable t => Monad m => (b -> a -> m b) -> b -> t a -> m b
This allows us to perform effects as we fold.
λ> import qualified Control.Monad as Monad
λ> Monad.foldM (\x a -> putStrLn ("* " <> show a) $> (x + a)) 0 [1..5]
* 1
* 2
* 3
* 4
* 5
15
The type we define corresponding to the arguments of Monad.foldM is:
data EffectfulFold m a b = forall x. EffectfulFold
{ initial :: m x
, step :: x -> a -> m x
, extract :: x -> m b }
A regular Fold can be converted to an EffectfulFold of any monad using
Fold.Effectful.fold.
The Applicative instances
The Fold and EffectfulFold applicatives are great for computing multiple folds
over a collection in one pass over the data. For example, suppose that you want
to compute both the sum and the length of a list. The following approach works,
but it uses space inefficiently:
import qualified Data.List as List
sumAndLength :: Num a => [a] -> (a, Natural)
sumAndLength xs = (List.sum xs, List.genericLength xs)
The problem is this goes over the list in two passes. If you demand the result
of sum, the Haskell runtime will materialize the entire list. However, the
runtime cannot garbage collect the list because the list is still required for
the call to length. The space requirement of sumAndLength is therefore
linear with respect to the size of the list. We can do much better.
With gambler, we can instead write:
import qualified Fold
sumAndLength :: Num a => [a] -> (a, Natural)
sumAndLength = Fold.runFold $ (,) <$> Fold.sum <*> Fold.length
This achieves the same result using constant space.
Quick start
To get quickly playing around with gambler, launch GHCi using cabal:
cabal repl --build-depends gambler
The example from the previous section can be run as follows:
λ> import qualified Fold
λ> Fold.runFold ((,) <$> Fold.sum <*> Fold.length) [1..1000000]
(500000500000,1000000)
This gambler package is mostly a copy of foldl, with some features removed
to minimize its dependency set. What remains in gambler is essentially the
same as what can be found in foldl version 1.4.13, subject only to
reorganization, renaming, and minor modifications.
Future plans
Once the Foldable1 class has been added to base, the type of
Fold.Nonempty.run may be generalized to accommodate it.