generic-random- Generic random generators

Safe HaskellNone



Generic implementations of QuickCheck's arbitrary.


Define your type.

data Tree a = Leaf a | Node (Tree a) (Tree a)
  deriving Generic

Pick an arbitrary implementation, specifying the required distribution of data constructors.

instance Arbitrary a => Arbitrary (Tree a) where
  arbitrary = genericArbitrary (8 % 9 % ())

arbitrary :: Gen (Tree a) picks a Leaf with probability 9/17, or a Node with probability 8/17, and recursively fills their fields with arbitrary.

For Tree, genericArbitrary produces code equivalent to the following:

genericArbitrary :: Arbitrary a => Weights (Tree a) -> Gen (Tree a)
genericArbitrary (x % y % ()) =
    [ (x, Leaf <$> arbitrary)
    , (y, Node <$> arbitrary <*> arbitrary)

Distribution of constructors

The distribution of constructors can be specified as a special list of weights in the same order as the data type definition. This assigns to each constructor a probability proportional to its weight; in other words, p_C = weight_C / sumOfWeights.

The list of weights is built up with the (%) operator as a cons, and using the unit () as the empty list, in the order corresponding to the data type definition. The uniform distribution can be obtained with uniform.

Uniform distribution

You can specify the uniform distribution (all weights equal) with uniform. (genericArbitraryU is available as a shorthand for genericArbitrary uniform.)

Note that for many recursive types, a uniform distribution tends to produce big or even infinite values.

Typed weights

GHC 8.0.1 and above only (base ≥ 4.9).

The weights actually have type W "ConstructorName" (just a newtype around Int), so that you can annotate a weight with its corresponding constructor, and it will be checked that you got the order right.

This will type-check.

((x :: W "Leaf") % (y :: W "Node") % ()) :: Weights (Tree a)
(x % (y :: W "Node") % ()) :: Weights (Tree a)

This will not.

((x :: W "Node") % y % ()) :: Weights (Tree a)
-- Requires an order of constructors different from the definition of the Tree type.

(x % y % z % ()) :: Weights (Tree a)
-- Doesn't have the right number of weights.

Ensuring termination

As mentioned earlier, one must be careful with recursive types to avoid producing extremely large values.

The alternative generator genericArbitrary' implements a simple strategy to keep values at reasonable sizes: the size parameter of Gen is divided among the fields of the chosen constructor. When it reaches zero, the generator selects a small term of the given type. This generally ensures that the number of constructors remains close to the initial size parameter passed to Gen.

genericArbitrary' (x1 % ... % xn % ())

Here is an example with nullary constructors:

data Bush = Leaf1 | Leaf2 | Node3 Bush Bush Bush
  deriving Generic

instance Arbitrary Bush where
  arbitrary = genericArbitrary' (1 % 2 % 3 % ())

Here, genericArbitrary' is equivalent to:

genericArbitrary' :: Weights Bush -> Gen Bush
genericArbitrary' (x % y % z % ()) =
  sized $ \n ->
    if n == 0 then
      -- If the size parameter is zero, only nullary alternatives are kept.
      elements [Leaf1, Leaf2]
        [ (x, return Leaf1)
        , (y, return Leaf2)
        , (z, resize (n `div` 3) node)  -- 3 because Node3 is 3-ary
    node = Node3 <$> arbitrary <*> arbitrary <*> arbitrary

If we want to generate a value of type Tree (), there is a value of depth 1 that we can use to end recursion: Leaf ().

genericArbitrary' :: Weights (Tree ()) -> Gen (Tree ())
genericArbitrary' (x % y % ()) =
  sized $ \n ->
    if n == 0 then
      return (Leaf ())
        [ (x, Leaf <$> arbitrary)
        , (y, resize (n `div` 2) $ Node <$> arbitrary <*> arbitrary)

Because the argument of Tree must be inspected in order to discover values of type Tree (), we incur some extra constraints if we want polymorphism.

{-# LANGUAGE FlexibleContexts, UndecidableInstances #-}

instance (Arbitrary a, BaseCase (Tree a))
  => Arbitrary (Tree a) where
  arbitrary = genericArbitrary' (1 % 2 % ())

By default, the BaseCase type class looks for all values of minimal depth (constructors have depth 1 + max(0, depths of fields)).

This can easily be overriden by declaring a specialized BaseCase instance, such as this one:

instance Arbitrary a => BaseCase (Tree a) where
  baseCase = oneof [leaf, simpleNode]
      leaf = Leaf <$> arbitrary
      simpleNode = Node <$> leaf <*> leaf

An alternative base case can also be specified directly in the arbitrary definition with the withBaseCase combinator.

genericArbitraryRec is a variant of genericArbitrary' with no base case.

instance Arbitrary Bush where
  arbitrary =
    genericArbitraryRec (1 % 2 % 3 % ())
      `withBaseCase` return Leaf1

Custom generators for some fields

Sometimes, a few fields may need custom generators instead of arbitrary. For example, imagine here that String is meant to represent alphanumerical strings only, and that IDs are meant to be nonnegative, whereas balances can have any sign.

data User = User {
  userName :: String,
  userId :: Int,
  userBalance :: Int
  } deriving Generic
  • Arbitrary String may generate any unicode characters, alphanumeric or not;
  • Arbitrary Int may generate negative values;
  • using newtype wrappers or passing generators explicitly to properties may be impractical (the maintenance overhead can be high because the types are big or change often).

Using generic-random, the alternative is to declare a (heterogeneous) list of generators to be used when generating certain fields...

customGens :: GenList '[Field "userId" Int, String]
customGens =
  (Field . getNonNegative <$> arbitrary) :@
  (listOf (elements (filter isAlphaNum [minBound .. maxBound]))) :@

And to use the genericArbitraryG and variants that accept those explicit generators.

  • All String fields will use the provided generator of alphanumeric strings;
  • the field "userId" of type Int will use the generator of nonnegative integers (the Field type is special);
  • everything else defaults to arbitrary.
instance Arbitrary User where
  arbitrary = genericArbitrarySingleG customGens