criterion-0.3.0: Robust, reliable performance measurement and analysisSource codeContentsIndex
How to write benchmarks
Benchmarking IO actions
Benchmarking pure code
Fully evaluating a result
Constructing benchmarks
Running benchmarks
Other useful code
Wrappers for compiling and running benchmarks quickly and easily. See defaultMain below for an example.
class Benchmarkable a where
run :: a -> Int -> IO ()
data Benchmark
data B a = forall b . B (a -> b) a
bench :: Benchmarkable b => String -> b -> Benchmark
bgroup :: String -> [Benchmark] -> Benchmark
defaultMain :: [Benchmark] -> IO ()
defaultMainWith :: Config -> [Benchmark] -> IO ()
defaultOptions :: [OptDescr (IO Config)]
parseArgs :: Config -> [OptDescr (IO Config)] -> [String] -> IO (Config, [String])
How to write benchmarks

The Benchmarkable typeclass represents the class of all code that can be benchmarked. Every instance must run a benchmark a given number of times. We are most interested in benchmarking two things:

  • IO actions. Any IO action can be benchmarked directly.
  • Pure functions. GHC optimises aggressively when compiling with -O, so it is easy to write innocent-looking benchmark code that doesn't measure the performance of a pure function at all. We work around this by benchmarking both a function and its final argument together.
Benchmarking IO actions

Any IO action can be benchmarked easily if its type resembles this:

 IO a
Benchmarking pure code

Because GHC optimises aggressively when compiling with -O, it is potentially easy to write innocent-looking benchmark code that will only be evaluated once, for which all but the first iteration of the timing loop will be timing the cost of doing nothing.

To work around this, we provide two types for benchmarking pure code. The first is a specialised tuple:

 data B a = forall b. B (a -> b) a

The second is a specialised tuple named B:

 (a -> b, a)

As both of these types suggest, when you want to benchmark a function, you must supply two values:

  • The first element is the function, saturated with all but its last argument.
  • The second is the last argument to the function.

In practice, it is much easier to use the B tuple than a normal tuple. Using B, the type checker can see when the function type a -> b and its argument type a are the same, whereas code may require an explicit type annotation to make this connection explicit for a regular tuple. Here is an example that makes the distinction clearer. Suppose we want to benchmark the following function:

 firstN :: Int -> [Int]
 firstN k = take k [(0::Int)..]

So in the easy case, we construct a benchmark as follows:

 B firstN 1000

The compiler will correctly infer that the number 1000 must have the type Int, and the type of the expression is

 B [Int] Int

However, say we try to construct a benchmark using a tuple, as follows:

 (firstN, 1000)

Since we have written a numeric literal with no explicit type, the compiler will correctly infer a rather general type for this expression:

 (Num a) => (Int -> [Int], a)

This does not match the type (a -> b, a), so we would have to explicitly annotate the number 1000 as having the type Int for the typechecker to accept this as a valid benchmarkable expression.

Fully evaluating a result

The harness for evaluating a pure function only evaluates the result to weak head normal form (WHNF). If you need the result evaluated all the way to normal form, use the rnf function from the Control.Parallel.Strategies module to force its complete evaluation.

Using the firstN example from earlier, to naive eyes it appears that the following code ought to benchmark the production of the first 1000 list elements:

 B firstN 1000

Because the result is only forced until WHNF is reached, what this actually benchmarks is merely the production of the first list element! Here is a corrected version:

 B (rnf . firstN) 1000
class Benchmarkable a whereSource
A benchmarkable function or action.
:: aThe function or action to benchmark.
-> IntThe number of times to run or evaluate it.
-> IO ()
Run a function or action the specified number of times.
show/hide Instances
data Benchmark Source
A benchmark may consist of either a single Benchmarkable item with a name, created with bench, or a (possibly nested) group of Benchmarks, created with bgroup.
show/hide Instances
data B a Source
A container for a pure function to benchmark, and an argument to supply to it each time it is evaluated.
forall b . B (a -> b) a
show/hide Instances
Constructing benchmarks
:: Benchmarkable b
=> String
-> b
-> Benchmark
Create a single benchmark.
:: StringA name to identify the group of benchmarks.
-> [Benchmark]Benchmarks to group under this name.
-> Benchmark
Group several benchmarks together under a common name.
Running benchmarks
defaultMain :: [Benchmark] -> IO ()Source

An entry point that can be used as a main function.

 import Criterion.Main

 fib :: Int -> Int
 fib 0 = 0
 fib 1 = 1
 fib n = fib (n-1) + fib (n-2)

 main = defaultMain [
        bgroup "fib" [ bench "fib 10" $ B fib 10
                     , bench "fib 35" $ B fib 35
                     , bench "fib 37" $ B fib 37
defaultMainWith :: Config -> [Benchmark] -> IO ()Source

An entry point that can be used as a main function, with configurable defaults.


 import Criterion.Config
 import qualified Criterion.MultiMap as M

 myConfig = defaultConfig {
              -- Always display an 800x600 window with curves.
              cfgPlot = M.singleton KernelDensity (Window 800 600)
 main = defaultMainWith myConfig [
          bench "fib 30" $ B fib 30

If you save the above example as "Fib.hs", you should be able to compile it as follows:

 ghc -O --make Fib

Run "Fib --help" on the command line to get a list of command line options.

Other useful code
defaultOptions :: [OptDescr (IO Config)]Source
The standard options accepted on the command line.
parseArgs :: Config -> [OptDescr (IO Config)] -> [String] -> IO (Config, [String])Source
Parse command line options.
Produced by Haddock version 2.6.0