# bench-show: Show, plot and compare benchmark results

Generate text reports and graphical charts from benchmark results produced by gauge or criterion, and show or compare benchmarks in many useful ways. In a few lines of code, we can report time taken, peak memory usage, allocations, and many other fields; arrange benchmarks in groups and compare the groups; compare benchmark results before and after a change; show the difference from the baseline in absolute terms or as a percentage; and sort the results by percentage change to find the worst affected benchmarks.

bench-show helps us answer questions like the following, visually or textually:

• Across two benchmark runs, show all the operations that resulted in a regression of more than 10%, so that we can quickly identify and fix performance problems in our application.

• Across two (or more) packages (providing similar functionality), show all the operations where the performance differs by more than 10%, so that we can critically analyze the packages and choose the right one.

Quick Start: Use gauge or criterion to generate a results.csv file, and then use the following code to generate a textual report or a graph:

report "results.csv"  Nothing defaultConfig
graph  "results.csv" "output" defaultConfig


For example, you can show the % regression from a baseline in descending order textually as follows:

(time)(Median)(Diff using min estimator)
Benchmark streamly(0)(μs)(base) streamly(1)(%)(-base)
--------- --------------------- ---------------------
zip                      644.33                +23.28
map                      653.36                 +7.65
fold                     639.96                -15.63


The same comparison can also be shown graphically, as a bar chart.

See the README and the BenchShow.Tutorial module for comprehensive documentation.

## Reports and Charts

report with Fields presentation style generates a multi-column report. We can select many fields from a gauge raw report. Units of the fields are automatically determined based on the range of values:

report "results.csv" Nothing defaultConfig { presentation = Fields }

Benchmark     time(μs) maxrss(MiB)
------------- -------- -----------
vector/fold     641.62        2.75
streamly/fold   639.96        2.75
vector/map      638.89        2.72
streamly/map    653.36        2.66
vector/zip      651.42        2.58
streamly/zip    644.33        2.59


graph generates one bar chart per field:

graph "results.csv" "output" defaultConfig


When the input file contains results from a single benchmark run, by default all the benchmarks are placed in a single benchmark group named "default".

## Grouping

Let's write a benchmark classifier to put the streamly and vector benchmarks in their own groups:

   classifier name =
       case splitOn "/" name of
           grp : bench -> Just (grp, concat bench)
           _           -> Nothing
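As a variant, the same idea can be sketched using only base, without splitOn from the split package. This is an illustrative alternative rather than the classifier used above: it splits at the first "/" only (so "a/b/c" becomes group "a", benchmark "b/c") and returns Nothing for names that have no "/" or an empty part.

```haskell
-- Base-only classifier sketch: split a benchmark name at the first '/'.
-- The part before the slash is the group, the rest is the benchmark name.
classifier :: String -> Maybe (String, String)
classifier name =
    case break (== '/') name of
        (grp, '/' : bench)
            | not (null grp), not (null bench) -> Just (grp, bench)
        -- No slash, or an empty group or benchmark name: leave unclassified.
        _ -> Nothing
```

With this definition, "streamly/fold" is classified as group "streamly" and benchmark "fold".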


Now we can show the two benchmark groups as separate columns. We can generate reports comparing different benchmark fields (e.g. time and maxrss) for all the groups:

   report "results.csv" Nothing
       defaultConfig { classifyBenchmark = classifier }

(time)(Median)
Benchmark streamly(μs) vector(μs)
--------- ------------ ----------
fold            639.96     641.62
map             653.36     638.89
zip             644.33     651.42
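To make the reshaping concrete, here is a small base-only sketch of how a classifier turns a flat list of results into one row per benchmark with one value per group. It only illustrates the idea and is not bench-show's implementation; the names times and byBenchmark are hypothetical, and the numbers are the time(μs) column of the Fields report shown earlier.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Split "group/benchmark" at the first '/'.
classifier :: String -> Maybe (String, String)
classifier name =
    case break (== '/') name of
        (grp, '/' : bench) -> Just (grp, bench)
        _ -> Nothing

-- Flat results: (full benchmark name, time in μs).
times :: [(String, Double)]
times =
    [ ("vector/fold", 641.62), ("streamly/fold", 639.96)
    , ("vector/map", 638.89), ("streamly/map", 653.36)
    , ("vector/zip", 651.42), ("streamly/zip", 644.33)
    ]

-- One row per benchmark; each row lists (group, value) sorted by group name.
-- Names that the classifier rejects are dropped.
byBenchmark :: [(String, Double)] -> [(String, [(String, Double)])]
byBenchmark results =
    [ (bench, sortOn fst (map snd rows))
    | rows@((bench, _) : _) <- groupBy ((==) `on` fst) (sortOn fst classified)
    ]
  where
    classified =
        [ (b, (grp, v)) | (name, v) <- results, Just (grp, b) <- [classifier name] ]
```

Evaluating byBenchmark times yields rows for fold, map and zip, each with a streamly value followed by a vector value, which mirrors the two columns of the report.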


We can do the same graphically as well: just replace report with graph in the code above. Each group is shown as a cluster on the graph, and multiple clusters are placed side by side (i.e. on the same scale) for easy comparison.

## Regression, Percentage Difference and Sorting

We can append benchmark results from multiple runs to the same file. These runs can then be compared. We can run benchmarks before and after a change and then report the regressions sorted by percentage change.

Given a results file with two runs, this code generates the report that follows:

   report "results.csv" Nothing
       defaultConfig
           { classifyBenchmark = classifier
           , presentation = Groups PercentDiff
           , selectBenchmarks = \f ->
                 reverse
                 $ map fst
                 $ sortBy (comparing snd)
                 $ either error id $ f $ ColumnIndex 1
           }

(time)(Median)(Diff using min estimator)
Benchmark streamly(0)(μs)(base) streamly(1)(%)(-base)
--------- --------------------- ---------------------
zip                      644.33                +23.28
map                      653.36                 +7.65
fold                     639.96                -15.63


This tells us that in the second run the worst affected benchmark is zip, which takes 23.28 percent more time than the baseline.
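The ordering step can be tried on its own, outside bench-show. The sketch below applies the same reverse, map fst, sortBy (comparing snd) pipeline to the percentage differences from the report above. The helper names worstFirst and regressedBy are hypothetical, and regressedBy is an illustrative extra (not part of bench-show) that keeps only regressions above a given percentage.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- (benchmark name, percentage change versus the baseline), from the report.
percentDiffs :: [(String, Double)]
percentDiffs = [("fold", -15.63), ("zip", 23.28), ("map", 7.65)]

-- Sort ascending by percentage change, keep the names, then reverse so that
-- the largest regression comes first.
worstFirst :: [(String, Double)] -> [String]
worstFirst = reverse . map fst . sortBy (comparing snd)

-- Keep only benchmarks whose change exceeds the limit, worst first.
regressedBy :: Double -> [(String, Double)] -> [String]
regressedBy limit = worstFirst . filter ((> limit) . snd)
```

Here worstFirst percentDiffs gives ["zip","map","fold"], matching the row order of the report, and regressedBy 10 percentDiffs gives ["zip"].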

The same comparison can be shown graphically by replacing report with graph in the code above.

## Contributions and Feedback

Contributions are welcome! Please see the TODO.md file or the existing issues if you want to pick up something to work on.

Any feedback on improvements or the direction of the package is welcome. You can always send an email to the maintainer or raise an issue for anything you want to suggest or discuss, or send a PR for any change that you would like to make.