bench-show-0.2.2: Show, plot and compare benchmark results

BenchShow

Description

BenchShow provides a DSL to quickly generate visual graphs or textual reports from a benchmark results file (CSV) produced by gauge or criterion. Reports and graphs can be formatted and presented in many useful ways. For example, we can prepare a graphical bar chart or a column-wise textual report comparing the performance of two packages, or comparing the performance regression in a package caused by a particular change. The absolute or percentage difference between sets of benchmarks can be presented and sorted based on the difference, which makes it easy to identify the worst affected benchmarks and fix them. The presentation is quite flexible and a lot more interesting things can be done with it.

# Generating Graphs and Reports

The input is a CSV file generated by gauge --csv=results.csv, or a similar output generated by criterion. The graph or report function is invoked on the file with an appropriate Config to control various parameters of graph or report generation. In most cases defaultConfig should do the job and a specific config may not be required.

# Fields, Groups and RunIds

In this documentation, field means a benchmarking field, e.g. time or maxrss, and group means a group of benchmarks. An input file may contain benchmark results collected from multiple runs. By default, each run is designated as a single benchmark group with the group name default. Benchmark groups from different runs are distinguished using a runId, which is the index of the run in the file, starting at 0.

Benchmarks can be classified into multiple groups using classifyBenchmark; benchmarks from each run can be divided into multiple groups. In a multi-run input, a benchmark group is fully specified by its group name (either default or as assigned by classifyBenchmark) together with the runId.
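As a sketch of how such a classifier might look, the pure function below splits benchmark names of the form "group/benchmark" into a group name and a translated benchmark name. The name format is an assumption for illustration; the function would be plugged in as the classifyBenchmark field of a Config.

```haskell
-- Hypothetical classifier: maps a name like "pkgA/foldl" to the
-- group "pkgA" with the translated benchmark name "foldl".
-- Names without a '/' separator are dropped from the results.
classify :: String -> Maybe (String, String)
classify name =
    case break (== '/') name of
        (grp, '/' : bench) -> Just (grp, bench)
        _                  -> Nothing
```

Benchmarks for which this returns Nothing would be omitted from the report.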

# Presentation

We can present the results in a textual format using report, or as a graphical chart using graph. Each report presents a number of benchmarks as rows; the columns are either benchmarking fields or groups of benchmarks, depending on the Presentation setting. In a graphical chart we present multiple clusters, each cluster representing one column from the textual report; the rows (i.e. the benchmarks) are represented as bars in the cluster.

When the columns are groups, each report consists of the results of a single benchmarking field for different benchmark groups. Using GroupStyle, we can further specify how we want to present the results for the groups. We can either present the absolute values of the field for each group, or we can treat the first group as a baseline and present the differences from the baseline for the subsequent groups.

When the columns are fields, each report consists of results for a single benchmarking group. Fields cannot be compared like groups because they are of different types and have different measurement units.

The units in the report are automatically determined based on the minimum value in the range of values present. The ranges for fields can be overridden using fieldRanges.
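For example, a config overriding the range of the time field might be sketched as follows (the range values are illustrative; time ranges are in microseconds):

```haskell
import BenchShow

-- Clip the "time" field to the range 0..1 s (values in microseconds).
-- The range minimum also determines the unit used in the report.
cfg :: Config
cfg = defaultConfig
    { fieldRanges = [("time", 0, 1000000)]
    }
```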

# Mean and Max

In a raw benchmark file (--csvraw=results.csv with gauge) we may have data for multiple iterations of each benchmark. BenchShow combines the results of all iterations depending on the field type. For example, if the field is time it takes the mean of all iterations, and if the field is maxrss it takes the maximum across all iterations.

# Tutorial and Examples

See the tutorial module BenchShow.Tutorial for sample charts and a comprehensive guide to generating reports and graphs. See the test directory for many usage examples; run the tests to see the charts generated by these tests.

# Documentation

data GroupStyle Source #

How to show the results for multiple benchmark groups presented in columns or bar chart clusters.

Since: 0.2.0

Constructors

 Absolute
    Show absolute field values for all groups.

 Diff
    Show the baseline group values as usual, and the values for the subsequent groups as differences from the baseline.

 Percent
    Show the baseline group values as 100%, and the values for the subsequent groups as a percentage of the baseline.

 PercentDiff
    Show the baseline group values as usual, and the values for the subsequent groups as a percentage difference from the baseline.
Instances

 Show GroupStyle and other instances, defined in BenchShow.Common.

data Presentation Source #

How to present the reports or graphs. Each report presents a number of benchmarks as rows; it may have (1) a single column presenting the values of a single field, (2) multiple columns presenting the values of different fields, or (3) multiple columns presenting the values of the same field for different groups.

Since: 0.2.0

Constructors

 Solo
    Reports are generated for each group and for each field selected by the configuration. Each report presents benchmarks in a single group with a single column presenting a single field. If there are m fields and n groups selected by the configuration then a total of m x n reports are generated. Output files are named using -estimator-groupname-fieldname as the suffix.

 Groups GroupStyle
    One report is generated for each field selected by the configuration. Each report presents a field with all the groups selected by the configuration as columns or clusters. Output files are named using -estimator-fieldname as the suffix.

 Fields
    One report is generated for each group selected by the configuration. Each report presents a group with all the fields selected by the configuration as columns or clusters. Output files are named using -estimator-groupname as the suffix.
Instances

 Show Presentation and other instances, defined in BenchShow.Common.

data TitleAnnotation Source #

Additional annotations that can optionally be added to the title of the report or graph.

Since: 0.2.2

Constructors

 TitleField

 TitleEstimator

 TitleDiff
Instances

 Show TitleAnnotation and other instances, defined in BenchShow.Common.

data Estimator Source #

The statistical estimator used to arrive at a single value for a benchmark when samples from multiple experiments are available.

Since: 0.2.0

Constructors

 Median
    Report the median, outliers and outlier variance using the box-plot method. This is the most robust indicator with respect to outliers when successive runs of benchmarks are compared.

 Mean
    Report the mean and the standard deviation from the mean. This is less robust than the median but more precise.

 Regression
    Report the coefficient of regression, discarding the constant factor, arrived at by linear regression using the ordinary least squares method. The R-square goodness-of-fit estimate is also reported. It works better when a larger number of samples is taken. It cannot be used when the number of samples is less than 2; in that case a mean value is reported instead.
Instances

 Show Estimator and other instances, defined in BenchShow.Analysis.

data DiffStrategy Source #

Strategy to compute the difference between two groups of benchmarks being compared.

Since: 0.2.0

Constructors

 SingleEstimator
    Use a single estimator to compute the difference between the baseline and the candidate. The estimator provided in the Config is used.

 MinEstimator
    Use the Mean, Median and Regression estimators for both the baseline and the candidate, and report the estimator that shows the minimum difference. This is more robust against random variations.

data SortColumn Source #

When sorting and filtering the benchmarks using selectBenchmarks, we can choose a column as the sort criterion. selectBenchmarks is provided with the data for the corresponding column, which can be used for sorting the benchmarks. The column could be a group or a field depending on the Presentation.

Since: 0.2.0

Constructors

 ColumnIndex Int
    Specify the index of the sort column. Index 0 corresponds to the first value column. In a textual report the very first column consists of benchmark names, therefore index 0 addresses the second column of the report.

 ColumnName (Either String (String, Int))
    Specify the column using the name of the group or the field it represents, and the runId. When the name alone is enough to uniquely identify the sort column, the Left constructor can be used; otherwise the Right constructor is used, which can use the runId to disambiguate. In a Fields presentation, just the field name is enough. In a Groups presentation, when there is a single benchmark run in the input file, just the group name is enough to identify the group and the runId defaults to 0. However, when there are multiple runs, a group needs to specify a runId as well.
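As a sketch of how a SortColumn is used, a selectBenchmarks function that orders benchmarks by descending value of the first value column might look like this (the descending order and the error handling are illustrative choices; SortColumn and ColumnIndex are assumed to be in scope from BenchShow):

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Sort benchmark names by descending value of the first value column.
-- The supplied function returns the (name, value) pairs for a column.
selectByFirstColumn
    :: (SortColumn -> Either String [(String, Double)]) -> [String]
selectByFirstColumn f =
    case f (ColumnIndex 0) of
        Left err   -> error err
        Right vals -> map fst $ sortBy (flip (comparing snd)) vals
```

This function would be set as the selectBenchmarks field of a Config.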

data FieldTick Source #

FieldTick is used only in visual charts to generate the major ticks on the y-axis. You can specify either the size of a tick (TickSize) or the total number of ticks (TickCount).

Since: 0.2.0

Constructors

 TickSize Int
    Size of a tick; the unit is microseconds for time fields and bytes for space fields.

 TickCount Int
    Total number of ticks in the range spread.
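For example, a config requesting ten major ticks on the y-axis for the time field might be sketched as follows (the tick count is an arbitrary illustration):

```haskell
import BenchShow

-- Draw ten major y-axis ticks for the "time" field in graphical reports.
cfg :: Config
cfg = defaultConfig
    { fieldTicks = [("time", TickCount 10)]
    }
```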

data Config Source #

Configuration governing the generation of charts and reports. See defaultConfig for the default values of these fields.

Since: 0.2.0

Constructors

 Config

Fields

 verbose :: Bool
    Provide more details in the report; in particular, the standard deviation, outlier variance, R-square estimate, and an annotation indicating the actual estimator used (when MinEstimator is in effect) are reported.

 outputDir :: Maybe FilePath
    The directory where the output graph or report file should be placed.

 title :: Maybe String
    Report title; more information like the plotted field name or the presentation style may be added to it.

 titleAnnotations :: [TitleAnnotation]
    Additional annotations to be added to the title.

 presentation :: Presentation
    How to determine the layout of the report or the chart.

 estimator :: Estimator
    The estimator used for the report.

 threshold :: Word
    The minimum percentage difference between two runs of a benchmark beyond which the benchmark is flagged as having regressed or improved.

 diffStrategy :: DiffStrategy
    Strategy to compare two runs or groups of benchmarks.

 selectFields :: [String] -> [String]
    Filter and reorder the benchmarking fields. It is invoked with a list of all available benchmarking fields. Only those fields present in the output of this function are plotted, and in that order.

 fieldRanges :: [(String, Double, Double)]
    The values in the tuple are (fieldName, rangeMin, rangeMax). Specify the min and max range of benchmarking fields. If a field value is outside the range it is clipped to the range limit. For time fields the range values are in microseconds, and for space fields they are in bytes. The minimum of the range is used to determine the unit for the field.

 fieldTicks :: [(String, FieldTick)]
    The values in the tuple are (fieldName, tick). Specify the tick size of the fields to be used for the graphical reports.

 classifyBenchmark :: String -> Maybe (String, String)
    Filter, group and translate benchmark names. This function is invoked once for each benchmark name found in the results. It produces a tuple (groupName, benchName), where groupName is the name of the group the benchmark should be placed in and benchName is the translated benchmark name to be used in the report. If it returns Nothing for a benchmark, that benchmark is omitted from the results.

 selectGroups :: [(String, Int)] -> [(String, Int)]
    Filter and reorder the benchmark group names. A benchmark group may be assigned using classifyBenchmark; when not assigned, all benchmarks are placed in the default group. The input to this function is a list of tuples with benchmark group names and runIds. The output produced by this function is a filtered and reordered subset of the input. Only those benchmark groups present in the output are rendered, and they are presented in that order.

 selectBenchmarks :: (SortColumn -> Either String [(String, Double)]) -> [String]
    Filter and reorder benchmarks. selectBenchmarks is provided with a function which is invoked with a sorting column name or index; the function produces the benchmark names and the corresponding values for that column, which can be used as a sorting criterion. The output of selectBenchmarks is a list of benchmarks in the order in which they are to be rendered.
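Putting a few of these fields together, a configuration comparing benchmark groups as percentage differences might be sketched as follows (the title and the choice of fields are illustrative):

```haskell
import BenchShow

-- Columns are benchmark groups, shown as percentage difference from
-- the baseline (first) group; only the "time" field is plotted.
cfg :: Config
cfg = defaultConfig
    { title        = Just "Comparing releases"
    , presentation = Groups PercentDiff
    , estimator    = Median
    , selectFields = filter (== "time")
    }
```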

defaultConfig :: Config Source #

The default configuration. Use this as the base configuration and modify the required fields. The defaults are:

 verbose           = False
title             = Nothing
titleAnnotations  = [TitleField]
outputDir         = Nothing
presentation      = Groups Absolute
estimator         = Median
threshold         = 3
diffStrategy      = MinEstimator
selectFields      = filter (flip elem ["time", "mean", "maxrss"] . map toLower)
fieldRanges       = []
fieldTicks        = []
classifyBenchmark = Just . ("default",)
selectGroups      = id
selectBenchmarks  = \f -> either error (map fst) $ f (ColumnIndex 0)


Since: 0.2.0

report :: FilePath -> Maybe FilePath -> Config -> IO () Source #

Presents the benchmark results in a CSV input file as textual reports according to the provided configuration. The first parameter is the input file name. The second parameter, when specified using Just, is the name prefix for the output report file(s). One or more output files may be generated with the given prefix depending on the Presentation setting. When the second parameter is Nothing the reports are printed on the console. The last parameter is the configuration to customize the report; you can start with defaultConfig as the base and override any of the fields that you may want to change.

For example:

report "bench-results.csv" Nothing defaultConfig


Since: 0.2.0

graph :: FilePath -> FilePath -> Config -> IO () Source #

Presents the benchmark results in a CSV input file as graphical bar charts according to the provided configuration. The first parameter is the input file name; the second parameter is the name prefix for the output SVG image file(s). One or more output files may be generated depending on the Presentation setting. The last parameter is the configuration to customize the graph; you can start with defaultConfig as the base and override any of the fields that you may want to change.

For example:

graph "bench-results.csv" "output-graph" defaultConfig


Since: 0.2.0