bench-show-0.2.2: Show, plot and compare benchmark results

Copyright    (c) 2017-18 Composewell Technologies
License      BSD3
Maintainer   harendra.kumar@gmail.com
Stability    experimental
Portability  GHC
Safe Haskell None
Language     Haskell2010

BenchShow

Description

BenchShow provides a DSL to quickly generate visual graphs or textual reports from a benchmark results file (CSV) produced by gauge or criterion. Reports or graphs can be formatted and presented in many useful ways. For example, we can prepare a graphical bar chart or a column-wise textual report comparing the performance of two packages, or comparing the performance regression in a package caused by a particular change. The absolute or percentage difference between sets of benchmarks can be presented and sorted based on the difference, allowing us to easily identify the worst affected benchmarks and fix them. The presentation is quite flexible and many more interesting things can be done with it.

Generating Graphs and Reports

The input is a CSV file generated by gauge --csv=results.csv or a similar output generated by criterion. The graph or the report function is invoked on the file with an appropriate Config to control various parameters of graph or report generation. In most cases defaultConfig should just do the job and a specific config may not be required.

Fields, Groups and RunIds

In this documentation, field means a benchmarking field, e.g. time or maxrss, and group means a group of benchmarks. An input file may have benchmark results collected from multiple runs. By default each run is designated as a single benchmark group with the group name default. Benchmark groups from different runs are distinguished using a runId, which is the index of the run in the file, starting with 0.

Benchmarks can be classified into multiple groups using classifyBenchmark; benchmarks from each run can be divided into multiple groups. In a multi-run input, a benchmark group is fully specified by its group name (either default or as assigned by classifyBenchmark) and its runId.
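
As an illustration, a classifier could split benchmark names of the form group/benchmark into a group name and a translated benchmark name. The group/benchmark naming scheme below is an assumption for the sketch, not something the library mandates:

```haskell
-- Split a name like "streamly/filter" into ("streamly", "filter").
-- Benchmarks whose names do not contain a '/' are dropped (Nothing).
classify :: String -> Maybe (String, String)
classify name =
    case break (== '/') name of
        (grp, '/' : bench) | not (null grp), not (null bench) ->
            Just (grp, bench)
        _ -> Nothing
```

Such a function can be supplied as the classifyBenchmark field of Config to group and rename benchmarks.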

Presentation

We can present the results in a textual format using report or as a graphical chart using graph. Each report consists of a number of benchmarks as rows and the columns can either be benchmarking fields or groups of benchmarks depending on the Presentation setting. In a graphical chart, we present multiple clusters, each cluster representing one column from the textual report, the rows (i.e. the benchmarks) are represented as bars in the cluster.

When the columns are groups, each report consists of results for a single benchmarking field for different benchmark groups. Using GroupStyle, we can further specify how we want to present the results of the groups. We can either present absolute values of the field for each group, or use the first group as a baseline and present differences from the baseline for the subsequent groups.
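
For example, to report the values of subsequent groups as percentage differences from the first (baseline) group, a config could be sketched as:

```haskell
cfg :: Config
cfg = defaultConfig { presentation = Groups PercentDiff }
```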

When the columns are fields, each report consists of results for a single benchmarking group. Fields cannot be compared like groups because they are of different types and have different measurement units.

The units in the report are automatically determined based on the minimum value in the range of values present. The ranges for fields can be overridden using fieldRanges.

Mean and Max

In a raw benchmark file (--csvraw=results.csv with gauge) we may have data for multiple iterations of each benchmark. BenchShow combines results of all iterations depending on the field type. For example if the field is time it takes the mean of all iterations and if the field is maxrss it takes the maximum of all iterations.

Tutorial and Examples

See the tutorial module BenchShow.Tutorial for sample charts and a comprehensive guide to generating reports and graphs. See the test directory for many usage examples; run the tests to see the charts they generate.

Synopsis

Documentation

data GroupStyle Source #

How to show the results for multiple benchmark groups presented in columns or bar chart clusters.

Since: 0.2.0

Constructors

Absolute

Show absolute field values for all groups

Diff

Show baseline group values as usual and values for the subsequent groups as differences from the baseline

Percent

Show baseline group values as 100% and values for subsequent groups as a percentage of the baseline

PercentDiff

Show baseline group values as usual and values for subsequent groups as a percentage difference from the baseline

Instances
Eq GroupStyle Source # 
Instance details

Defined in BenchShow.Common

Show GroupStyle Source # 
Instance details

Defined in BenchShow.Common

data Presentation Source #

How to present the reports or graphs. Each report presents a number of benchmarks as rows; it may have (1) a single column presenting the values for a single field, (2) multiple columns presenting values for different fields, or (3) multiple columns presenting values of the same field for different groups.

Since: 0.2.0

Constructors

Solo

Reports are generated for each group and for each field selected by the configuration. Each report presents benchmarks in a single group with a single column presenting a single field. If there are m fields and n groups selected by the configuration then a total of m x n reports are generated. Output files are named using -estimator-groupname-fieldname as suffix.

Groups GroupStyle

One report is generated for each field selected by the configuration. Each report presents a field with all the groups selected by the configuration as columns or clusters. Output files are named using -estimator-fieldname as suffix.

Fields

One report is generated for each group selected by the configuration. Each report presents a group with all the fields selected by the configuration as columns or clusters. Output files are named using -estimator-groupname as suffix.
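
The three presentations above can be selected through the presentation field of Config; for instance (a sketch using record update on defaultConfig):

```haskell
-- One report per (group, field) pair, a single value column each:
soloCfg :: Config
soloCfg = defaultConfig { presentation = Solo }

-- One report per field, with the selected groups as columns:
groupsCfg :: Config
groupsCfg = defaultConfig { presentation = Groups Absolute }

-- One report per group, with the selected fields as columns:
fieldsCfg :: Config
fieldsCfg = defaultConfig { presentation = Fields }
```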

Instances
Eq Presentation Source # 
Instance details

Defined in BenchShow.Common

Show Presentation Source # 
Instance details

Defined in BenchShow.Common

data TitleAnnotation Source #

Additional annotations that can be optionally added to the title of the report or graph.

Since: 0.2.2

data Estimator Source #

The statistical estimator used to arrive at a single value for a benchmark when samples from multiple experiments are available.

Since: 0.2.0

Constructors

Median

Report the median, outliers and outlier variance using the box-plot method. This is the most robust indicator with respect to outliers when successive runs of benchmarks are compared.

Mean

Report the mean and the standard deviation from the mean. This is less robust than median but more precise.

Regression

Report the coefficient of regression, discarding the constant factor, arrived at by linear regression using the ordinary least squares method. The R-square goodness-of-fit estimate is also reported. It works better when a larger number of samples is taken. It cannot be used when the number of samples is less than 2; in that case the mean value is reported instead.
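
For example, when many samples per benchmark are available, the regression estimator can be selected; turning verbose on as well includes the R-square estimate in the report (a sketch):

```haskell
cfg :: Config
cfg = defaultConfig { estimator = Regression, verbose = True }
```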

Instances
Eq Estimator Source # 
Instance details

Defined in BenchShow.Analysis

Show Estimator Source # 
Instance details

Defined in BenchShow.Analysis

data DiffStrategy Source #

Strategy to compute the difference between two groups of benchmarks being compared.

Since: 0.2.0

Constructors

SingleEstimator

Use a single estimator to compute the difference between the baseline and the candidate. The estimator that is provided in the Config is used.

MinEstimator

Use Mean, Median and Regression estimators for both baseline and candidate, and report the estimator that shows the minimum difference. This is more robust against random variations.
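
A comparison config combining MinEstimator with a percentage-difference presentation might be sketched as:

```haskell
-- Compare groups using whichever estimator shows the minimum
-- difference, making the comparison more robust against noise.
cfg :: Config
cfg = defaultConfig
    { presentation = Groups PercentDiff
    , diffStrategy = MinEstimator
    }
```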

data SortColumn Source #

When sorting and filtering the benchmarks using selectBenchmarks we can choose a column as a sort criterion. selectBenchmarks is provided with the data for the corresponding column which can be used for sorting the benchmarks. The column could be a group or a field depending on the Presentation.

Since: 0.2.0

Constructors

ColumnIndex Int

Specify the index of the sort column. Index 0 corresponds to the first value column. In a textual report, the very first column consists of benchmark names, therefore index 0 addresses the second column of the report.

ColumnName (Either String (String, Int))

Specify the column using the name of the group or the field it represents, and the runId. When just the name is enough to uniquely identify the sort column the Left constructor can be used, otherwise the Right constructor is used which can use the runId to disambiguate. In a Fields presentation, just the field name is enough. In a Groups presentation, when there is a single benchmark run in the input file, just the group name is enough to identify the group, the runId defaults to 0. However, when there are multiple runs, a group needs to specify a runId as well.
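
For instance, to order benchmarks by decreasing value of a column named "time", selectBenchmarks could be sketched as follows (assuming a single-run input so that the group or field name alone identifies the column):

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Sort benchmarks by the "time" column, largest first.
cfg :: Config
cfg = defaultConfig
    { selectBenchmarks = \f ->
          either error (map fst . sortOn (Down . snd))
              $ f (ColumnName (Left "time"))
    }
```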

data FieldTick Source #

FieldTick is used only in visual charts to generate the major ticks on the y-axis. You can specify either the size of a tick (TickSize) or the total number of ticks (TickCount).

Since: 0.2.0

Constructors

TickSize Int

Size of a tick, the unit is microseconds for time fields, and bytes for space fields.

TickCount Int

Total number of ticks in the range spread.
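
For example, ticks can be configured per field through the fieldTicks field of Config (the specific values below are illustrative):

```haskell
cfg :: Config
cfg = defaultConfig
    { fieldTicks = [ ("time",   TickCount 10)      -- 10 ticks on the y-axis
                   , ("maxrss", TickSize 1048576)  -- 1 MiB per tick
                   ]
    }
```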

data Config Source #

Configuration governing generation of chart. See defaultConfig for the default values of these fields.

Since: 0.2.0

Constructors

Config 

Fields

  • verbose :: Bool

Provide more details in the report; in particular, the standard deviation, outlier variance, R-square estimate, and an annotation indicating the actual estimator used (when using MinEstimator) are reported.

  • outputDir :: Maybe FilePath

    The directory where the output graph or report file should be placed.

  • title :: Maybe String

    Report title, more information like the plotted field name or the presentation style may be added to it.

  • titleAnnotations :: [TitleAnnotation]

    Additional annotations to be added to the title

  • presentation :: Presentation

    How to determine the layout of the report or the chart.

  • estimator :: Estimator

    The estimator used for the report.

  • threshold :: Word

    The minimum percentage difference between two runs of a benchmark beyond which the benchmark is flagged to have regressed or improved.

  • diffStrategy :: DiffStrategy

    Strategy to compare two runs or groups of benchmarks.

  • selectFields :: [String] -> [String]

Filter and reorder the benchmarking fields. It is invoked with a list of all available benchmarking fields. Only the fields present in the output of this function are plotted, in that order.

  • fieldRanges :: [(String, Double, Double)]

    The values in the tuple are (fieldName, RangeMin, RangeMax). Specify the min and max range of benchmarking fields. If the field value is outside the range it is clipped to the range limit. For time fields, the range values are in microseconds, and for space fields they are in bytes. The minimum of the range is used to determine the unit for the field.

  • fieldTicks :: [(String, FieldTick)]

    The values in the tuple are (fieldName, tick). Specify the tick size of the fields to be used for the graphical reports.

  • classifyBenchmark :: String -> Maybe (String, String)

    Filter, group and translate benchmark names. This function is invoked once for all benchmark names found in the results. It produces a tuple (groupname, benchname), where groupname is the name of the group the benchmark should be placed in, and benchname is the translated benchmark name to be used in the report. If it returns Nothing for a benchmark, that benchmark is omitted from the results.

  • selectGroups :: [(String, Int)] -> [(String, Int)]

    Filter and reorder the benchmark group names. A benchmark group may be assigned using classifyBenchmark; when not assigned, all benchmarks are placed in the default group. The input to this function is a list of tuples with benchmark group names and the runIds. The output produced by this function is a filtered and reordered subset of the input. Only those benchmark groups present in the output are rendered and are presented in that order.

  • selectBenchmarks :: (SortColumn -> Either String [(String, Double)]) -> [String]

    Filter and reorder benchmarks. selectBenchmarks is provided with a function which is invoked with a sorting column name or index, the function produces the benchmark names and corresponding values for that column which can be used as a sorting criterion. The output of selectBenchmarks is a list of benchmarks in the order in which they are to be rendered.
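
As a sketch combining several of the fields above (the "before"/"after" group names and the group/benchmark naming scheme in the CSV are assumptions for illustration, not part of the library):

```haskell
import Data.Char (toLower)

cfg :: Config
cfg = defaultConfig
    { title        = Just "Package comparison"
    , presentation = Groups PercentDiff
      -- plot only the time field
    , selectFields = filter ((== "time") . map toLower)
      -- assume names of the form "group/benchmark"
    , classifyBenchmark = \name ->
          case break (== '/') name of
              (grp, '/' : bench) -> Just (grp, bench)
              _                  -> Nothing
      -- keep only these two groups, in this order
    , selectGroups = filter (\(g, _) -> g `elem` ["before", "after"])
    }
```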

defaultConfig :: Config Source #

Default configuration. Use this as the base configuration and modify the required fields. The defaults are:

 verbose           = False
 title             = Nothing
 titleAnnotations  = [TitleField]
 outputDir         = Nothing
 presentation      = Groups Absolute
 estimator         = Median
 threshold         = 3
 diffStrategy      = MinEstimator
 selectFields      = filter (flip elem ["time", "mean", "maxrss"] . map toLower)
 fieldRanges       = []
 fieldTicks        = []
 classifyBenchmark = Just . ("default",)
 selectGroups      = id
 selectBenchmarks  = \f -> either error (map fst) $ f (ColumnIndex 0)

Since: 0.2.0

report :: FilePath -> Maybe FilePath -> Config -> IO () Source #

Presents the benchmark results in a CSV input file as textual reports according to the provided configuration. The first parameter is the input file name. The second parameter, when specified using Just, is the name prefix for the output text report file(s); one or more output files may be generated with the given prefix depending on the Presentation setting. When the second parameter is Nothing the reports are printed on the console. The last parameter is the configuration to customize the report; you can start with defaultConfig as the base and override any of the fields that you may want to change.

For example:

report "bench-results.csv" Nothing defaultConfig

Since: 0.2.0

graph :: FilePath -> FilePath -> Config -> IO () Source #

Presents the benchmark results in a CSV input file as graphical bar charts according to the provided configuration. The first parameter is the input file name, the second parameter is the name prefix for the output SVG image file(s). One or more output files may be generated depending on the Presentation setting. The last parameter is the configuration to customize the graph; you can start with defaultConfig as the base and override any of the fields that you may want to change.

For example:

graph "bench-results.csv" "output-graph" defaultConfig

Since: 0.2.0