tasty-papi-0.1.2.0: Bencmarking using instruction counting
Copyright(c) 2023 Alexey Khudyakov
LicenseBSD3
Safe HaskellNone
LanguageHaskell2010

Test.Tasty.PAPI

Description

Benchmark framework which uses CPU instruction counting instead of time measurement. This approach is much more deterministic and not subject to variation caused by concurrent execution of other programs.

Hardware counters are accessedusing PAPI. Thus OS and hardware support inherited from that library.

How to use

Library uses standard approach for benchmarks. So example benchmarks looks similar to one using criterion, gauge or tasty-bench:

module Main where
import Test.Tasty.PAPI

main :: IO ()
main = defaultMain
  [ bench "6" $ whnf fib 6
  , bench "7" $ whnf fib 7
  , bench "8" $ whnf fib 8
  ]

fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

Its output is:

All
  6:                OK
    ALLOC=528  TOT_INS=4768    TOT_CYC=6114    BR_INS=1128     BR_MSP=74
  7:                OK
    ALLOC=864  TOT_INS=7431    TOT_CYC=6631    BR_INS=1744     BR_MSP=70
  8:                OK
    ALLOC=1408 TOT_INS=11.75e3 TOT_CYC=8540    BR_INS=2743     BR_MSP=93

Command line optiosn

Use --help to list command-line options. Below is list of options provided by this package:

--csv PATH

Write benchmark results into file in CSV format.

--counters COUNTER_SET

Adjust set of hardware counters to use. Refer to Counter for documentation on supported counters. By default TOT_INS, TOT_CYC, BR_INS, BR_MSP are measured. --counters INT_INS,FP_INS will *add* INT_INS and FP_INS to default set. --counters =INT_INS,FP_INS will use only list of provided counters. Note that counter may or may not be supported on you CPU and there could be limit on number of counters used simultaneously.

Synopsis

Running benchmarks

defaultMain :: [TestTree] -> IO () Source #

Run benchmark suite. It provides API compatible with criterion and gauge.

Since: 0.1

type Benchmark = TestTree Source #

Just a TestTree. This type synonym is provided for source compatibility with criterion and gauge.

Since: 0.1

newtype Benchmarkable Source #

IO action which could be benchmarked. It's created by whnf, nf, whnfIO, nfIO.

Since: 0.1

Constructors

Benchmarkable (IO ()) 

Instances

Instances details
IsTest Benchmarkable Source # 
Instance details

Defined in Test.Tasty.PAPI

bench :: String -> Benchmarkable -> TestTree Source #

Create single benchmark. This is just a monomorphization of singleTest which provides API compatibility with criterion and gauge.

Since: 0.1

bgroup :: String -> [Benchmark] -> Benchmark Source #

Create single benchmark. This is just a testGroup and it exists to provide API compatibility with criterion and gauge.

Since: 0.1

Creation of Benchmarkable

nf :: NFData b => (a -> b) -> a -> Benchmarkable Source #

nf f x measures number of instructions needed to compute normal form of and application of f to x.

Since: 0.1

whnf :: (a -> b) -> a -> Benchmarkable Source #

nf f x measures number of instructions needed to compute weak head normal form of and application of f to x.

Since: 0.1

nfIO :: NFData a => IO a -> Benchmarkable Source #

nfIO a measures number of instructions needed to evaluate IO action and reduce value returned by it to normal form.

Since: 0.1

whnfIO :: IO a -> Benchmarkable Source #

whnfIO a measures number of instructions needed to evaluate IO action and reduce value returned by it to weak head normal form.

Since: 0.1

Ingredients

benchIngredients :: [Ingredient] Source #

Standard set of ingredients which are used by defaultMain

Since: 0.1.2.0

consoleBenchReporter :: Ingredient Source #

Reporter which prints results on benchmarks to stdout.

Since: 0.1.2.0

csvReporter :: Ingredient Source #

Run benchmarks and save results in CSV format. It activates when --csv FILE command line option is specified.

Since: 0.1.2.0

Data types

data Counter Source #

Supported hardware counters

Documentation is taken from rather outdated manual: https://icl.utk.edu/projects/papi/files/documentation/PAPI_USER_GUIDE_23.htm

Constructors

L1_DCM

Level 1 data cache misses

L1_ICM

Level 1 instruction cache misses

L2_DCM

Level 2 data cache misses

L2_ICM

Level 2 instruction cache misses

L3_DCM

Level 3 data cache misses

L3_ICM

Level 3 instruction cache misses

L1_TCM

Level 1 total cache misses

L2_TCM

Level 2 total cache misses

L3_TCM

Level 3 total cache misses

CA_SNP

Requests for a Snoop

CA_SHR

Requests for access to shared cache line (SMP)

CA_CLN

Requests for access to clean cache line (SMP)

CA_INV

Cache Line Invalidation (SMP)

CA_ITV

Cache Line Intervention (SMP)

L3_LDM

Level 3 load misses

L3_STM

Level 3 store misses

BRU_IDL

Cycles branch units are idle

FXU_IDL

Cycles integer units are idle

FPU_IDL

Cycles floating point units are idle

LSU_IDL

Cycles load/store units are idle

TLB_DM

Data translation lookaside buffer misses

TLB_IM

Instruction translation lookaside buffer misses

TLB_TL

Total translation lookaside buffer misses

L1_LDM

Level 1 load misses

L1_STM

Level 1 store misses

L2_LDM

Level 2 load misses

L2_STM

Level 2 store misses

BTAC_M

Branch target address cache (BTAC) misses

PRF_DM

Pre-fetch data instruction caused a miss

L3_DCH

Level 3 Data Cache Hit

TLB_SD

Translation lookaside buffer shootdowns (SMP)

CSR_FAL

Failed store conditional instructions

CSR_SUC

Successful store conditional instructions

CSR_TOT

Total store conditional instructions

MEM_SCY

Cycles Stalled Waiting for Memory Access

MEM_RCY

Cycles Stalled Waiting for Memory Read

MEM_WCY

Cycles Stalled Waiting for Memory Write

STL_ICY

Cycles with No Instruction Issue

FUL_ICY

Cycles with Maximum Instruction Issue

STL_CCY

Cycles with No Instruction Completion

FUL_CCY

Cycles with Maximum Instruction Completion

HW_INT

Hardware interrupts

BR_UCN

Unconditional branch instructions executed

BR_CN

Conditional branch instructions executed

BR_TKN

Conditional branch instructions taken

BR_NTK

Conditional branch instructions not taken

BR_MSP

Conditional branch instructions mispredicted

BR_PRC

Conditional branch instructions correctly predicted

FMA_INS

FMA instructions completed

TOT_IIS

Total instructions issued

TOT_INS

Total instructions executed

INT_INS

Integer instructions executed

FP_INS

Floating point instructions executed

LD_INS

Load instructions executed

SR_INS

Store instructions executed

BR_INS

Total branch instructions executed

VEC_INS

Vector/SIMD instructions executed

RES_STL

Cycles processor is stalled on resource

FP_STAL

Cycles any FP units are stalled

TOT_CYC

Total cycles

LST_INS

Total load/store instructions executed

SYC_INS

Synchronization instructions executed

L1_DCH

L1 data cache hits

L2_DCH

L2 data cache hits

L1_DCA

L1 data cache accesses

L2_DCA

L2 data cache accesses

L3_DCA

L3 data cache accesses

L1_DCR

L1 data cache reads

L2_DCR

L2 data cache reads

L3_DCR

L3 data cache reads

L1_DCW

L1 data cache writes

L2_DCW

L2 data cache writes

L3_DCW

L3 data cache writes

L1_ICH

L1 instruction cache hits

L2_ICH

L2 instruction cache hits

L3_ICH

L3 instruction cache hits

L1_ICA

L1 instruction cache accesses

L2_ICA

L2 instruction cache accesses

L3_ICA

L3 instruction cache accesses

L1_ICR

L1 instruction cache reads

L2_ICR

L2 instruction cache reads

L3_ICR

L3 instruction cache reads

L1_ICW

L1 instruction cache writes

L2_ICW

L2 instruction cache writes

L3_ICW

L3 instruction cache writes

L1_TCH

L1 total cache hits

L2_TCH

L2 total cache hits

L3_TCH

L3 total cache hits

L1_TCA

L1 total cache accesses

L2_TCA

L2 total cache accesses

L3_TCA

L3 total cache accesses

L1_TCR

L1 total cache reads

L2_TCR

L2 total cache reads

L3_TCR

L3 total cache reads

L1_TCW

L1 total cache writes

L2_TCW

L2 total cache writes

L3_TCW

L3 total cache writes

FML_INS

Floating Multiply instructions

FAD_INS

Floating Add instructions

FDV_INS

Floating Divide instructions

FSQ_INS

Floating Square Root instructions

FNV_INS

Floating Inverse instructions

Instances

Instances details
Read Counter Source # 
Instance details

Defined in Test.Tasty.PAPI

Show Counter Source # 
Instance details

Defined in Test.Tasty.PAPI

Eq Counter Source # 
Instance details

Defined in Test.Tasty.PAPI

Methods

(==) :: Counter -> Counter -> Bool #

(/=) :: Counter -> Counter -> Bool #

Ord Counter Source # 
Instance details

Defined in Test.Tasty.PAPI