Copyright | (c) 2016 Michael Walker |
---|---|

License | MIT |

Maintainer | Michael Walker <mike@barrucadu.co.uk> |

Stability | experimental |

Portability | RankNTypes |

Safe Haskell | None |

Language | Haskell2010 |

Deterministic testing for concurrent computations.

As an example, consider this program, which has two locks and a shared variable. Two threads are spawned, which claim the locks, update the shared variable, and release the locks. The main thread waits for them both to terminate, and returns the final result.

example1 :: MonadConc m => m Int example1 = do a <- newEmptyMVar b <- newEmptyMVar c <- newMVar 0 let lock m = putMVar m () let unlock = takeMVar j1 <- spawn $ lock a >> lock b >> modifyMVar_ c (return . succ) >> unlock b >> unlock a j2 <- spawn $ lock b >> lock a >> modifyMVar_ c (return . pred) >> unlock a >> unlock b takeMVar j1 takeMVar j2 takeMVar c

The correct result is 0, as it starts out as 0 and is incremented
and decremented by threads 1 and 2, respectively. However, note the
order of acquisition of the locks in the two threads. If thread 2
pre-empts thread 1 between the acquisition of the locks (or if
thread 1 pre-empts thread 2), a deadlock situation will arise, as
thread 1 will have lock `a`

and be waiting on `b`

, and thread 2
will have `b`

and be waiting on `a`

.

Here is what Deja Fu has to say about it:

> autocheck example1 [fail] Never Deadlocks (checked: 5) [deadlock] S0------------S1-P2--S1- [pass] No Exceptions (checked: 12) [fail] Consistent Result (checked: 11) 0 S0------------S2-----------------S1-----------------S0---- [deadlock] S0------------S1-P2--S1- False

It identifies the deadlock, and also the possible results the
computation can produce, and displays a simplified trace leading to
each failing outcome. The trace contains thread numbers, and the
names (which can be set by the programmer) are displayed beneath.
It also returns `False`

as there are test failures. The automatic
testing functionality is good enough if you only want to check your
computation is deterministic, but if you have more specific
requirements (or have some expected and tolerated level of
nondeterminism), you can write tests yourself using the `dejafu*`

functions.

**Warning:** If your computation under test does `IO`

, the `IO`

will be executed lots of times! Be sure that it is deterministic
enough not to invalidate your test results. Mocking may be useful
where possible.

- autocheck :: (Eq a, Show a) => (forall t. ConcST t a) -> IO Bool
- dejafu :: Show a => (forall t. ConcST t a) -> (String, Predicate a) -> IO Bool
- dejafus :: Show a => (forall t. ConcST t a) -> [(String, Predicate a)] -> IO Bool
- autocheckIO :: (Eq a, Show a) => ConcIO a -> IO Bool
- dejafuIO :: Show a => ConcIO a -> (String, Predicate a) -> IO Bool
- dejafusIO :: Show a => ConcIO a -> [(String, Predicate a)] -> IO Bool
- data Way
- defaultWay :: Way
- systematically :: Bounds -> Way
- randomly :: RandomGen g => g -> Int -> Way
- uniformly :: RandomGen g => g -> Int -> Way
- swarmy :: RandomGen g => g -> Int -> Int -> Way
- autocheckWay :: (Eq a, Show a) => Way -> MemType -> (forall t. ConcST t a) -> IO Bool
- autocheckWayIO :: (Eq a, Show a) => Way -> MemType -> ConcIO a -> IO Bool
- dejafuWay :: Show a => Way -> MemType -> (forall t. ConcST t a) -> (String, Predicate a) -> IO Bool
- dejafuWayIO :: Show a => Way -> MemType -> ConcIO a -> (String, Predicate a) -> IO Bool
- dejafusWay :: Show a => Way -> MemType -> (forall t. ConcST t a) -> [(String, Predicate a)] -> IO Bool
- dejafusWayIO :: Show a => Way -> MemType -> ConcIO a -> [(String, Predicate a)] -> IO Bool
- data Discard
- defaultDiscarder :: Either Failure a -> Maybe Discard
- dejafuDiscard :: Show a => (Either Failure a -> Maybe Discard) -> Way -> MemType -> (forall t. ConcST t a) -> (String, Predicate a) -> IO Bool
- dejafuDiscardIO :: Show a => (Either Failure a -> Maybe Discard) -> Way -> MemType -> ConcIO a -> (String, Predicate a) -> IO Bool
- data MemType
- defaultMemType :: MemType
- data Bounds = Bounds {}
- defaultBounds :: Bounds
- noBounds :: Bounds
- newtype PreemptionBound = PreemptionBound Int
- defaultPreemptionBound :: PreemptionBound
- newtype FairBound = FairBound Int
- defaultFairBound :: FairBound
- newtype LengthBound = LengthBound Int
- defaultLengthBound :: LengthBound
- data Result a = Result {
- _pass :: Bool
- _casesChecked :: Int
- _failures :: [(Either Failure a, Trace)]
- _failureMsg :: String

- data Failure
- runTest :: Predicate a -> (forall t. ConcST t a) -> Result a
- runTestWay :: Way -> MemType -> Predicate a -> (forall t. ConcST t a) -> Result a
- runTestM :: MonadRef r n => Predicate a -> ConcT r n a -> n (Result a)
- runTestWayM :: MonadRef r n => Way -> MemType -> Predicate a -> ConcT r n a -> n (Result a)
- type Predicate a = [(Either Failure a, Trace)] -> Result a
- representative :: Eq a => Predicate a -> Predicate a
- abortsNever :: Predicate a
- abortsAlways :: Predicate a
- abortsSometimes :: Predicate a
- deadlocksNever :: Predicate a
- deadlocksAlways :: Predicate a
- deadlocksSometimes :: Predicate a
- exceptionsNever :: Predicate a
- exceptionsAlways :: Predicate a
- exceptionsSometimes :: Predicate a
- alwaysSame :: Eq a => Predicate a
- notAlwaysSame :: Eq a => Predicate a
- alwaysTrue :: (Either Failure a -> Bool) -> Predicate a
- alwaysTrue2 :: (Either Failure a -> Either Failure a -> Bool) -> Predicate a
- somewhereTrue :: (Either Failure a -> Bool) -> Predicate a
- gives :: (Eq a, Show a) => [Either Failure a] -> Predicate a
- gives' :: (Eq a, Show a) => [a] -> Predicate a
- isInternalError :: Failure -> Bool
- isAbort :: Failure -> Bool
- isDeadlock :: Failure -> Bool
- isUncaughtException :: Failure -> Bool
- isIllegalSubconcurrency :: Failure -> Bool
- module Test.DejaFu.Refinement

# Testing

Testing in Deja Fu is similar to unit testing, the programmer produces a self-contained monadic action to execute under different schedules, and supplies a list of predicates to apply to the list of results produced.

If you simply wish to check that something is deterministic, see
the `autocheck`

and `autocheckIO`

functions.

These functions use a Total Store Order (TSO) memory model for unsynchronised actions, see "Testing under Alternative Memory Models" for some explanation of this.

Automatically test a computation. In particular, look for deadlocks, uncaught exceptions, and multiple return values.

This uses the `Conc`

monad for testing, which is an instance of
`MonadConc`

. If you need to test something which also uses
`MonadIO`

, use `autocheckIO`

.

*Since: 0.1.0.0*

:: Show a | |

=> (forall t. ConcST t a) | The computation to test |

-> (String, Predicate a) | The predicate (with a name) to check |

-> IO Bool |

Check a predicate and print the result to stdout, return `True`

if it passes.

*Since: 0.1.0.0*

# Testing with different settings

How to explore the possible executions of a concurrent program.

*Since: 0.7.0.0*

defaultWay :: Way Source #

A default way to execute concurrent programs: systematically
using `defaultBounds`

.

*Since: 0.6.0.0*

Systematically execute a program, trying all distinct executions within the bounds.

This corresponds to `sctBound`

.

*Since: 0.7.0.0*

:: RandomGen g | |

=> g | The random generator to drive the scheduling. |

-> Int | The number of executions to try. |

-> Way |

Randomly execute a program, exploring a fixed number of executions.

Threads are scheduled by a weighted random selection, where weights are assigned randomly on thread creation.

This corresponds to `sctWeightedRandom`

with weight re-use
disabled, and is not guaranteed to find all distinct results
(unlike `systematically`

/ `sctBound`

).

*Since: 0.7.0.0*

:: RandomGen g | |

=> g | The random generator to drive the scheduling. |

-> Int | The number of executions to try. |

-> Way |

Randomly execute a program, exploring a fixed number of executions.

Threads are scheduled by a uniform random selection.

This corresponds to `sctUniformRandom`

, and is not guaranteed to
find all distinct results (unlike `systematically`

/ `sctBound`

).

*Since: 0.7.0.0*

:: RandomGen g | |

=> g | The random generator to drive the scheduling. |

-> Int | The number of executions to try. |

-> Int | The number of executions to use the thread weights for. |

-> Way |

Randomly execute a program, exploring a fixed number of executions.

Threads are scheduled by a weighted random selection, where weights are assigned randomly on thread creation.

This corresponds to `sctWeightedRandom`

, and is not guaranteed to
find all distinct results (unlike `systematically`

/ `sctBound`

).

*Since: 0.7.0.0*

:: (Eq a, Show a) | |

=> Way | How to run the concurrent program. |

-> MemType | The memory model to use for non-synchronised |

-> (forall t. ConcST t a) | The computation to test |

-> IO Bool |

Variant of `autocheck`

which takes a way to run the program and a
memory model.

Schedule bounding is used to filter the large number of possible
schedules, and can be iteratively increased for further coverage
guarantees. Empirical studies (/Concurrency Testing Using Schedule
Bounding: an Empirical Study/, P. Thompson, A. Donaldson, and
A. Betts) have found that many concurrency bugs can be exhibited
with as few as two threads and two pre-emptions, which is part of
what `dejafus`

uses.

**Warning:** Using largers bounds will almost certainly
significantly increase the time taken to test!

*Since: 0.6.0.0*

autocheckWayIO :: (Eq a, Show a) => Way -> MemType -> ConcIO a -> IO Bool Source #

Variant of `autocheckWay`

for computations which do `IO`

.

*Since: 0.6.0.0*

:: Show a | |

=> Way | How to run the concurrent program. |

-> MemType | The memory model to use for non-synchronised |

-> (forall t. ConcST t a) | The computation to test |

-> (String, Predicate a) | The predicate (with a name) to check |

-> IO Bool |

Variant of `dejafu`

which takes a way to run the program and a
memory model.

*Since: 0.6.0.0*

:: Show a | |

=> Way | How to run the concurrent program. |

-> MemType | The memory model to use for non-synchronised |

-> (forall t. ConcST t a) | The computation to test |

-> [(String, Predicate a)] | The list of predicates (with names) to check |

-> IO Bool |

Variant of `dejafus`

which takes a way to run the program and a
memory model.

*Since: 0.6.0.0*

dejafusWayIO :: Show a => Way -> MemType -> ConcIO a -> [(String, Predicate a)] -> IO Bool Source #

Variant of `dejafusWay`

for computations which do `IO`

.

*Since: 0.6.0.0*

An `Either Failure a -> Maybe Discard`

value can be used to
selectively discard results.

*Since: 0.7.1.0*

DiscardTrace | Discard the trace but keep the result. The result will appear to have an empty trace. |

DiscardResultAndTrace | Discard the result and the trace. It will simply not be reported as a possible behaviour of the program. |

defaultDiscarder :: Either Failure a -> Maybe Discard Source #

Do not discard any results.

*Since: 0.7.1.0*

:: Show a | |

=> (Either Failure a -> Maybe Discard) | Selectively discard results. |

-> Way | How to run the concurrent program. |

-> MemType | The memory model to use for non-synchronised |

-> (forall t. ConcST t a) | The computation to test |

-> (String, Predicate a) | The predicate (with a name) to check |

-> IO Bool |

Variant of `dejafuWay`

which can selectively discard results.

*Since: 0.7.1.0*

dejafuDiscardIO :: Show a => (Either Failure a -> Maybe Discard) -> Way -> MemType -> ConcIO a -> (String, Predicate a) -> IO Bool Source #

Variant of `dejafuDiscard`

for computations which do `IO`

.

*Since: 0.7.1.0*

## Memory Models

Threads running under modern multicore processors do not behave
as a simple interleaving of the individual thread
actions. Processors do all sorts of complex things to increase
speed, such as buffering writes. For concurrent programs which
make use of non-synchronised functions (such as `readCRef`

coupled with `writeCRef`

) different memory models may yield
different results.

As an example, consider this program (modified from the
Data.IORef documentation). Two `CRef`

s are created, and two
threads spawned to write to and read from both. Each thread
returns the value it observes.

example2 :: MonadConc m => m (Bool, Bool) example2 = do r1 <- newCRef False r2 <- newCRef False x <- spawn $ writeCRef r1 True >> readCRef r2 y <- spawn $ writeCRef r2 True >> readCRef r1 (,) <$> readMVar x <*> readMVar y

Under a sequentially consistent memory model the possible results
are `(True, True)`

, `(True, False)`

, and `(False, True)`

. Under
total or partial store order, `(False, False)`

is also a possible
result, even though there is no interleaving of the threads which
can lead to this.

We can see this by testing with different memory models:

> autocheckWay defaultWay SequentialConsistency example2 [pass] Never Deadlocks (checked: 6) [pass] No Exceptions (checked: 6) [fail] Consistent Result (checked: 5) (False,True) S0-------S1-----S0--S2-----S0--- (True,False) S0-------S1-P2-----S1----S0---- (True,True) S0-------S1--P2-----S1---S0---- (False,True) S0-------S1---P2-----S1--S0---- (True,False) S0-------S2-----S1-----S0---- ... False

> autocheckWay defaultWay TotalStoreOrder example2 [pass] Never Deadlocks (checked: 303) [pass] No Exceptions (checked: 303) [fail] Consistent Result (checked: 302) (False,True) S0-------S1-----C-S0--S2-----C-S0--- (True,False) S0-------S1-P2-----C-S1----S0---- (True,True) S0-------S1-P2--C-S1----C-S0--S2---S0--- (False,True) S0-------S1-P2--P1--C-C-S1--S0--S2---S0--- (False,False) S0-------S1-P2--P1----S2---C-C-S0---- ... False

Traces for non-sequentially-consistent memory models show where
writes to `CRef`

s are *committed*, which makes a write visible to
all threads rather than just the one which performed the
write. Only `writeCRef`

is broken up into separate write and
commit steps, `atomicModifyCRef`

is still atomic and imposes a
memory barrier.

The memory model to use for non-synchronised `CRef`

operations.

*Since: 0.4.0.0*

SequentialConsistency | The most intuitive model: a program behaves as a simple
interleaving of the actions in different threads. When a |

TotalStoreOrder | Each thread has a write buffer. A thread sees its writes immediately, but other threads will only see writes when they are committed, which may happen later. Writes are committed in the same order that they are created. |

PartialStoreOrder | Each |

defaultMemType :: MemType Source #

The default memory model: `TotalStoreOrder`

*Since: 0.2.0.0*

## Schedule Bounding

Schedule bounding is an optimisation which only considers
schedules within some *bound*. This sacrifices completeness
outside of the bound, but can drastically reduce the number of
schedules to test, and is in fact necessary for non-terminating
programs.

The standard testing mechanism uses a combination of pre-emption bounding, fair bounding, and length bounding. Pre-emption + fair bounding is useful for programs which use loop/yield control flows but are otherwise terminating. Length bounding makes it possible to test potentially non-terminating programs.

*Since: 0.2.0.0*

defaultBounds :: Bounds Source #

All bounds enabled, using their default values.

*Since: 0.2.0.0*

No bounds enabled. This forces the scheduler to just use
partial-order reduction and sleep sets to prune the search
space. This will *ONLY* work if your computation always terminates!

*Since: 0.3.0.0*

newtype PreemptionBound Source #

*Since: 0.2.0.0*

defaultPreemptionBound :: PreemptionBound Source #

A sensible default preemption bound: 2.

See *Concurrency Testing Using Schedule Bounding: an Empirical Study*,
P. Thomson, A. F. Donaldson, A. Betts for justification.

*Since: 0.2.0.0*

*Since: 0.2.0.0*

defaultFairBound :: FairBound Source #

A sensible default fair bound: 5.

This comes from playing around myself, but there is probably a better default.

*Since: 0.2.0.0*

newtype LengthBound Source #

*Since: 0.2.0.0*

defaultLengthBound :: LengthBound Source #

A sensible default length bound: 250.

Based on the assumption that anything which executes for much longer (or even this long) will take ages to test.

*Since: 0.2.0.0*

# Results

The results of a test can be pretty-printed to the console, as with the above functions, or used in their original, much richer, form for debugging purposes. These functions provide full access to this data type which, most usefully, contains a detailed trace of execution, showing what each thread did at each point.

The results of a test, including the number of cases checked to determine the final boolean outcome.

*Since: 0.2.0.0*

An indication of how a concurrent computation failed.

The `Eq`

, `Ord`

, and `NFData`

instances compare/evaluate the
exception with `show`

in the `UncaughtException`

case.

*Since: 0.9.0.0*

InternalError | Will be raised if the scheduler does something bad. This should never arise unless you write your own, faulty, scheduler! If it does, please file a bug report. |

Abort | The scheduler chose to abort execution. This will be produced if, for example, all possible decisions exceed the specified bounds (there have been too many pre-emptions, the computation has executed for too long, or there have been too many yields). |

Deadlock | The computation became blocked indefinitely on |

STMDeadlock | The computation became blocked indefinitely on |

UncaughtException SomeException | An uncaught exception bubbled to the top of the computation. |

IllegalSubconcurrency | Calls to |

Run a predicate over all executions within the default schedule bounds.

*Since: 0.1.0.0*

:: Way | How to run the concurrent program. |

-> MemType | The memory model to use for non-synchronised |

-> Predicate a | The predicate to check |

-> (forall t. ConcST t a) | The computation to test |

-> Result a |

Variant of `runTest`

which takes a way to run the program and a
memory model.

*Since: 0.6.0.0*

runTestM :: MonadRef r n => Predicate a -> ConcT r n a -> n (Result a) Source #

Monad-polymorphic variant of `runTest`

.

*Since: 0.4.0.0*

runTestWayM :: MonadRef r n => Way -> MemType -> Predicate a -> ConcT r n a -> n (Result a) Source #

Monad-polymorphic variant of `runTest'`

.

*Since: 0.6.0.0*

# Predicates

Predicates evaluate a list of results of execution and decide whether some test case has passed or failed. They can be lazy and make use of short-circuit evaluation to avoid needing to examine the entire list of results, and can check any property which can be defined for the return type of your monadic action.

A collection of common predicates are provided, along with the
helper functions `alwaysTrue`

, `alwaysTrue2`

and `somewhereTrue`

to lfit predicates over a single result to over a collection of
results.

type Predicate a = [(Either Failure a, Trace)] -> Result a Source #

A `Predicate`

is a function which collapses a list of results
into a `Result`

.

*Since: 0.1.0.0*

representative :: Eq a => Predicate a -> Predicate a Source #

Reduce the list of failures in a `Predicate`

to one
representative trace for each unique result.

This may throw away "duplicate" failures which have a unique cause but happen to manifest in the same way. However, it is convenient for filtering out true duplicates.

*Since: 0.2.0.0*

abortsNever :: Predicate a Source #

Check that a computation never aborts.

*Since: 0.2.0.0*

abortsAlways :: Predicate a Source #

Check that a computation always aborts.

*Since: 0.2.0.0*

abortsSometimes :: Predicate a Source #

Check that a computation aborts at least once.

*Since: 0.2.0.0*

deadlocksNever :: Predicate a Source #

Check that a computation never deadlocks.

*Since: 0.1.0.0*

deadlocksAlways :: Predicate a Source #

Check that a computation always deadlocks.

*Since: 0.1.0.0*

deadlocksSometimes :: Predicate a Source #

Check that a computation deadlocks at least once.

*Since: 0.1.0.0*

exceptionsNever :: Predicate a Source #

Check that a computation never fails with an uncaught exception.

*Since: 0.1.0.0*

exceptionsAlways :: Predicate a Source #

Check that a computation always fails with an uncaught exception.

*Since: 0.1.0.0*

exceptionsSometimes :: Predicate a Source #

Check that a computation fails with an uncaught exception at least once.

*Since: 0.1.0.0*

alwaysSame :: Eq a => Predicate a Source #

Check that the result of a computation is always the same. In particular this means either: (a) it always fails in the same way, or (b) it never fails and the values returned are all equal.

*Since: 0.1.0.0*

notAlwaysSame :: Eq a => Predicate a Source #

Check that the result of a computation is not always the same.

*Since: 0.1.0.0*

alwaysTrue :: (Either Failure a -> Bool) -> Predicate a Source #

Check that the result of a unary boolean predicate is always true.

*Since: 0.1.0.0*

alwaysTrue2 :: (Either Failure a -> Either Failure a -> Bool) -> Predicate a Source #

Check that the result of a binary boolean predicate is true between all pairs of results. Only properties which are transitive and symmetric should be used here.

If the predicate fails, *both* (result,trace) tuples will be added
to the failures list.

*Since: 0.1.0.0*

somewhereTrue :: (Either Failure a -> Bool) -> Predicate a Source #

Check that the result of a unary boolean predicate is true at least once.

*Since: 0.1.0.0*

gives :: (Eq a, Show a) => [Either Failure a] -> Predicate a Source #

Predicate for when there is a known set of results where every result must be exhibited at least once.

*Since: 0.2.0.0*

gives' :: (Eq a, Show a) => [a] -> Predicate a Source #

Variant of `gives`

that doesn't allow for expected failures.

*Since: 0.2.0.0*

## Failures

isInternalError :: Failure -> Bool Source #

Check if a failure is an `InternalError`

.

*Since: 0.9.0.0*

isDeadlock :: Failure -> Bool Source #

Check if a failure is a `Deadlock`

or an `STMDeadlock`

.

*Since: 0.9.0.0*

isUncaughtException :: Failure -> Bool Source #

Check if a failure is an `UncaughtException`

*Since: 0.9.0.0*

isIllegalSubconcurrency :: Failure -> Bool Source #

Check if a failure is an `IllegalSubconcurrency`

*Since: 0.9.0.0*

# Refinement property testing

Consider this statement about `MVar`

s: "using `readMVar`

is
better than `takeMVar`

followed by `putMVar`

because the former
is atomic but the latter is not."

Deja Fu can test properties like that:

sig e = Sig { initialise = maybe newEmptyMVar newMVar , observe = \v _ -> tryReadMVar v , interfere = \v s -> tryTakeMVar v >> maybe (pure ()) (void . tryPutMVar v) s , expression = e } > check $ sig (void . readMVar) `equivalentTo` sig (\v -> takeMVar v >>= putMVar v) *** Failure: (seed Just ()) left: [(Nothing,Just ())] right: [(Nothing,Just ()),(Just Deadlock,Just ())]

The two expressions are not equivalent, and we get given the counterexample!

module Test.DejaFu.Refinement