Copyright | 2021 Dominik Schrempf |
---|---|
License | GPL-3.0-or-later |
Maintainer | dominik.schrempf@gmail.com |
Stability | unstable |
Portability | portable |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Creation date: Tue May 5 18:01:15 2020.
For an introduction to Markov chain Monte Carlo (MCMC) samplers and update mechanisms using the Metropolis-Hastings-Green algorithm, please see Geyer, C. J., (2011), Introduction to Markov Chain Monte Carlo, In Handbook of Markov Chain Monte Carlo (pp. 45), CRC press.
This library focusses on classical Markov chain Monte Carlo algorithms such as the Metropolis-Hastings-Green (MHG) [1] algorithm, or population methods involving parallel chains such as the Metropolic-coupled Markov chain Monte Carlo [2] algorithm. In particular, sequential Monte Carlo [3] algorithms following a moving posterior distribution are not provided. Recently, Hamiltonian Monte Carlo (HMC) proposals have been added [4]. HMC proposals can be used with automatic differentiation. HMC proposals with automatic differentiation are quite slow for complicated prior or likelihood functions, but they are incredibly useful when specialized MHG proposals are not readily available.
An MCMC sampler can be run with mcmc
, for example using the
Metropolis-Hastings-Green algorithm mhg
.
Usually, it is best to start with an example:
- Basic inference of the accuracy of an archer (see the statement of the problem).
- Other examples.
- More involved example performing phylogenetic dating.
The import of this module alone should cover most use cases.
[1]
Geyer, C. J. (2011), Introduction to markov chain monte carlo, In
Handbook of Markov Chain Monte Carlo (pp. 45), CRC press.
[2]
Geyer, C. J. (1991), Markov chain monte carlo maximum likelihood,
Computing Science and Statistics, Proceedings of the 23rd Symposium on the
Interface.
[3]
Sequential monte carlo methods in practice (2001), Editors: Arnaud
Doucet, Nando de Freitas, and Neil Gordon, Springer New York.
[4]
Review by Betancourt and notes: Betancourt, M., A conceptual
introduction to Hamiltonian Monte Carlo, arXiv, 1701–02434 (2017).
Synopsis
- newtype PName = PName {}
- data PWeight
- pWeight :: Int -> PWeight
- data Proposal a
- (@~) :: Lens' b a -> Proposal a -> Proposal b
- liftProposal :: Lens' b a -> Proposal a -> Proposal b
- liftProposalWith :: JacobianFunction b -> Lens' b a -> Proposal a -> Proposal b
- data Tune
- scale :: Shape Double -> Scale Double -> PName -> PWeight -> Tune -> Proposal Double
- scaleUnbiased :: Shape Double -> PName -> PWeight -> Tune -> Proposal Double
- scaleContrarily :: Shape Double -> Scale Double -> PName -> PWeight -> Tune -> Proposal (Double, Double)
- scaleBactrian :: SpikeParameter -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double
- slide :: Mean Double -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double
- slideSymmetric :: StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double
- slideUniformSymmetric :: Size -> PName -> PWeight -> Tune -> Proposal Double
- slideContrarily :: Mean Double -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal (Double, Double)
- slideBactrian :: SpikeParameter -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double
- module Mcmc.Proposal.Simplex
- module Mcmc.Proposal.Hamiltonian.Common
- module Mcmc.Proposal.Hamiltonian.Hamiltonian
- module Mcmc.Proposal.Hamiltonian.Nuts
- data Cycle a
- cycleFromList :: [Proposal a] -> Cycle a
- data Order
- setOrder :: Order -> Cycle a -> Cycle a
- module Mcmc.Settings
- data Monitor a = Monitor (MonitorStdOut a) [MonitorFile a] [MonitorBatch a]
- data MonitorStdOut a
- monitorStdOut :: [MonitorParameter a] -> Period -> MonitorStdOut a
- data MonitorFile a
- monitorFile :: String -> [MonitorParameter a] -> Period -> MonitorFile a
- data MonitorBatch a
- monitorBatch :: String -> [MonitorParameterBatch a] -> BatchSize -> MonitorBatch a
- simpleMonitor :: Period -> Monitor a
- module Mcmc.Monitor.Parameter
- module Mcmc.Monitor.ParameterBatch
- module Mcmc.Prior
- module Mcmc.Likelihood
- module Mcmc.Jacobian
- module Mcmc.Posterior
- mcmc :: Algorithm a => Settings -> a -> IO a
- mcmcContinue :: Algorithm a => Iterations -> Settings -> a -> IO a
- module Mcmc.Algorithm.MHG
- module Mcmc.Algorithm.MC3
- module Mcmc.MarginalLikelihood
- module Mcmc.Statistics.Types
- newtype Log a = Exp {
- ln :: a
Proposals
A Proposal
is an instruction about how to advance a given Markov
chain so that it possibly reaches a new state. That is, Proposal
s
specify how the chain traverses the state space. As far as this MCMC
library is concerned, Proposal
s are considered to be /elementary
updates/ in that they cannot be decomposed into smaller updates.
Proposal
s can be combined to form composite updates, a technique often
referred to as composition. On the other hand, mixing (used in the
sense of mixture models) is the random choice of a Proposal
(or a
composition of Proposal
s) from a given set.
The composition and mixture of Proposal
s allows specification
of nearly all MCMC algorithms involving a single chain (i.e., population
methods such as particle filters are excluded). In particular, Gibbs
samplers of all sorts can be specified using this procedure. For
reference, please see the short encyclopedia of MCMC
methods.
This library enables composition and mixture of Proposal
s via the
Cycle
data type. Essentially, a Cycle
is a set of Proposal
s. The
chain advances after the completion of each Cycle
, which is called an
iteration, and the iteration counter is increased by one.
The Proposal
s in a Cycle
can be executed in the given order or in a
random sequence which allows, for example, specification of a fixed scan
Gibbs sampler, or a random sequence scan Gibbs sampler, respectively. See
Order
.
Notes:
- It is important that the given Cycle
enables traversal of the
complete state space. Otherwise, the Markov chain will not converge to
the correct stationary posterior distribution.
- Be careful when assigning proposals because acceptance ratios may have
to be amended using Jacobians. Please see an example involving a pair
of numbers.
Proposals are named according to what they do, i.e., how they change the
state of a Markov chain, and not according to the intrinsically used
probability distributions. For example, slideSymmetric
is a sliding
proposal. Under the hood, it uses the normal distribution with mean zero
and given variance. The sampled variate is added to the current value of
the variable (hence, the name slide). The same nomenclature is used by
RevBayes [4]. The probability distributions and intrinsic properties of a
specific proposal are specified in detail in the documentation.
The other method, which is used intrinsically, is more systematic, but
also a little bit more complicated: we separate between the proposal
distribution and how the state is affected. And here, I am referring to
the operator (addition, multiplication, any other binary operator). For
example, the sliding proposal with mean m
, standard deviation s
, and
tuning parameter t
is implemented as
slide :: Double -> Double -> Double -> PFunction Double slide m s t = genericContinuous (normalDistr m (s * t)) (+) (Just negate) Nothing
This specification is more involved. Especially since we need to know the
probability of jumping back, and so we need to know the inverse operator
negate
. However, it also allows specification of new proposals with
great ease.
[4]
Höhna, S., Landis, M. J., Heath, T. A., Boussau, B., Lartillot, N.,
Moore, B. R., Huelsenbeck, J. P., …, Revbayes: bayesian phylogenetic
inference using graphical models and an interactive model-specification
language, Systematic Biology, 65(4), 726–736 (2016).
http://dx.doi.org/10.1093/sysbio/syw021
Proposal name.
The positive weight determines how often a Proposal
is executed per
iteration of the Markov chain. Abstract data type; for construction, see
pWeight
.
pWeight :: Int -> PWeight Source #
Check if the weight is positive.
Call error
if weight is zero or negative.
A Proposal
is an instruction about how the Markov chain will traverse the
state space a
. Essentially, it is a probability mass or probability density
conditioned on the current state (i.e., a Markov kernel).
A Proposal
may be tuneable in that it contains information about how to
enlarge or shrink the proposal size to decrease or increase the acceptance
rate.
Predefined proposals are provided. To create custom proposals, one may use
the convenience function createProposal
.
Instances
Eq (Proposal a) Source # | |
Ord (Proposal a) Source # | |
(@~) :: Lens' b a -> Proposal a -> Proposal b infixl 7 Source #
Lift a proposal from one data type to another.
Assume the Jacobian is 1.0.
For example:
scaleFirstEntryOfTuple = _1 @~ scale
See also liftProposalWith
.
liftProposalWith :: JacobianFunction b -> Lens' b a -> Proposal a -> Proposal b Source #
Lift a proposal from one data type to another.
A function to calculate the Jacobian has to be provided. If the Jabobian is
1.0, use liftProposal
.
For further reference, please see the example
Pair
.
Tune proposal?
scale :: Shape Double -> Scale Double -> PName -> PWeight -> Tune -> Proposal Double Source #
Multiplicative proposal.
The gamma distribution is used to sample the multiplier. Therefore, this and all derived proposals are log-additive in that they do not change the sign of the state. Further, the value zero is never proposed when having a strictly positive value.
Consider using slide
to allow proposition of values
having opposite sign.
scaleUnbiased :: Shape Double -> PName -> PWeight -> Tune -> Proposal Double Source #
See scale
.
The scale of the gamma distribution is set to (shape)^{-1}, so that the mean of the gamma distribution is 1.0.
scaleContrarily :: Shape Double -> Scale Double -> PName -> PWeight -> Tune -> Proposal (Double, Double) Source #
See scale
.
The two values are scaled contrarily so that their product stays constant. Contrary proposals are useful when parameters are confounded.
scaleBactrian :: SpikeParameter -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double Source #
Multiplicative proposal with kernel similar to the silhouette of a Bactrian camel.
See scale
, and slideBactrian
.
slide :: Mean Double -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double Source #
Additive proposal.
A normal distribution is used to sample the addend.
slideSymmetric :: StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double Source #
See slide
.
Use a normal distribution with mean zero. This proposal is fast, because the Metropolis-Hastings-Green ratio does not include calculation of the forwards and backwards kernels.
slideUniformSymmetric :: Size -> PName -> PWeight -> Tune -> Proposal Double Source #
See slide
.
Use a uniformly distributed kernel with mean zero. This proposal is fast, because the Metropolis-Hastings-Green ratio does not include calculation of the forwards and backwards kernels.
slideContrarily :: Mean Double -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal (Double, Double) Source #
See slide
.
Use a normally distributed kernel.
The two values are slid contrarily so that their sum stays constant. Contrary proposals are useful when parameters are confounded.
slideBactrian :: SpikeParameter -> StandardDeviation Double -> PName -> PWeight -> Tune -> Proposal Double Source #
Additive symmetric proposal with kernel similar to the silhouette of a Bactrian camel.
The Bactrian kernel is a mixture of two symmetrically arranged normal distributions. The spike parameter \(m \in (0, 1)\) loosely determines the standard deviations of the individual humps while the second parameter \(s > 0\) refers to the standard deviation of the complete Bactrian kernel.
module Mcmc.Proposal.Simplex
Cycles
In brief, a Cycle
is a list of proposals.
The state of the Markov chain will be logged only after all Proposal
s in
the Cycle
have been completed, and the iteration counter will be increased
by one. The order in which the Proposal
s are executed is specified by
Order
. The default is RandomO
.
No proposals with the same name and description are allowed in a Cycle
, so
that they can be uniquely identified.
cycleFromList :: [Proposal a] -> Cycle a Source #
Define the order in which Proposal
s are executed in a Cycle
. The total
number of Proposal
s per Cycle
may differ between Order
s (e.g., compare
RandomO
and RandomReversibleO
).
RandomO | Shuffle the |
SequentialO | The |
RandomReversibleO | Similar to |
SequentialReversibleO | Similar to |
Settings
module Mcmc.Settings
Monitors
A Monitor
describes which part of the Markov chain should be logged
and where. Monitor files can directly be loaded into established MCMC
analysis tools working with tab separated tables (for example,
Tracer).
There are three different Monitor
types:
MonitorStdOut
- Log to standard output.
MonitorFile
- Log to a file.
MonitorBatch
- Log summary statistics such as the mean of the last states to a file.
A Monitor
observing the chain.
A Monitor
describes which part of the Markov chain should be logged and
where. Further, they allow output of summary statistics per iteration in a
flexible way.
Monitor (MonitorStdOut a) [MonitorFile a] [MonitorBatch a] |
data MonitorStdOut a Source #
Monitor to standard output; construct with monitorStdOut
.
monitorStdOut :: [MonitorParameter a] -> Period -> MonitorStdOut a Source #
Monitor to standard output.
data MonitorFile a Source #
Monitor to a file; constructed with monitorFile
.
:: String | Name; used as part of the file name. |
-> [MonitorParameter a] | |
-> Period | |
-> MonitorFile a |
Monitor parameters to a file.
data MonitorBatch a Source #
Batch monitor to a file.
Calculate summary statistics over the last given number of iterations (batch
size). Construct with monitorBatch
.
:: String | Name; used as part of the file name. |
-> [MonitorParameterBatch a] | |
-> BatchSize | |
-> MonitorBatch a |
Batch monitor parameters to a file, see MonitorBatch
.
simpleMonitor :: Period -> Monitor a Source #
Only monitor prior, likelihood and posterior functions with given period. Do not monitor parameters.
module Mcmc.Monitor.Parameter
module Mcmc.Monitor.ParameterBatch
Priors, likelihoods, Jacobians, and posteriors
module Mcmc.Prior
module Mcmc.Likelihood
module Mcmc.Jacobian
module Mcmc.Posterior
Run MCMC samplers
mcmcContinue :: Algorithm a => Iterations -> Settings -> a -> IO a Source #
Continue an MCMC algorithm for the given number of iterations.
Currently, it is only possible to continue MCMC algorithms that have completed successfully. This restriction is necessary, because for parallel chains, it is hardly possible to ensure all chains are synchronized when the process is killed or fails.
See:
See also settingsLoad
, mhgLoad
, and mc3Load
.
Algorithms
module Mcmc.Algorithm.MHG
module Mcmc.Algorithm.MC3
Marginal likelihood calculation
module Mcmc.MarginalLikelihood
Types used in statistics
module Mcmc.Statistics.Types
Useful re-exports
Log
-domain Float
and Double
values.
Instances
Foldable Log | |
Defined in Numeric.Log fold :: Monoid m => Log m -> m # foldMap :: Monoid m => (a -> m) -> Log a -> m # foldMap' :: Monoid m => (a -> m) -> Log a -> m # foldr :: (a -> b -> b) -> b -> Log a -> b # foldr' :: (a -> b -> b) -> b -> Log a -> b # foldl :: (b -> a -> b) -> b -> Log a -> b # foldl' :: (b -> a -> b) -> b -> Log a -> b # foldr1 :: (a -> a -> a) -> Log a -> a # foldl1 :: (a -> a -> a) -> Log a -> a # elem :: Eq a => a -> Log a -> Bool # maximum :: Ord a => Log a -> a # | |
Eq1 Log | |
Traversable Log | |
Applicative Log | |
Functor Log | |
Monad Log | |
Serial1 Log | |
Defined in Numeric.Log serializeWith :: MonadPut m => (a -> m ()) -> Log a -> m () # deserializeWith :: MonadGet m => m a -> m (Log a) # | |
Comonad Log | |
ComonadApply Log | |
Distributive Log | |
Foldable1 Log | |
Defined in Numeric.Log fold1 :: Semigroup m => Log m -> m # foldMap1 :: Semigroup m => (a -> m) -> Log a -> m # foldMap1' :: Semigroup m => (a -> m) -> Log a -> m # toNonEmpty :: Log a -> NonEmpty a # maximum :: Ord a => Log a -> a # minimum :: Ord a => Log a -> a # foldrMap1 :: (a -> b) -> (a -> b -> b) -> Log a -> b # foldlMap1' :: (a -> b) -> (b -> a -> b) -> Log a -> b # foldlMap1 :: (a -> b) -> (b -> a -> b) -> Log a -> b # foldrMap1' :: (a -> b) -> (a -> b -> b) -> Log a -> b # | |
Hashable1 Log | |
Defined in Numeric.Log | |
Apply Log | |
Bind Log | |
Extend Log | |
Traversable1 Log | |
(RealFloat a, Unbox a) => Vector Vector (Log a) | |
Defined in Numeric.Log basicUnsafeFreeze :: Mutable Vector s (Log a) -> ST s (Vector (Log a)) # basicUnsafeThaw :: Vector (Log a) -> ST s (Mutable Vector s (Log a)) # basicLength :: Vector (Log a) -> Int # basicUnsafeSlice :: Int -> Int -> Vector (Log a) -> Vector (Log a) # basicUnsafeIndexM :: Vector (Log a) -> Int -> Box (Log a) # basicUnsafeCopy :: Mutable Vector s (Log a) -> Vector (Log a) -> ST s () # | |
Unbox a => MVector MVector (Log a) | |
Defined in Numeric.Log basicLength :: MVector s (Log a) -> Int # basicUnsafeSlice :: Int -> Int -> MVector s (Log a) -> MVector s (Log a) # basicOverlaps :: MVector s (Log a) -> MVector s (Log a) -> Bool # basicUnsafeNew :: Int -> ST s (MVector s (Log a)) # basicInitialize :: MVector s (Log a) -> ST s () # basicUnsafeReplicate :: Int -> Log a -> ST s (MVector s (Log a)) # basicUnsafeRead :: MVector s (Log a) -> Int -> ST s (Log a) # basicUnsafeWrite :: MVector s (Log a) -> Int -> Log a -> ST s () # basicClear :: MVector s (Log a) -> ST s () # basicSet :: MVector s (Log a) -> Log a -> ST s () # basicUnsafeCopy :: MVector s (Log a) -> MVector s (Log a) -> ST s () # basicUnsafeMove :: MVector s (Log a) -> MVector s (Log a) -> ST s () # basicUnsafeGrow :: MVector s (Log a) -> Int -> ST s (MVector s (Log a)) # | |
Data a => Data (Log a) | |
Defined in Numeric.Log gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Log a -> c (Log a) # gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c (Log a) # dataTypeOf :: Log a -> DataType # dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c (Log a)) # dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c (Log a)) # gmapT :: (forall b. Data b => b -> b) -> Log a -> Log a # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Log a -> r # gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Log a -> r # gmapQ :: (forall d. Data d => d -> u) -> Log a -> [u] # gmapQi :: Int -> (forall d. Data d => d -> u) -> Log a -> u # gmapM :: Monad m => (forall d. Data d => d -> m d) -> Log a -> m (Log a) # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Log a -> m (Log a) # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Log a -> m (Log a) # | |
Storable a => Storable (Log a) | |
RealFloat a => Monoid (Log a) | |
RealFloat a => Semigroup (Log a) | |
(RealFloat a, Enum a) => Enum (Log a) | |
RealFloat a => Floating (Log a) | |
Generic (Log a) | |
RealFloat a => Num (Log a) | |
(Floating a, Read a) => Read (Log a) | |
RealFloat a => Fractional (Log a) | |
(RealFloat a, Ord a) => Real (Log a) | |
Defined in Numeric.Log toRational :: Log a -> Rational # | |
RealFloat a => RealFrac (Log a) | |
(Floating a, Show a) => Show (Log a) | |
Binary a => Binary (Log a) | |
Serial a => Serial (Log a) | |
Defined in Numeric.Log | |
Serialize a => Serialize (Log a) | |
NFData a => NFData (Log a) | |
Defined in Numeric.Log | |
Eq a => Eq (Log a) | |
Ord a => Ord (Log a) | |
Hashable a => Hashable (Log a) | |
Defined in Numeric.Log | |
(RealFloat a, Unbox a) => Unbox (Log a) | |
Defined in Numeric.Log | |
newtype MVector s (Log a) | |
Defined in Numeric.Log | |
type Rep (Log a) | |
Defined in Numeric.Log | |
newtype Vector (Log a) | |
Defined in Numeric.Log |