Copyright | (c) Dominik Schrempf 2020 |
---|---|
License | GPL-3.0-or-later |
Maintainer | dominik.schrempf@gmail.com |
Stability | unstable |
Portability | portable |
Safe Haskell | None |
Language | Haskell2010 |
Creation date: Tue May 5 18:01:15 2020.
A short introduction to update mechanisms using the Metropolis-Hastings algorithm (see Geyer, C. J., 2011; Introduction to Markov Chain Monte Carlo. In Handbook of Markov Chain Monte Carlo (pp. 45), Chapman & Hall/CRC).
For examples, please see mcmc-examples.
The import of this module alone should cover most use cases.
Synopsis
- data Proposal a
- (@~) :: Lens' b a -> Proposal a -> Proposal b
- scale :: String -> Int -> Double -> Double -> Bool -> Proposal Double
- scaleUnbiased :: String -> Int -> Double -> Bool -> Proposal Double
- slide :: String -> Int -> Double -> Double -> Bool -> Proposal Double
- slideBactrian :: String -> Int -> Double -> Double -> Bool -> Proposal Double
- slideSymmetric :: String -> Int -> Double -> Bool -> Proposal Double
- slideUniform :: String -> Int -> Double -> Bool -> Proposal Double
- data Cycle a
- fromList :: [Proposal a] -> Cycle a
- data Order
- setOrder :: Order -> Cycle a -> Cycle a
- status :: String -> (a -> Log Double) -> (a -> Log Double) -> Cycle a -> Monitor a -> a -> Maybe Int -> Maybe Int -> Int -> GenIO -> Status a
- saveWith :: Int -> Status a -> Status a
- force :: Status a -> Status a
- quiet :: Status a -> Status a
- debug :: Status a -> Status a
- data Monitor a = Monitor (MonitorStdOut a) [MonitorFile a] [MonitorBatch a]
- data MonitorStdOut a
- monitorStdOut :: [MonitorParameter a] -> Int -> MonitorStdOut a
- data MonitorFile a
- monitorFile :: String -> [MonitorParameter a] -> Int -> MonitorFile a
- data MonitorBatch a
- monitorBatch :: String -> [MonitorParameterBatch a] -> Int -> MonitorBatch a
- module Mcmc.Monitor.Parameter
- module Mcmc.Monitor.ParameterBatch
- mh :: ToJSON a => Status a -> IO (Status a)
- mhContinue :: ToJSON a => Int -> Status a -> IO (Status a)
- loadStatus :: FromJSON a => (a -> Log Double) -> (a -> Log Double) -> Cycle a -> Monitor a -> FilePath -> IO (Status a)
Proposals
A Proposal
is an instruction about how to advance a given Markov chain so
that it possibly reaches a new state. That is, Proposal
s specify how the
chain traverses the state space. As far as this MCMC library is
concerned, Proposal
s are elementary updates in that they cannot be
decomposed into smaller updates.
Proposal
s can be combined to form composite updates, a technique often
referred to as composition. On the other hand, mixing (used in the
sense of mixture models) is the random choice of a Proposal
(or a
composition of Proposal
s) from a given set.
The composition and mixture of Proposal
s allows specification of
nearly all MCMC algorithms involving a single chain (i.e., population
methods such as particle filters are excluded). In particular, Gibbs
samplers of all sorts can be specified using this procedure.
This library enables composition and mixture of Proposal
s via the Cycle
data type. Essentially, a Cycle
is a set of Proposal
s. The chain advances
after the completion of each Cycle
, which is called an iteration,
and the iteration counter is increased by one.
The Proposal
s in a Cycle
can be executed in the given order or in a
random sequence which allows, for example, specification of a fixed scan
Gibbs sampler, or a random sequence scan Gibbs sampler, respectively. See
Order
.
Note that it is of utter importance that the given Cycle
enables
traversal of the complete state space. Otherwise, the Markov chain will
not converge to the correct stationary posterior distribution.
Proposals are named according to what they do, i.e., how they change the
state of a Markov chain, and not according to the intrinsically used
probability distributions. For example, slideSymmetric
is a sliding
proposal. Under the hood, it uses the normal distribution with mean zero and
given variance. The sampled variate is added to the current value of the
variable (hence, the name slide). The same nomenclature is used by
RevBayes [1]. The probability distributions and intrinsic properties of a
specific proposal are specified in detail in the documentation.
The other method, which is used intrinsically, is more systematic, but also a little bit more complicated: we separate between the proposal distribution and how the state is affected. And here, I am not only referring to the accessor (i.e., the lens), but also to the operator (addition, multiplication, any other binary operator). For example, the sliding proposal (without tuning information) is implemented as
slideSimple :: Lens' a Double -> Double -> Double -> Double -> ProposalSimple a slideSimple l m s t = proposalGenericContinuous l (normalDistr m (s * t)) (+) (-)
This specification is more involved. Especially since we need to know the probability of jumping back, and so we need to know the inverse operator. However, it also allows specification of new proposals with great ease.
- 1
- Höhna, S., Landis, M. J., Heath, T. A., Boussau, B., Lartillot, N., Moore, B. R., Huelsenbeck, J. P., …, Revbayes: bayesian phylogenetic inference using graphical models and an interactive model-specification language, Systematic Biology, 65(4), 726–736 (2016). http://dx.doi.org/10.1093/sysbio/syw021
A Proposal
is an instruction about how the Markov chain will traverse the
state space a
. Essentially, it is a probability mass or probability density
conditioned on the current state (i.e., a kernel).
A Proposal
may be tuneable in that it contains information about how to enlarge
or shrink the step size to tune the acceptance ratio.
Instances
Eq (Proposal a) Source # | |
Ord (Proposal a) Source # | |
Show (Proposal a) Source # | |
(@~) :: Lens' b a -> Proposal a -> Proposal b Source #
Convert a proposal from one data type to another using a lens.
For example:
scaleFirstEntryOfTuple = scale >>> _1
Multiplicative proposal with Gamma distributed kernel.
Multiplicative proposal with Gamma distributed kernel. The scale of the Gamma distributions is set to (shape)^{-1}, so that the mean of the Gamma distribution is 1.0.
:: String | Name. |
-> Int | Weight. |
-> Double | Mean. |
-> Double | Standard deviation. |
-> Bool | Enable tuning. |
-> Proposal Double |
Additive proposal with normally distributed kernel.
:: String | Name. |
-> Int | Weight. |
-> Double | Spike parameter. |
-> Double | Standard deviation. |
-> Bool | Enable tuning. |
-> Proposal Double |
Additive symmetric proposal with kernel similar to the silhouette of a Bactrian camel. The Bactrian kernel is a mixture of two symmetrically arranged normal distributions. The spike parameter loosely determines the standard deviations of the individual humps while the other parameter refers to the standard deviation of the complete Bactrian kernel.
Additive proposal with normally distributed kernel with mean zero. This proposal is very fast, because the Metropolis-Hastings ratio does not include calculation of the forwards and backwards kernels.
Additive proposal with uniformly distributed kernel. This proposal is very fast, because the Metropolis-Hastings ratio does not include calculation of the forwards and backwards kernels.
In brief, a Cycle
is a list of proposals. The state of the Markov chain will
be logged only after all Proposal
s in the Cycle
have been completed, and the
iteration counter will be increased by one. The order in which the Proposal
s
are executed is specified by Order
. The default is RandomO
.
Proposals must have unique names, so that they can be identified.
Define the order in which Proposal
s are executed in a Cycle
. The total
number of Proposal
s per Cycle
may differ between Order
s (e.g., compare
RandomO
and RandomReversibleO
).
RandomO | Shuffle the |
SequentialO | The |
RandomReversibleO | Similar to |
SequentialReversibleO | Similar to |
Initialization
The Status
contains all information to run an MCMC chain. It is
constructed using the function status
.
The Status
of a Markov chain includes information about current state
(Item
) and iteration, the history of the chain
(Trace
), the Acceptance
ratios, and the random number
generator.
Further, the Status
includes auxiliary variables and functions such as
the prior and likelihood functions, instructions to move around the state
space (see above) and to monitor the MCMC run, as well as some auxiliary
information.
:: String | Name of the Markov chain; used as file prefix. |
-> (a -> Log Double) | The prior function. |
-> (a -> Log Double) | The likelihood function. |
-> Cycle a | A list of |
-> Monitor a | A |
-> a | The initial state in the state space |
-> Maybe Int | Number of burn in iterations; deactivate burn in with |
-> Maybe Int | Auto tuning period (only during burn in); deactivate
auto tuning with |
-> Int | Number of normal iterations excluding burn in. Note that auto tuning only happens during burn in. |
-> GenIO | A source of randomness. For reproducible runs, make sure to use a generator with the same seed. |
-> Status a |
Initialize the Status
of a Markov chain Monte Carlo run.
force :: Status a -> Status a Source #
Overwrite existing files; it is not necessary to use force
, when a chain
is continued.
quiet :: Status a -> Status a Source #
Do not print anything to standard output. Do not create log file. File monitors and batch monitors are executed normally.
Monitor
A Monitor
describes which part of the Markov chain should be logged
and where. There are three different types:
- MonitorStdOut
: Log to standard output.
- MonitorFile
: Log to a file.
- MonitorBatch
: Log summary statistics such as the mean of the last
- states to a file.
A Monitor
describes which part of the Markov chain should be logged and
where. Further, they allow output of summary statistics per iteration in a
flexible way.
Monitor (MonitorStdOut a) [MonitorFile a] [MonitorBatch a] |
data MonitorStdOut a Source #
Monitor to standard output; constructed with monitorStdOut
.
:: [MonitorParameter a] | Instructions about which parameters to log. |
-> Int | Logging period. |
-> MonitorStdOut a |
Monitor to standard output.
data MonitorFile a Source #
Monitor to a file; constructed with monitorFile
.
:: String | Name; used as part of the file name. |
-> [MonitorParameter a] | Instructions about which parameters to log. |
-> Int | Logging period. |
-> MonitorFile a |
Monitor parameters to a file.
data MonitorBatch a Source #
Monitor to a file, but calculate batch means for the given batch size;
constructed with monitorBatch
.
Batch monitors are slow at the moment because the monitored parameter has to be extracted from the state for each iteration.
:: String | Name; used as part of the file name. |
-> [MonitorParameterBatch a] | Instructions about which parameters to log and how to calculate the batch means. |
-> Int | Batch size. |
-> MonitorBatch a |
Monitor parameters to a file, see MonitorBatch
.
module Mcmc.Monitor.Parameter
module Mcmc.Monitor.ParameterBatch
Algorithms
At the moment, the library is tailored to the Metropolis-Hastings
algorithm (mh
) since it covers most use cases. However, implementation
of more algorithms is planned in the future.
Run a Markov chain for a given number of Metropolis-Hastings steps.
:: ToJSON a | |
=> Int | Additional number of Metropolis-Hastings steps. |
-> Status a | Loaded status of the Markov chain. |
-> IO (Status a) |
Continue a Markov chain for a given number of Metropolis-Hastings steps.
Misc
loadStatus :: FromJSON a => (a -> Log Double) -> (a -> Log Double) -> Cycle a -> Monitor a -> FilePath -> IO (Status a) Source #
Load a Status
from file.
Important information that cannot be saved and has to be provided again when a chain is restored: - prior function - likelihood function - cycle - monitor
To avoid incomplete continued runs, the mcmc
file is removed after load.