rl-satton-0.1.2.4: Collection of Reinforcement Learning algorithms

Safe Haskell: None
Language: Haskell2010

RL.MC

Synopsis

Documentation

type Q s a = M s a MC_Number Source #

q2v :: (Bounded a, Enum a, Eq a, Hashable a, Eq s, Hashable s) => Q s a -> V s Source #

diffV :: (Eq s, Hashable s) => V s -> V s -> MC_Number Source #

toV :: (Bounded a, Enum a, Eq a, Hashable a, Eq s, Hashable s) => Q s a -> V s Source #
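
The signatures above suggest the usual action-value / state-value split: a Q table holds one MC_Number per state-action pair, q2v and toV collapse it to a per-state value function, and diffV measures how far two value functions are apart. As a hedged sketch of how these could be combined, a convergence predicate over two successive Q tables might look as follows; the qConverged name and the eps threshold are illustrative, and MC_Number is assumed to be an Ord/Num type such as Double:

    import Data.Hashable (Hashable)
    import RL.MC

    -- Consider two successive Q tables "converged" when the value functions
    -- they induce differ by less than eps.  Assumes diffV returns a
    -- non-negative distance and that MC_Number has an Ord instance
    -- (it is presumably a synonym for Double).
    qConverged :: (Bounded a, Enum a, Eq a, Hashable a, Eq s, Hashable s)
               => MC_Number -> Q s a -> Q s a -> Bool
    qConverged eps q q' = diffV (q2v q) (q2v q') < eps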

class (Fractional num, Ord s, Ord a, Show s, Show a, Bounded a, Enum a) => MC_Problem pr s a num | pr -> s, pr -> a, pr -> num where Source #

Minimal complete definition

mc_is_terminal, mc_reward

Methods

mc_is_terminal :: pr -> s -> Bool Source #

mc_reward :: pr -> s -> a -> s -> num Source #
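
To make the class concrete, here is a hedged sketch of what an instance might look like for a toy problem, filling in only the minimal complete definition (mc_is_terminal and mc_reward). The Corridor and Move types are invented for illustration, and MC_Number (seen in the Q synonym above) is assumed to be in scope and to be a Fractional type such as Double:

    {-# LANGUAGE MultiParamTypeClasses #-}
    {-# LANGUAGE FlexibleInstances     #-}

    import RL.MC

    -- A hypothetical 1-D corridor: states are positions 0..corridorLen,
    -- and an episode ends at either edge.
    newtype Corridor = Corridor { corridorLen :: Int }

    data Move = MoveLeft | MoveRight
      deriving (Show, Eq, Ord, Enum, Bounded)

    -- MC_Number is assumed to be Fractional (presumably Double);
    -- FlexibleInstances covers the case where it is a plain type synonym.
    instance MC_Problem Corridor Int Move MC_Number where
      -- The episode terminates once the agent reaches either end.
      mc_is_terminal (Corridor len) s = s <= 0 || s >= len
      -- Reward 1 for stepping onto the right edge, 0 otherwise.
      mc_reward (Corridor len) _s _a s' = if s' >= len then 1 else 0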

queryQ :: (Hashable s, Hashable k, MonadState (M s k v) f, Eq s, Eq k, Enum k, Bounded k) => s -> f [(k, v)] Source #

modifyQ :: (Hashable a, Hashable s, MonadState (M s a num) m, Eq a, Eq s, Enum a, Bounded a) => s -> a -> (num -> num) -> m () Source #
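
queryQ and modifyQ work against an action-value table kept in a MonadState context: queryQ enumerates the per-action values for a state, and modifyQ applies a function to a single Q(s,a) cell. The following is a hedged sketch of how they might be used; greedyAction and bumpTowards are hypothetical helpers, and MC_Number is again assumed to have Ord and Num instances (presumably Double):

    import Control.Monad.State (MonadState)
    import Data.Hashable (Hashable)
    import Data.List (maximumBy)
    import Data.Ord (comparing)
    import RL.MC

    -- Pick the action with the highest stored value in state s.  Assumes
    -- queryQ returns one entry per action; since the action type is
    -- Bounded/Enum, that list is non-empty.
    greedyAction :: (Hashable s, Hashable a, Eq s, Eq a, Enum a, Bounded a,
                     MonadState (Q s a) m)
                 => s -> m a
    greedyAction s = fst . maximumBy (comparing snd) <$> queryQ s

    -- Move the stored Q(s,a) a step of size alpha towards a sampled
    -- return g, i.e. the usual incremental (alpha-rate) update.
    bumpTowards :: (Hashable s, Hashable a, Eq s, Eq a, Enum a, Bounded a,
                    MonadState (Q s a) m)
                => MC_Number -> s -> a -> MC_Number -> m ()
    bumpTowards alpha s a g = modifyQ s a (\q -> q + alpha * (g - q))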

data MC pr m s a Source #

Constructors

MC 

Fields

mc_es_learn :: (Monad m, Hashable s, Hashable a, MC_Problem pr s a MC_Number) => MC_Opts -> Q s a -> s -> a -> MC pr m s a -> m (Q s a) Source #

MC-ES learning algorithm, section 5.4. An alpha learning rate is used instead of total averaging, and the maximum episode length is limited to make sure the episode terminates.
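
Judging from the signature, one call to mc_es_learn runs learning from the exploring start (s, a) against the dynamics described by the MC record and returns an updated Q table. A hedged sketch of an outer training loop built on top of it might look as follows; trainUntilStable, the eps threshold, and the diffV/q2v stopping rule are illustrative, and the MC_Opts and MC values are taken as arguments because their fields are not documented on this page:

    import Data.Hashable (Hashable)
    import RL.MC

    -- Repeatedly refine the Q table with mc_es_learn, starting each call
    -- from the exploring start (s0, a0), until the induced value function
    -- changes by less than eps between iterations.  Assumes MC_Number has
    -- an Ord instance (it is presumably Double) and that diffV returns a
    -- non-negative distance.
    trainUntilStable :: (Monad m, Hashable s, Hashable a,
                         MC_Problem pr s a MC_Number)
                     => MC_Number -> MC_Opts -> MC pr m s a -> s -> a
                     -> Q s a -> m (Q s a)
    trainUntilStable eps opts mc s0 a0 = go
      where
        go q = do
          q' <- mc_es_learn opts q s0 a0 mc
          if diffV (q2v q) (q2v q') < eps
            then pure q'
            else go q'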