Safe Haskell	Safe-Inferred

Bayes.Examples

Description

Examples of networks

Creating a simple network

The example function is the typical example. It uses the BNMonad monad, whose goal is to offer a natural way of describing a network.

There are only three functions to understand inside the monad:

variable to create a discrete variable of type DV. A discrete variable is created from a type that is both Bounded and Enum, for instance Bool.
proba to define the probability P(A) of a variable A
cpt to define the conditional probability table P(A | B C)

It is important to understand how the values are organized. If you define P(wet | sprinkler road), you have to give the values in the following order, where the last variable changes fastest (the list continues in the same way with wet=True):

wet=False, sprinkler=False, road=False
wet=False, sprinkler=False, road=True
wet=False, sprinkler=True, road=False
wet=False, sprinkler=True, road=True
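
The ordering above can be reproduced with a few lines of plain Haskell. The sketch below does not use the library: cptRowOrder and labelledRows are hypothetical helper names introduced for illustration, and it assumes Boolean variables enumerated with False first, as in the rows above.

```haskell
-- Illustrative helpers, not part of the library.

-- All assignments of n Boolean variables, with the first variable
-- changing slowest and the last variable changing fastest.
cptRowOrder :: Int -> [[Bool]]
cptRowOrder n = sequence (replicate n [False, True])

-- Attach variable names to each row, e.g. for P(wet | sprinkler road).
labelledRows :: [String] -> [[(String, Bool)]]
labelledRows names = map (zip names) (cptRowOrder (length names))
```

Evaluating take 4 (labelledRows ["wet", "sprinkler", "road"]) gives the four rows listed above; the remaining four rows of the sketch repeat the same pattern with wet=True.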

Finally, don't forget to return the discrete variables at the end of your network construction because those variables are used for making inferences.

example :: ([TDV Bool],SBN CPT)
example = runBN $ do 
    winter <- variable "winter" (t :: Bool)
    sprinkler <- variable "sprinkler" (t :: Bool) 
    wet <- variable "wet grass" (t :: Bool) 
    rain <- variable "rain" (t :: Bool) 
    road <- variable "slippery road" (t :: Bool) 
--
    proba winter ~~ [0.4,0.6]
    cpt sprinkler [winter] ~~ [0.25,0.8,0.75,0.2]
    cpt rain [winter] ~~ [0.9,0.2,0.1,0.8]
    cpt wet [sprinkler,rain] ~~ [1,0.2,0.1,0.05,0,0.8,0.9,0.95]
    cpt road [rain] ~~ [1,0.3,0,0.7]
    return [winter,sprinkler,rain,wet,road]
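
Because the values are given as a flat list, it is easy to misplace one. The following sketch, in plain Haskell and independent of the library, checks that a list laid out as described above is normalized, that is, that for each configuration of the parents the probabilities of all values of the child sum to 1. The names chunks and isNormalized are hypothetical, and the layout (child variable changing slowest) is assumed from the ordering described earlier.

```haskell
import Data.List (transpose)

-- Illustrative helpers, not part of the library.

-- Split a list into consecutive chunks of a positive length n.
chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks n xs = let (h, rest) = splitAt n xs in h : chunks n rest

-- For a table laid out with the child variable changing slowest,
-- check that for each parent configuration the probabilities of all
-- child values sum to 1. childCard is the number of child values.
isNormalized :: Int -> [Double] -> Bool
isNormalized childCard xs
  | childCard <= 0 || null xs      = False
  | length xs `mod` childCard /= 0 = False
  | otherwise                      = all closeToOne columnSums
  where
    parentConfigs = length xs `div` childCard
    columnSums    = map sum (transpose (chunks parentConfigs xs))
    closeToOne s  = abs (s - 1) < 1e-9
```

For instance, isNormalized 2 [0.25,0.8,0.75,0.2] holds for the sprinkler table above, because 0.25 + 0.75 and 0.8 + 0.2 both equal 1.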

By default, all variables are typed (TDV Bool). TDV means Typed Discrete Variable.

If you mix several types, you need to remove the type information to build the cpt, since a list can't be heterogeneous. Use dv for this: it converts the variable into DV, the type of untyped discrete variables.

Creating truth tables

In practice, it is easy to compute the posterior of a variable because it is always possible to find a cluster containing the variable in the junction tree. It is more difficult to compute the posterior of a logical assertion, or even of a conjunction of assertions.

If a query is likely to be made often, it may be a good idea to add a new node to the Bayesian network to represent it. Some functions to create truth tables are provided for that purpose.

exampleLogical :: ([TDV Bool], SBN CPT)
exampleLogical = runBN $ do 
    a <- variable "a" (t :: Bool)
    b <- variable "b" (t :: Bool)
    notV <- variable "notV" (t :: Bool)
    andV <- variable "andV" (t :: Bool)
    orV <- variable "orV" (t :: Bool)
    let ta = a .==. True 
        tb = b .==. True
    logical notV ((.!.) ta)
    logical andV (ta .&. tb)
    logical orV (ta .|. tb)
    return $ [a,b,notV,andV,orV]

In the previous example, we force a type on the discrete variables DV to avoid future errors in the instantiations. This is done through the tdv function.

It is also possible to use the untyped variables and write:

    logical andV ((a .==. True) .&. (b .==. True))

The goal of a Bayesian network is to factorize a big probability table that the algorithms could not process otherwise. It is therefore not a good idea to represent a complex logical assertion with a single huge probability table: the logical keyword should only be used to build small tables.

If you need to encode a complex logical assertion, use logical several times to build a network representing the assertion instead of building just one node to represent it.
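
One way to picture a truth-table node is as a deterministic conditional probability table whose entries are all 0 or 1. The sketch below builds such a table in plain Haskell, without the library, to show how quickly it grows. truthTable and andTable are hypothetical names, and the row order is assumed to be the one described for cpt; how the library builds its tables internally is not shown here.

```haskell
-- Illustrative helpers, not part of the library.

-- Deterministic table for a node defined by a Boolean function of its
-- n parents. Rows are ordered with the node itself changing slowest,
-- then the parents, the last parent changing fastest. An entry is 1
-- when the node's value agrees with the function and 0 otherwise.
truthTable :: Int -> ([Bool] -> Bool) -> [Double]
truthTable n f =
  [ if v == f parents then 1 else 0
  | v <- [False, True]
  , parents <- sequence (replicate n [False, True])
  ]

-- Table for an AND node over two parents.
andTable :: [Double]
andTable = truthTable 2 and
```

A table over n Boolean parents has 2^(n+1) entries: andTable has 8, while a single node over 10 parents would already need 2048. Splitting the assertion across several small nodes keeps every table small.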

Noisy OR

The Noisy OR is a combination of logical tables (OR) and conditional probability tables which is often used during modeling to avoid generating big conditional probability tables.

It is easy to use:

    no <- noisyOR [(a,0.1),(b,0.2),(c,0.3)]

Each number is the probability that the corresponding variable has no effect, that is, that it is inhibited in the OR.
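
To see what these inhibition probabilities imply, the sketch below computes the probability that the output is True for a given set of active causes. It is plain Haskell and assumes the textbook noisy OR without a leak term; whether the library uses exactly this definition is an assumption, and noisyOrTrue is a hypothetical name.

```haskell
-- Illustrative helper, not part of the library.

-- Probability that the noisy-OR output is True, given for each cause
-- whether it is active and the probability that it is inhibited.
-- Assumes the textbook noisy OR without a leak term: the output is
-- False only when every active cause is inhibited.
noisyOrTrue :: [(Bool, Double)] -> Double
noisyOrTrue causes = 1 - product [ q | (active, q) <- causes, active ]
```

With the inhibition probabilities 0.1, 0.2 and 0.3 and all three causes active, the result is 1 - 0.1 * 0.2 * 0.3 = 0.994. With no active cause the product is empty and the result is 0.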

Importing a network from a Hugin file

The exampleImport function shows how to import a file in Hugin format. Only a subset of the format is supported. The function returns a mapping from node names to discrete variables DV; the node's name is used, not its label. It also returns a simple Bayesian network SBN using CPT as factors.

The implementation uses getDataFileName to find the path of the test file installed by cabal.

exampleImport :: IO (Map.Map String DV,SBN CPT)
exampleImport = do 
    path <- getDataFileName "cancer.net"
    r <- importBayesianGraph path
    return (runBN $ fromJust r)
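
Note that exampleImport uses fromJust, which raises an exception when the import result is Nothing, presumably when the file cannot be parsed. A more defensive variant could turn the Maybe into an Either carrying a message. The sketch below is generic and does not call the library; orImportError is a hypothetical name.

```haskell
-- Illustrative helper, not part of the library.

-- Convert an optional import result into an Either whose error
-- message names the file that could not be imported.
orImportError :: FilePath -> Maybe a -> Either String a
orImportError path = maybe (Left ("could not import " ++ path)) Right
```

The caller can then pattern match on Left and Right instead of letting the program crash on an unsupported file.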

Synopsis