# Multi Expression Programming
[![Build Status](https://travis-ci.org/masterdezign/hmep.svg?branch=master)](https://travis-ci.org/masterdezign/hmep)
You say, not enough Haskell machine learning libraries?
Here is yet another one!
## History
There exist many other Genetic Algorithm (GA) Haskell packages.
Personally I have used
[simple genetic algorithm](http://hackage.haskell.org/package/simple-genetic-algorithm-mr),
[GA](http://hackage.haskell.org/package/GA),
and [moo](http://hackage.haskell.org/package/moo) for quite a long time.
The last package was the most preferred, but the other two are
also great.
However, when I came up with this
[MEP paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.4352&rep=rep1&type=pdf),
to my surprise there was no MEP implementation in Haskell.
Soon I realized that existing GA packages are limited,
and it would be more efficient to implement MEP from scratch.
That is how this package was started. I also wish to say thank you
to the authors of the [moo](http://hackage.haskell.org/package/moo)
GA library, which inspired the present
[hmep](http://github.com/masterdezign/hmep) package.
## About MEP
Multi Expression Programming is a genetic programming variant encoding multiple
solutions in the same chromosome. A chromosome is a computer program.
Each gene is featuring [code reuse](https://en.wikipedia.org/wiki/Code_reuse).
How **MEP is different** from other genetic programming (GP) methods?
Consider a classical example of tree-based GP.
The number of nodes to encode `x^N`
using a binary tree is `2N-1`.
With MEP encoding, however, redundancies can be dramatically
diminished so that the
[shortest chromosome](https://github.com/masterdezign/hmep/blob/cd7b4976800d6c23ce5ebbe67f5ab5c9076229b9/test/Spec.hs#L18)
that encodes the same expression has only `N/2` nodes!
That often results in significantly reduced computational costs
when evaluating MEP chromosomes. Moreover, **all** the intermediate
solutions such as `x^(N/2)`, `x^(N/4)`, etc. are provided by the
chromosome as well.
For more details, please check http://mepx.org/papers.html and
https://en.wikipedia.org/wiki/Multi_expression_programming.
### MEP in open source
* By [Mihai Oltean](http://github.com/mepx), C++
* By [Mark Chenoweth](https://github.com/markcheno/go-mep), Go
* [Current project](https://github.com/masterdezign/hmep), Haskell
### The `hmep` Features
* **Works out of the box**. You may use one of the elaborated
[examples](https://github.com/masterdezign/hmep/blob/master/app/)
to quickly tailor to your needs.
* **Flexibility**. The [`hmep` package](https://github.com/masterdezign/hmep/)
provides adjustable and composable building blocks such as
[selection](https://hackage.haskell.org/package/hmep-0.1.0/docs/src/AI-MEP-Operators.html#binaryTournament),
[mutation](https://hackage.haskell.org/package/hmep-0.1.0/docs/src/AI-MEP-Operators.html#smoothMutation)
and [crossover](https://hackage.haskell.org/package/hmep-0.1.0/docs/src/AI-MEP-Operators.html#crossover)
[operators](https://hackage.haskell.org/package/hmep-0.1.0/docs/AI-MEP.html).
One is also free to use their own operators.
* **Versatility**. `hmep` can be applied to solve regression problems with
one or multiple outputs. It means, you can approximate unknown functions
or solve classification tasks. The only requirement is a custom
[loss function](https://github.com/masterdezign/hmep/blob/b006eb8e0ca7c0540de979631423753bf0b66750/app/Main.hs#L67).
## How to build
Use [Stack](http://haskellstack.org).
$ git clone https://github.com/masterdezign/hmep.git && cd hmep
$ stack build --install-ghc
### Example 1
Now that the package is built, run the first demo to
express `cos^2(x)` through `sin(x)`:
$ stack exec hmep-demo
Average loss in the initial population 15.268705681244962
Population 10: average loss 14.709728527360586
Population 20: average loss 13.497114190675477
Population 30: average loss 8.953185872653737
Population 40: average loss 8.953185872653737
Population 50: average loss 3.3219954564955856e-15
Interpreted expression:
v1 = sin x0
v2 = v1 * v1
result = 1 - v2
Effectively, the solution `cos^2(x) = 1 - sin^2(x)` was found.
Of course, MEP is a stochastic method, meaning that there is
no guarantee to find the globally optimal solution.
The unknown function approximation problem can be illustrated
by the following suboptimal solution for a given set of random
data points (blue crosses). This example was produced by another run of
the [same demo](app/Main.hs), after 100 generations of 100 chromosomes.
The following expression was obtained
`y(x) = 3*0.31248786462471034 - sin(sin^2(x))`.
Interestingly, the approximating function lies symmetrically
in-between the extrema of the unknown function, approximately
described by the blue crosses.
![Figure](https://github.com/masterdezign/hmep/blob/bbc2bdbac4fa3269c506455a473dddfa0e95231c/doc/Figures/cos2_approx.png)
### Example 2
A similar example is to approximate `sin(x)` using only
addition and multiplication operators, i.e. with polynomials.
$ stack exec hmep-sin-approximation
The algorithm is able to automatically figure out the
powers of `x`. That is where MEP really shines. We [calculate](app/MainSin.hs)
30 expressions represented by each chromosome with practically no
additional computational penalty. Then, we
choose the best expression. In this run, we have automatically obtained a
[seventh degree polynomial](https://github.com/masterdezign/hmep/blob/master/doc/sin_approx.py#L12)
coded by 14 genes. Pretty cool, yeah?
The result of approximation is [visualized](doc/sin_approx.py) below:
![Figure](https://github.com/masterdezign/hmep/blob/d173e96acf72e482474e657880f8bd28c40694e7/doc/Figures/sin_approx.png)
## Authors
This library is written and maintained by [Bogdan Penkovsky](http://penkovsky.com)