Safe Haskell  None 

Language  Haskell2010 
Copyright Bogdan Penkovsky (c) 2017
Multiple Expression Programming
Example application: trigonometry cheating
Suppose, you forgot certain trigonometric identities.
For instance, you want to express cos^2(x) using sin(x).
No problem, set the target function cos^2(x) in the dataset
and add sin to the arithmetic set of operators {+,,*,/}
.
See app/Main.hs.
After running
$ stack build && stack exec hmepdemo
We obtain
Average loss in the initial population 15.268705681244962 Population 10: average loss 14.709728527360586 Population 20: average loss 13.497114190675477 Population 30: average loss 8.953185872653737 Population 40: average loss 8.953185872653737 Population 50: average loss 3.3219954564955856e15
The average value of 3.3e15 is close to zero, indicating that the exact expression was found!
The produced output was:
Interpreted expression: v1 = sin x0 v2 = v1 * v1 result = 1  v2
From here we can infer that
cos^2(x) = 1  v2 = 1  v1 * v1 = 1  sin^2(x).
Sweet!
 type Chromosome a = Vector (Gene a Int)
 data Gene a i
 type Population a = [Chromosome a]
 type Phenotype a = (Double, Chromosome a, Vector Int)
 data Config a = Config {}
 defaultConfig :: Config Double
 type LossFunction a = (Vector a > Vector a) > (Vector Int, Double)
 initialize :: PrimMonad m => Config Double > RandT m (Population Double)
 evaluatePopulation :: Num a => LossFunction a > Population a > Generation a
 regressionLoss1 :: (Num result, Ord result) => (b > b > result) > [(a, b)] > (Vector a > Vector b) > (Vector Int, result)
 avgLoss :: Generation Double > Double
 best :: Generation a > Phenotype a
 worst :: Generation a > Phenotype a
 evolve :: PrimMonad m => Config Double > LossFunction Double > (Chromosome Double > RandT m (Chromosome Double)) > (Chromosome Double > Chromosome Double > RandT m (Chromosome Double, Chromosome Double)) > (Generation Double > RandT m (Chromosome Double)) > Generation Double > RandT m (Generation Double)
 binaryTournament :: (PrimMonad m, Ord a) => Generation a > RandT m (Chromosome a)
 crossover :: PrimMonad m => Chromosome a > Chromosome a > RandT m (Chromosome a, Chromosome a)
 mutation3 :: PrimMonad m => Config Double > Chromosome Double > RandT m (Chromosome Double)
 smoothMutation :: PrimMonad m => Double > Config Double > Chromosome Double > RandT m (Chromosome Double)
 newChromosome :: PrimMonad m => Config Double > RandT m (Chromosome Double)
 generateCode :: Phenotype Double > String
 data RandT m a :: (* > *) > * > *
 runRandIO :: RandT IO a > IO a
Documentation
type Chromosome a = Vector (Gene a Int) Source
A chromosome is a vector of genes
Either a terminal symbol or a threeaddress code (a function and two pointers)
type Population a = [Chromosome a] Source
List of chromosomes
type Phenotype a = (Double, Chromosome a, Vector Int) Source
Loss value, chromosome, and the best expression indices vector
MEP configuration
Config  

defaultConfig :: Config Double Source
defaultConfig = Config { p'const = 0.1 , p'var = 0.4 , p'mutation = 0.1 , p'crossover = 0.9 , c'length = 50 , c'popSize = 100 , c'popN = 1 , c'ops = V.empty  < To be overridden , c'vars = 1 }
type LossFunction a = (Vector a > Vector a) > (Vector Int, Double) Source
A function to minimize.
The argument is a vector evaluation function whose input
is a vector (length c'vars
) and ouput is
a vector with a different length c'length
.
The result is a vector of the best indices and a scalar loss value.
Genetic algorithm
initialize :: PrimMonad m => Config Double > RandT m (Population Double) Source
Randomly generate a new population
evaluatePopulation :: Num a => LossFunction a > Population a > Generation a Source
Using LossFunction
, find how fit is each chromosome in the population
:: (Num result, Ord result)  
=> (b > b > result)  Distance function 
> [(a, b)]  Dataset 
> (Vector a > Vector b)  Chromosome evaluation function (partially applied 
> (Vector Int, result) 
Loss function for regression problems with one input and one output. Not normalized with respect to the dataset size.
avgLoss :: Generation Double > Double Source
Average population loss
best :: Generation a > Phenotype a Source
The best phenotype in the generation
worst :: Generation a > Phenotype a Source
The worst phenotype in the generation
:: PrimMonad m  
=> Config Double  Common configuration 
> LossFunction Double  Custom loss function 
> (Chromosome Double > RandT m (Chromosome Double))  Mutation 
> (Chromosome Double > Chromosome Double > RandT m (Chromosome Double, Chromosome Double))  Crossover 
> (Generation Double > RandT m (Chromosome Double))  A chromosome selection algorithm. Does not need to be random, but may be. 
> Generation Double  Evaluated population 
> RandT m (Generation Double)  New generation 
Selection operator that produces the next evaluated population.
Standard algorithm: the best offspring O replaces the worst individual W in the current population if O is better than W.
binaryTournament :: (PrimMonad m, Ord a) => Generation a > RandT m (Chromosome a) Source
Binary tournament selection
crossover :: PrimMonad m => Chromosome a > Chromosome a > RandT m (Chromosome a, Chromosome a) Source
Uniform crossover operator
:: PrimMonad m  
=> Config Double  Common configuration 
> Chromosome Double  
> RandT m (Chromosome Double) 
Mutation operator with up to three mutations per chromosome
:: PrimMonad m  
=> Double  Probability of gene mutation 
> Config Double  Common configuration 
> Chromosome Double  
> RandT m (Chromosome Double) 
Mutation operator with a fixed mutation probability of each gene
:: PrimMonad m  
=> Config Double  Common configuration 
> RandT m (Chromosome Double) 
Randomly initialize a new chromosome. By definition, the first gene is terminal (a constant or a variable).
Expression interpretation
generateCode :: Phenotype Double > String Source
Generate code for the functions with a single output
Random
data RandT m a :: (* > *) > * > *