# What is genetic programming and how genetic programming can evolve mathematical expressions using given data

 Category: Machine Learning Tags: Evolve expression using Genetic Programming Code Files

## Introduction

Genetic programming idea is taken from biology, as name suggest a program can be improved from one generation to another. It actually starts with initial set of programs that may be handpicked or randomly generated then programs compete with each other over given input and expected output data. Top performing programs are picked, mutation and breeding is performed on them to generate next generation. Next generation compete with each other, the process goes on until perfect program is evolved.

Program as tree: In generic programming we build program using nodes, a node can be parent or leaf and program can be interpreted as tree. A parent node may do operations like if, addition, multiplication etc.

## Implementation

We are going to evolve an expression using genetic programming, suppose we have an expression Z = X^2 + 5Y + 1. I am using Jupyter Notebook to implement this. Import below methods.

```from random import random, randint, choice
from copy import deepcopy
from math import log```

Below is the function returns value of the Z

```def expression(x, y):
return x**2+5*y+1```

If we run above function with random values of x and y:

```def builddataset():
rows=[]
for i in range(200):
x=randint(0,40)
y=randint(0,40)
rows.append([x,y,expression(x,y)])
return rows```

Will get data set having values of x, y and z as below:

[[7, 20, 150],

[8, 5, 90],

[18, 9, 370],…]

The problem statement comes when we are given the above data but the expression is unknown and we are asked to evolve an expression which calculates z by substituting x, y. Let’s see how we can evolve the expression using genetic programming. Create a class fwrapper.

```class fwrapper:
def __init__(self, function, paramcount, name):
self.function = function
self.paramcount = paramcount
self.name = name```

Create a class node.

```# Node of the tree
class node:
def __init__(self, fw, children):
self.function = fw.function
self.name = fw.name
self.children = children

def evaluate(self, inp):
results = [n.evaluate(inp) for n in self.children]
return self.function(results)

def display(self, indent=0):
print((' '*indent)+self.name)
for c in self.children:
c.display(indent+1)```

A node consists a function, children and node name. you can see evaluate is a bottom up approach which first evaluates its children the evaluate itself. Two more nodes which does not have function but returns a value.

```# Parameter node returns parameter itself
class paramnode:
def __init__(self, idx):
self.idx = idx

def evaluate(self, inp):
return inp[self.idx]

def display(self, indent=0):
print('%sp%d' % (' '*indent,self.idx))```
```# Constant node returns a constant value
class constnode:
def __init__(self, v):
self.v = v

def evaluate(self, inp):
return self.v

def display(self, indent=0):
print('%s%d' % (' '*indent,self.v))```

paramnode returns one of the parameters passed and constnode returns what constant is passed to it. Lets create a tree, first we create function wrappers using lambda:

```addw = fwrapper(lambda l: l + l, 2, 'add')
subw = fwrapper(lambda l: l - l, 2, 'sub')
mulw = fwrapper(lambda l: l * l, 2, 'mul')```

Or with function definitions if function is complicated

```def iffun(l):
if(l > 0):
return l
else:
return l```
```def isgrater(l):
if(l > l):
return 1
else:
return 0```
```ifw = fwrapper(iffun, 3, 'if')
isgraterw = fwrapper(isgrater, 2, 'isgrater')```

List of function wrappers below:

`flist = [addw, subw, mulw, ifw, isgraterw]`

Now we can create a tree:

```def exampletree():
return node(ifw, [
node(isgraterw, [paramnode(0), constnode(3)]),
node(subw, [paramnode(1), constnode(2)])
])```
`exampletree().evaluate([0,3])`

Output

1

Display the tree:

`exampletree().display()`

Output

if

isgrater

p0

3

p1

5

sub

p1

2

Create a score function which can give difference between expected and actual values calculated by a tree, if function return 0 means the tree has perfect expression which produces correct results for each input.

```def scorefunction(tree, dataset):
difference = 0
for row in dataset:
z = tree.evaluate([row, row])
# Difference betwen actual and evaluated value
difference += abs(z - row)
return difference```

Now we will write a method which produces random tree.

```def makerandomtree(param_count,maxdepth=4,fun_prob=0.5,param_prob=0.6):
if random()<fun_prob and maxdepth>0:
# Choosing a random function
function=choice(flist)
# Random children of node
children=[makerandomtree(param_count,maxdepth-1,fun_prob,param_prob) for i in range(function.paramcount)]
return node(function,children)
elif random()<param_prob:
return paramnode(randint(0,param_count-1))
else:
return constnode(randint(0,10))```

Above method will generate a random tree with maximum depth of 4. Let’s create a random tree and try to generate the score.

```random_tree = makerandomtree(2)
scorefunction(random_tree, builddataset())```

Output

131214

makerandomtree can be used to create initial population, we have to mutate and breed them to build next generations so let’s create mutate and breed methods.

```def mutate(tree,param_count,probchange=0.1):
if random()<probchange:
return makerandomtree(param_count)
else:
result=deepcopy(tree)
if isinstance(tree,node):
# Changing children of input tree(mutating)
result.children=[mutate(child,param_count,probchange) for child in tree.children]
return result```

Mutation is changing a part of tree slightly. Let’s mutate random_tree previously generated and see the difference:

`random_tree.display()`

Output

mul

p0

isgrater

mul

p0

p0

5

sub

p1

p1

isgrater

p0

p1

```muttree=mutate(random_tree,2)
muttree.display()```

Output

mul

p0

isgrater

mul

p0

p0

5

sub

p1

p1

isgrater

p0

2

You can see last node of the mutated tree is different. We can generate it’s score too and if score is less than above one we can say mutated tree is better than original.

```def breed(tree1,tree2,probswap=0.7,top=1):
if random()<probswap and not top:
return deepcopy(tree2)
else:
result=deepcopy(tree1)
if isinstance(tree1,node) and isinstance(tree2,node):
# Replacing tree1 children with tree2
result.children=[breed(child,choice(tree2.children),probswap,0) for child in tree1.children]
return result```

Breeding is taking some nodes from two different trees to make a new one. In above method we take two tree and randomly take nodes from both of the trees to create new one.

Let’s create two random trees, breed them and see the difference.

```tree1 = makerandomtree(2)
tree2 = makerandomtree(2)```
`tree1.display()`

Output

isgrater

if

p0

p0

isgrater

6

7

0

if

sub

3

p0

p0

6

p1

isgrater

p0

isgrater

5

p0

p1

`tree2.display()`

Output

mul

isgrater

sub

9

sub

p1

p1

if

p0

sub

p0

p1

isgrater

p0

p1

0

```newbreed = breed(tree1, tree2)
newbreed.display()```

Output

sub

9

sub

p1

p1

sub

9

sub

p1

p1

0

We can see above tree has nodes from both of the trees. New we will write a function which can rank list of trees on a dataset:

```def getrankfunction(dataset):
def rankfunction(population):
scores=[(scorefunction(t,dataset),t) for t in population]
# Sorting based on score
scores.sort(key=lambda val: val)
return scores
return rankfunction```

The last function “evolve” which make use of all above methods. This method first generate the initial population rank them then mutate and breed the top trees to generate next generation until error score reaches zero or loop ends.

```def evolve(param_countc,population_size,rankfunction,maxgen=500,mutationrate=0.1,breedingrate=0.4,exp_prob=0.7,new_prob=0.05):
# Returns a random number, always lower numbers.
def selectindex():
return int(log(random())/log(exp_prob))
# Create a random initial population
population=[makerandomtree(param_countc) for i in range(population_size)]
for i in range(maxgen):
scores=rankfunction(population)
print(scores)
if scores==0: break
# The two best always make it
newpop=[scores,scores]
# Build the next generation
while len(newpop)<population_size:
if random()>new_prob:
newpop.append(mutate(
breed(scores[selectindex()],
scores[selectindex()],
probswap=breedingrate),
param_countc,probchange=mutationrate))
else:
# Add a random node to mix things up
newpop.append(makerandomtree(param_countc))
population=newpop
scores.display()
return scores```

Let’s run above method to generate tree which represents our expression.

```dataset = builddataset()
rankfun = getrankfunction(dataset)
tree = evolve(2,100,rankfun,mutationrate=0.2,breedingrate=0.1,exp_prob=0.7,new_prob=0.1)```

Output

52714

51097

13335

13335

8344

5967

3232

3232

.

.

5

5

0

p1

p1

p1

1

.....

You can see above when score is reduced to zero tree is finalized and printed. Let’s test our evolved expression against actual expression.

```print("Result of x={}, y={} from actual expression is: {} "
"and from evolved expression is {}".format(25, 19, expression(25, 19), tree.evaluate([25, 19])))
print("Result of x={}, y={} from actual expression is: {} "
"and from evolved expression is {}".format(20, 38, expression(20, 38), tree.evaluate([20, 38])))
print("Result of x={}, y={} from actual expression is: {} "
"and from evolved expression is {}".format(11, 35, expression(11, 35), tree.evaluate([11, 35])))
print("Result of x={}, y={} from actual expression is: {} "
"and from evolved expression is {}".format(9, 34, expression(9, 34), tree.evaluate([9, 34])))```

Output

Result of x=25, y=19 from actual expression is: 721 and from evolved expression is 721

Result of x=20, y=38 from actual expression is: 591 and from evolved expression is 591

Result of x=11, y=35 from actual expression is: 297 and from evolved expression is 297

Result of x=9, y=34 from actual expression is: 252 and from evolved expression is 252

Evolved expression’s output matches with actual expression which means the expression is accurate.

## Conclusion

We often see difference between actual expression and evolved expression because evolved expression is not simplified. Suppose actual expression is (x^2 - y^2) then evolved expression might be ((x-y)(x+y) + 5xy - 4xy - xy) which is actually same as actual. We might not end up having best expression sometimes where output is close but not exactly same, in that case increasing initial population and increase in number of generation increases chances of evolving the best expression. Executing evolve method multiple times might help too.

This was a small example of evolving expression but genetic programming can do more. We can even design games which can actually compete against each other or with a person.

 Like 1 Person Nikhil Joshi Ceo & Founder at Dotnetlovers Atricles: 139 Questions: 12 Given Best Solutions: 12 *

### Reference: Nice Article!

Like 0 People on 17 September 2019 Thank you.

Like 0 People on 17 September 2019 Good article.

Thank you

Like 0 People on 7 October 2019

Login via:   x 