Introduction
Genetic programming idea is taken from biology, as name suggest a program can be improved from one generation to another. It actually starts with initial set of programs that may be handpicked or randomly generated then programs compete with each other over given input and expected output data. Top performing programs are picked, mutation and breeding is performed on them to generate next generation. Next generation compete with each other, the process goes on until perfect program is evolved.

Program as tree: In generic programming we build program using nodes, a node can be parent or leaf and program can be interpreted as tree. A parent node may do operations like if, addition, multiplication etc.

Implementation
We are going to evolve an expression using genetic programming, suppose we have an expression Z = X^2 + 5Y + 1. I am using Jupyter Notebook to implement this. Import below methods.
from random import random, randint, choice from copy import deepcopy from math import log
Below is the function returns value of the Z
def expression(x, y): return x**2+5*y+1
If we run above function with random values of x and y:
def builddataset(): rows=[] for i in range(200): x=randint(0,40) y=randint(0,40) rows.append([x,y,expression(x,y)]) return rows
Will get data set having values of x, y and z as below:
[[7, 20, 150],
[8, 5, 90],
[18, 9, 370],…]
The problem statement comes when we are given the above data but the expression is unknown and we are asked to evolve an expression which calculates z by substituting x, y. Let’s see how we can evolve the expression using genetic programming. Create a class fwrapper.
class fwrapper: def __init__(self, function, paramcount, name): self.function = function self.paramcount = paramcount self.name = name
Create a class node.
# Node of the tree class node: def __init__(self, fw, children): self.function = fw.function self.name = fw.name self.children = children def evaluate(self, inp): results = [n.evaluate(inp) for n in self.children] return self.function(results) def display(self, indent=0): print((' '*indent)+self.name) for c in self.children: c.display(indent+1)
A node consists a function, children and node name. you can see evaluate is a bottom up approach which first evaluates its children the evaluate itself. Two more nodes which does not have function but returns a value.
# Parameter node returns parameter itself class paramnode: def __init__(self, idx): self.idx = idx def evaluate(self, inp): return inp[self.idx] def display(self, indent=0): print('%sp%d' % (' '*indent,self.idx))
# Constant node returns a constant value class constnode: def __init__(self, v): self.v = v def evaluate(self, inp): return self.v def display(self, indent=0): print('%s%d' % (' '*indent,self.v))
paramnode returns one of the parameters passed and constnode returns what constant is passed to it. Lets create a tree, first we create function wrappers using lambda:
addw = fwrapper(lambda l: l[0] + l[1], 2, 'add') subw = fwrapper(lambda l: l[0] - l[1], 2, 'sub') mulw = fwrapper(lambda l: l[0] * l[1], 2, 'mul')
Or with function definitions if function is complicated
def iffun(l): if(l[0] > 0): return l[1] else: return l[2]
def isgrater(l): if(l[0] > l[1]): return 1 else: return 0
ifw = fwrapper(iffun, 3, 'if') isgraterw = fwrapper(isgrater, 2, 'isgrater')
List of function wrappers below:
flist = [addw, subw, mulw, ifw, isgraterw]
Now we can create a tree:
def exampletree(): return node(ifw, [ node(isgraterw, [paramnode(0), constnode(3)]), node(addw, [paramnode(1), constnode(5)]), node(subw, [paramnode(1), constnode(2)]) ])
exampletree().evaluate([0,3])
Output
1
Display the tree:
exampletree().display()
Output
if
isgrater
p0
3
add
p1
5
sub
p1
2
Create a score function which can give difference between expected and actual values calculated by a tree, if function return 0 means the tree has perfect expression which produces correct results for each input.
def scorefunction(tree, dataset): difference = 0 for row in dataset: z = tree.evaluate([row[0], row[1]]) # Difference betwen actual and evaluated value difference += abs(z - row[2]) return difference
Now we will write a method which produces random tree.
def makerandomtree(param_count,maxdepth=4,fun_prob=0.5,param_prob=0.6): if random()<fun_prob and maxdepth>0: # Choosing a random function function=choice(flist) # Random children of node children=[makerandomtree(param_count,maxdepth-1,fun_prob,param_prob) for i in range(function.paramcount)] return node(function,children) elif random()<param_prob: return paramnode(randint(0,param_count-1)) else: return constnode(randint(0,10))
Above method will generate a random tree with maximum depth of 4. Let’s create a random tree and try to generate the score.
random_tree = makerandomtree(2) scorefunction(random_tree, builddataset())
Output
131214
makerandomtree can be used to create initial population, we have to mutate and breed them to build next generations so let’s create mutate and breed methods.
def mutate(tree,param_count,probchange=0.1): if random()<probchange: return makerandomtree(param_count) else: result=deepcopy(tree) if isinstance(tree,node): # Changing children of input tree(mutating) result.children=[mutate(child,param_count,probchange) for child in tree.children] return result
Mutation is changing a part of tree slightly. Let’s mutate random_tree previously generated and see the difference:
random_tree.display()
Output
mul
p0
add
isgrater
mul
p0
p0
5
add
sub
p1
p1
isgrater
p0
p1
muttree=mutate(random_tree,2) muttree.display()
Output
mul
p0
add
isgrater
mul
p0
p0
5
add
sub
p1
p1
isgrater
p0
2
You can see last node of the mutated tree is different. We can generate it’s score too and if score is less than above one we can say mutated tree is better than original.
def breed(tree1,tree2,probswap=0.7,top=1): if random()<probswap and not top: return deepcopy(tree2) else: result=deepcopy(tree1) if isinstance(tree1,node) and isinstance(tree2,node): # Replacing tree1 children with tree2 result.children=[breed(child,choice(tree2.children),probswap,0) for child in tree1.children] return result
Breeding is taking some nodes from two different trees to make a new one. In above method we take two tree and randomly take nodes from both of the trees to create new one.
Let’s create two random trees, breed them and see the difference.
tree1 = makerandomtree(2) tree2 = makerandomtree(2)
tree1.display()
Output
isgrater
add
if
add
p0
p0
isgrater
6
7
0
if
sub
3
p0
p0
add
6
p1
isgrater
add
p0
isgrater
5
p0
p1
tree2.display()
Output
mul
isgrater
sub
9
sub
p1
p1
if
p0
sub
p0
p1
isgrater
p0
p1
0
newbreed = breed(tree1, tree2) newbreed.display()
Output
add
sub
9
sub
p1
p1
sub
9
sub
p1
p1
0
We can see above tree has nodes from both of the trees. New we will write a function which can rank list of trees on a dataset:
def getrankfunction(dataset): def rankfunction(population): scores=[(scorefunction(t,dataset),t) for t in population] # Sorting based on score scores.sort(key=lambda val: val[0]) return scores return rankfunction
The last function “evolve” which make use of all above methods. This method first generate the initial population rank them then mutate and breed the top trees to generate next generation until error score reaches zero or loop ends.
def evolve(param_countc,population_size,rankfunction,maxgen=500,mutationrate=0.1,breedingrate=0.4,exp_prob=0.7,new_prob=0.05): # Returns a random number, always lower numbers. def selectindex(): return int(log(random())/log(exp_prob)) # Create a random initial population population=[makerandomtree(param_countc) for i in range(population_size)] for i in range(maxgen): scores=rankfunction(population) print(scores[0][0]) if scores[0][0]==0: break # The two best always make it newpop=[scores[0][1],scores[1][1]] # Build the next generation while len(newpop)<population_size: if random()>new_prob: newpop.append(mutate( breed(scores[selectindex()][1], scores[selectindex()][1], probswap=breedingrate), param_countc,probchange=mutationrate)) else: # Add a random node to mix things up newpop.append(makerandomtree(param_countc)) population=newpop scores[0][1].display() return scores[0][1]
Let’s run above method to generate tree which represents our expression.
dataset = builddataset() rankfun = getrankfunction(dataset) tree = evolve(2,100,rankfun,mutationrate=0.2,breedingrate=0.1,exp_prob=0.7,new_prob=0.1)
Output
52714
51097
13335
13335
8344
5967
3232
3232
.
.
5
5
0
add
add
add
p1
p1
add
p1
1
.....
You can see above when score is reduced to zero tree is finalized and printed. Let’s test our evolved expression against actual expression.
print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(25, 19, expression(25, 19), tree.evaluate([25, 19]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(20, 38, expression(20, 38), tree.evaluate([20, 38]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(11, 35, expression(11, 35), tree.evaluate([11, 35]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(9, 34, expression(9, 34), tree.evaluate([9, 34])))
Output
Result of x=25, y=19 from actual expression is: 721 and from evolved expression is 721
Result of x=20, y=38 from actual expression is: 591 and from evolved expression is 591
Result of x=11, y=35 from actual expression is: 297 and from evolved expression is 297
Result of x=9, y=34 from actual expression is: 252 and from evolved expression is 252
Evolved expression’s output matches with actual expression which means the expression is accurate.
Conclusion
We often see difference between actual expression and evolved expression because evolved expression is not simplified. Suppose actual expression is (x^2 - y^2) then evolved expression might be ((x-y)(x+y) + 5xy - 4xy - xy) which is actually same as actual. We might not end up having best expression sometimes where output is close but not exactly same, in that case increasing initial population and increase in number of generation increases chances of evolving the best expression. Executing evolve method multiple times might help too.
This was a small example of evolving expression but genetic programming can do more. We can even design games which can actually compete against each other or with a person.
Comments:
Nice Article!
Thank you.
Good article.
Thank you