## Introduction

Genetic programming idea is taken from biology, as name suggest a program can be improved from one generation to another. It actually starts with initial set of programs that may be handpicked or randomly generated then programs compete with each other over given input and expected output data. Top performing programs are picked, mutation and breeding is performed on them to generate next generation. Next generation compete with each other, the process goes on until perfect program is evolved.

**Program as tree:** In generic programming we build program using nodes, a node can be parent or leaf and program can be interpreted as tree. A parent node may do operations like if, addition, multiplication etc.

## Implementation

We are going to evolve an expression using genetic programming, suppose we have an expression Z = X^2 + 5Y + 1. **I am using Jupyter Notebook to implement this**. Import below methods.

from random import random, randint, choice from copy import deepcopy from math import log

Below is the function returns value of the Z

def expression(x, y): return x**2+5*y+1

If we run above function with random values of x and y:

def builddataset(): rows=[] for i in range(200): x=randint(0,40) y=randint(0,40) rows.append([x,y,expression(x,y)]) return rows

Will get data set having values of x, y and z as below:

[[7, 20, 150],

[8, 5, 90],

[18, 9, 370],…]

The problem statement comes when we are given the above data but the expression is unknown and we are asked to evolve an expression which calculates z by substituting x, y. Let’s see how we can evolve the expression using genetic programming. Create a class *fwrapper*.

class fwrapper: def __init__(self, function, paramcount, name): self.function = function self.paramcount = paramcount self.name = name

Create a class node.

# Node of the tree class node: def __init__(self, fw, children): self.function = fw.function self.name = fw.name self.children = children def evaluate(self, inp): results = [n.evaluate(inp) for n in self.children] return self.function(results) def display(self, indent=0): print((' '*indent)+self.name) for c in self.children: c.display(indent+1)

A node consists a function, children and node name. you can see evaluate is a bottom up approach which first evaluates its children the evaluate itself. Two more nodes which does not have function but returns a value.

# Parameter node returns parameter itself class paramnode: def __init__(self, idx): self.idx = idx def evaluate(self, inp): return inp[self.idx] def display(self, indent=0): print('%sp%d' % (' '*indent,self.idx))

# Constant node returns a constant value class constnode: def __init__(self, v): self.v = v def evaluate(self, inp): return self.v def display(self, indent=0): print('%s%d' % (' '*indent,self.v))

*paramnode* returns one of the parameters passed and *constnode* returns what constant is passed to it. Lets create a tree, first we create function wrappers using lambda:

addw = fwrapper(lambda l: l[0] + l[1], 2, 'add') subw = fwrapper(lambda l: l[0] - l[1], 2, 'sub') mulw = fwrapper(lambda l: l[0] * l[1], 2, 'mul')

Or with function definitions if function is complicated

def iffun(l): if(l[0] > 0): return l[1] else: return l[2]

def isgrater(l): if(l[0] > l[1]): return 1 else: return 0

ifw = fwrapper(iffun, 3, 'if') isgraterw = fwrapper(isgrater, 2, 'isgrater')

List of function wrappers below:

flist = [addw, subw, mulw, ifw, isgraterw]

Now we can create a tree:

def exampletree(): return node(ifw, [ node(isgraterw, [paramnode(0), constnode(3)]), node(addw, [paramnode(1), constnode(5)]), node(subw, [paramnode(1), constnode(2)]) ])

exampletree().evaluate([0,3])

**Output**

1

Display the tree:

exampletree().display()

**Output**

if

isgrater

p0

3

add

p1

5

sub

p1

2

Create a score function which can give difference between expected and actual values calculated by a tree, if function return 0 means the tree has perfect expression which produces correct results for each input.

def scorefunction(tree, dataset): difference = 0 for row in dataset: z = tree.evaluate([row[0], row[1]]) # Difference betwen actual and evaluated value difference += abs(z - row[2]) return difference

Now we will write a method which produces random tree.

def makerandomtree(param_count,maxdepth=4,fun_prob=0.5,param_prob=0.6): if random()<fun_prob and maxdepth>0: # Choosing a random function function=choice(flist) # Random children of node children=[makerandomtree(param_count,maxdepth-1,fun_prob,param_prob) for i in range(function.paramcount)] return node(function,children) elif random()<param_prob: return paramnode(randint(0,param_count-1)) else: return constnode(randint(0,10))

Above method will generate a random tree with maximum depth of 4. Let’s create a random tree and try to generate the score.

random_tree = makerandomtree(2) scorefunction(random_tree, builddataset())

**Output**

131214

*makerandomtree* can be used to create initial population, we have to mutate and breed them to build next generations so let’s create mutate and breed methods.

def mutate(tree,param_count,probchange=0.1): if random()<probchange: return makerandomtree(param_count) else: result=deepcopy(tree) if isinstance(tree,node): # Changing children of input tree(mutating) result.children=[mutate(child,param_count,probchange) for child in tree.children] return result

**Mutation is changing a part of tree slightly**. Let’s mutate *random_tree* previously generated and see the difference:

random_tree.display()

**Output**

mul

p0

add

isgrater

mul

p0

p0

5

add

sub

p1

p1

isgrater

p0

p1

muttree=mutate(random_tree,2) muttree.display()

**Output**

mul

p0

add

isgrater

mul

p0

p0

5

add

sub

p1

p1

isgrater

p0

2

You can see last node of the mutated tree is different. We can generate it’s score too and if score is less than above one we can say mutated tree is better than original.

def breed(tree1,tree2,probswap=0.7,top=1): if random()<probswap and not top: return deepcopy(tree2) else: result=deepcopy(tree1) if isinstance(tree1,node) and isinstance(tree2,node): # Replacing tree1 children with tree2 result.children=[breed(child,choice(tree2.children),probswap,0) for child in tree1.children] return result

**Breeding is taking some nodes from two different trees to make a new one**. In above method we take two tree and randomly take nodes from both of the trees to create new one.

Let’s create two random trees, breed them and see the difference.

tree1 = makerandomtree(2) tree2 = makerandomtree(2)

tree1.display()

**Output**

isgrater

add

if

add

p0

p0

isgrater

6

7

0

if

sub

3

p0

p0

add

6

p1

isgrater

add

p0

isgrater

5

p0

p1

tree2.display()

**Output**

mul

isgrater

sub

9

sub

p1

p1

if

p0

sub

p0

p1

isgrater

p0

p1

0

newbreed = breed(tree1, tree2) newbreed.display()

**Output**

add

sub

9

sub

p1

p1

sub

9

sub

p1

p1

0

We can see above tree has nodes from both of the trees. New we will write a function which can rank list of trees on a dataset:

def getrankfunction(dataset): def rankfunction(population): scores=[(scorefunction(t,dataset),t) for t in population] # Sorting based on score scores.sort(key=lambda val: val[0]) return scores return rankfunction

The last function “evolve” which make use of all above methods. This method first generate the initial population rank them then mutate and breed the top trees to generate next generation until error score reaches zero or loop ends.

def evolve(param_countc,population_size,rankfunction,maxgen=500,mutationrate=0.1,breedingrate=0.4,exp_prob=0.7,new_prob=0.05): # Returns a random number, always lower numbers. def selectindex(): return int(log(random())/log(exp_prob)) # Create a random initial population population=[makerandomtree(param_countc) for i in range(population_size)] for i in range(maxgen): scores=rankfunction(population) print(scores[0][0]) if scores[0][0]==0: break # The two best always make it newpop=[scores[0][1],scores[1][1]] # Build the next generation while len(newpop)<population_size: if random()>new_prob: newpop.append(mutate( breed(scores[selectindex()][1], scores[selectindex()][1], probswap=breedingrate), param_countc,probchange=mutationrate)) else: # Add a random node to mix things up newpop.append(makerandomtree(param_countc)) population=newpop scores[0][1].display() return scores[0][1]

Let’s run above method to generate tree which represents our expression.

dataset = builddataset() rankfun = getrankfunction(dataset) tree = evolve(2,100,rankfun,mutationrate=0.2,breedingrate=0.1,exp_prob=0.7,new_prob=0.1)

**Output**

52714

51097

13335

13335

8344

5967

3232

3232

.

.

5

5

**0**

add

add

add

p1

p1

add

p1

1

.....

You can see above when score is reduced to zero tree is finalized and printed. Let’s test our evolved expression against actual expression.

print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(25, 19, expression(25, 19), tree.evaluate([25, 19]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(20, 38, expression(20, 38), tree.evaluate([20, 38]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(11, 35, expression(11, 35), tree.evaluate([11, 35]))) print("Result of x={}, y={} from actual expression is: {} " "and from evolved expression is {}".format(9, 34, expression(9, 34), tree.evaluate([9, 34])))

**Output**

Result of x=25, y=19 from actual expression is: 721 and from evolved expression is 721

Result of x=20, y=38 from actual expression is: 591 and from evolved expression is 591

Result of x=11, y=35 from actual expression is: 297 and from evolved expression is 297

Result of x=9, y=34 from actual expression is: 252 and from evolved expression is 252

Evolved expression’s output matches with actual expression which means the expression is accurate.

## Conclusion

We often see difference between actual expression and evolved expression because evolved expression is not simplified. Suppose actual expression is (x^2 - y^2) then evolved expression might be ((x-y)(x+y) + 5xy - 4xy - xy) which is actually same as actual. We might not end up having best expression sometimes where output is close but not exactly same, in that case increasing initial population and increase in number of generation increases chances of evolving the best expression. Executing evolve method multiple times might help too.

This was a small example of evolving expression but genetic programming can do more. We can even design games which can actually compete against each other or with a person.

## Comments:

ManjuNice Article!

Nikhil JoshiThank you.

NitinGood article.

Thank you