# A small intro about the rationale

The “hot spot”, the component that consumes most of the CPU time in Genetic Programming today, is the evaluation of each individual to calculate the fitness of its program tree. This evaluation is usually run once for every set of parameters in the “training” set. Suppose you want to do a symbolic regression of a single expression such as the Pythagorean theorem, with a linear space of parameters from 1.0 to 1000.0 in steps of 0.1: that is roughly 10,000 evaluations for each individual (program tree) of your population!
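To make that cost concrete, here is a minimal sketch that counts the points in the linear space and scales the count by a run size. The helper name `training_points` and the population and generation figures are assumptions chosen only for illustration; they are not taken from any particular engine.

```python
def training_points(start, stop, step):
    """Number of samples in the inclusive range [start, stop] for a given step."""
    return int(round((stop - start) / step)) + 1

# One evaluation of the program tree per point of the training set.
evaluations_per_individual = training_points(1.0, 1000.0, 0.1)

# Hypothetical run size, used only to show how quickly the cost grows.
population_size = 800
generations = 20
total_evaluations = evaluations_per_individual * population_size * generations

print(evaluations_per_individual)
print(total_evaluations)
```

With these assumed figures a single run already needs on the order of 160 million tree evaluations, which is why the evaluation step is the natural target for a JIT.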

# Genetic Programming over the Python AST itself

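During the development of Shine, it occurred to me that I could use a restricted Python Abstract Syntax Tree (AST) as the representation of individuals in a Genetic Programming engine. The main advantages of doing so are flexibility and the possibility of reusing a lot of things. Of course, a shared library written in C/C++ would be useful for many Genetic Programming engines that don't use Python, but since my spare time to work on this is becoming more and more rare, I started to rethink the approach and use Python together with the Python bindings for LLVM (LLVMPY). I then discovered that it is very easy to use LLVM to JIT a restricted set of the Python AST to native code, and that is exactly what this post is going to show.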

# JIT'ing a restricted Python AST

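The most amazing parts of LLVM are obviously the number of transformation passes, the JIT, and of course the ability to use the entire framework through a simple API (OK, not so simple sometimes). To simplify this example, I'm going to use an arbitrarily restricted subset of the Python AST that supports only subtraction (-), addition (+), multiplication (*) and division (/).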

    >>> import ast
    >>> astp = ast.parse("2*7")
    >>> ast.dump(astp)
    'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Mult(), right=Num(n=7)))])'
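Before handing a tree to a JIT it can be useful to check that it really stays inside the restricted subset. The following is a minimal sketch using only the standard `ast` module; the function name `is_restricted` is an assumption, and it targets Python 3.8+, where numeric literals appear as `ast.Constant` rather than the `Num` nodes shown in the dump above.

```python
import ast

# The only binary operators allowed in the restricted subset.
ALLOWED_OPS = (ast.Add, ast.Sub, ast.Mult, ast.Div)

def is_restricted(expr, parameters=()):
    """Return True if expr uses only integers, known names and + - * /."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, (ast.Expression, ast.Load)):
            continue  # wrapper node and variable-load context
        if isinstance(node, ALLOWED_OPS):
            continue  # the operator objects themselves
        if isinstance(node, ast.BinOp) and isinstance(node.op, ALLOWED_OPS):
            continue
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, int)
                and not isinstance(node.value, bool)):
            continue  # integer literals only
        if isinstance(node, ast.Name) and node.id in parameters:
            continue  # variables must be declared parameters
        return False
    return True

print(is_restricted("2*7"))
print(is_restricted("(2+3*b)/2", parameters=["b"]))
print(is_restricted("2**7"))
```

Rejecting anything outside the whitelist keeps the visitor simple, since it then only needs handlers for names, numbers and binary operations.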

    # Import the ast module and the LLVM Python bindings
    import ast
    from llvm.core import *
    from llvm.ee import *
    import llvm.passes as lp

    class AstJit(ast.NodeVisitor):
        def __init__(self):
            pass

What we need to do now is create an initialization method that keeps the last state of the JIT visitor. This is needed because we are going to JIT the content of the Python AST into a function, and the last instruction of that function needs to return the result of the last instruction visited by the JIT. We also need to receive an LLVM Module object, in which our function will be created, as well as the closure type. For the sake of simplicity I'm not typing any objects; I'm just assuming that all numbers in the expression are integers, so the closure type will be the LLVM integer type.

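        def __init__(self, module, parameters):
            self.last_state = None
            self.module = module
            # Parameters that will be created on the IR function
            self.parameters = parameters
            self.closure_type = Type.int()
            # An attribute to hold a link to the created function,
            # so we can use it to JIT later
            self.func_obj = None
            self._create_builder()

        def _create_builder(self):
            # One integer-typed entry per parameter
            params = [self.closure_type] * len(self.parameters)
            # The prototype of the function: it returns an integer
            # and receives the integer parameters
            ty_func = Type.function(self.closure_type, params)
            # Add the function to the module with the name 'func_ast_jit'
            self.func_obj = self.module.add_function(ty_func, 'func_ast_jit')
            # Name each function argument after the parameter specified
            for index, pname in enumerate(self.parameters):
                self.func_obj.args[index].name = pname
            # Create a basic block and the builder
            bb = self.func_obj.append_basic_block("entry")
            self.builder = Builder.new(bb)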

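        # A 'Name' is the node produced in the AST when you
        # access a variable; in '2+x+y', 'x' and 'y' are
        # the two Name nodes created on the AST for the expression.
        def visit_Name(self, node):
            # Is this variable a function parameter?
            index = self.parameters.index(node.id)
            self.last_state = self.func_obj.args[index]
            return self.last_state

        # Here we create an LLVM IR integer constant from the
        # Num node; in the expression '2+3' you'll have two
        # Num nodes, Num(n=2) and Num(n=3).
        def visit_Num(self, node):
            self.last_state = Constant.int(self.closure_type, node.n)
            return self.last_state

        # The visitor for the binary operation
        def visit_BinOp(self, node):
            # Get the operation and the left and right arguments
            lhs = self.visit(node.left)
            rhs = self.visit(node.right)
            op = node.op
            # Convert each operation (Sub, Add, Mult, Div) to its
            # LLVM IR integer instruction equivalent
            if isinstance(op, ast.Sub):
                op = self.builder.sub(lhs, rhs, 'sub_t')
            elif isinstance(op, ast.Add):
                op = self.builder.add(lhs, rhs, 'add_t')
            elif isinstance(op, ast.Mult):
                op = self.builder.mul(lhs, rhs, 'mul_t')
            elif isinstance(op, ast.Div):
                op = self.builder.sdiv(lhs, rhs, 'sdiv_t')
            self.last_state = op
            return self.last_state

        # Build the return (ret) statement with the last state
        def build_return(self):
            self.builder.ret(self.last_state)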

And that is it: our visitor is ready to convert a Python AST into LLVM IR. To run it, we'll first create an LLVM module and an expression:

    module = Module.new('ast_jit_module')
    # Note that I'm using two variables, 'a' and 'b'
    expr = "(2+3*b+33*(10/2)+1+3/3+a)/2"
    node = ast.parse(expr)
    print ast.dump(node)

    Module(body=[Expr(value=BinOp(left=BinOp(left=BinOp(left=BinOp(left=BinOp(left=BinOp(left=Num(n=2), op=Add(), right=BinOp(left=Num(n=3), op=Mult(), right=Name(id='b', ctx=Load()))), op=Add(), right=BinOp(left=Num(n=33), op=Mult(), right=Num(n=2))), op=Add(), right=Num(n=1)), op=Add(), right=Num(n=3)), op=Add(), right=Name(id='a', ctx=Load())), op=Div(), right=Num(n=2)))])

    visitor = AstJit(module, ['a', 'b'])
    visitor.visit(node)
    visitor.build_return()
    print module

    ; ModuleID = 'ast_jit_module'

    define i32 @func_ast_jit(i32 %a, i32 %b) {
    entry:
      %mul_t = mul i32 3, %b
      %add_t = add i32 2, %mul_t
      %add_t1 = add i32 %add_t, 165
      %add_t2 = add i32 %add_t1, 1
      %add_t3 = add i32 %add_t2, 1
      %add_t4 = add i32 %add_t3, %a
      %sdiv_t = sdiv i32 %add_t4, 2
      ret i32 %sdiv_t
    }

    pmb = lp.PassManagerBuilder.new()
    # Optimization level
    pmb.opt_level = 2
    pm = lp.PassManager.new()
    pmb.populate(pm)
    # Run the passes on the module
    pm.run(module)
    print module

    ; ModuleID = 'ast_jit_module'

    define i32 @func_ast_jit(i32 %a, i32 %b) nounwind readnone {
    entry:
      %mul_t = mul i32 %b, 3
      %add_t3 = add i32 %a, 169
      %add_t4 = add i32 %add_t3, %mul_t
      %sdiv_t = sdiv i32 %add_t4, 2
      ret i32 %sdiv_t
    }

    ee = ExecutionEngine.new(module)
    arg_a = GenericValue.int(Type.int(), 100)
    arg_b = GenericValue.int(Type.int(), 42)
    retval = ee.run_function(visitor.func_obj, [arg_a, arg_b])
    print "Return: %d" % retval.as_int()

    Return: 197
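The returned value can be cross-checked without LLVM. The sketch below mirrors the structure of the JIT visitor (a `last_state` attribute plus `visit_Name`, a constant handler and `visit_BinOp`), but computes plain integers instead of emitting IR. The names `AstEval`, `sdiv` and `evaluate` are assumptions made for this illustration, and it uses `visit_Constant` because Python 3.8+ no longer produces `Num` nodes.

```python
import ast

def sdiv(lhs, rhs):
    """Signed integer division truncating toward zero, like LLVM's sdiv."""
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs < 0) == (rhs < 0) else -quotient

class AstEval(ast.NodeVisitor):
    """Evaluates the restricted AST with integer semantics."""

    def __init__(self, arguments):
        self.arguments = arguments
        self.last_state = None

    def visit_Name(self, node):
        # Variables are looked up among the supplied arguments.
        self.last_state = self.arguments[node.id]
        return self.last_state

    def visit_Constant(self, node):
        self.last_state = node.value
        return self.last_state

    def visit_BinOp(self, node):
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        if isinstance(node.op, ast.Sub):
            result = lhs - rhs
        elif isinstance(node.op, ast.Add):
            result = lhs + rhs
        elif isinstance(node.op, ast.Mult):
            result = lhs * rhs
        elif isinstance(node.op, ast.Div):
            result = sdiv(lhs, rhs)
        else:
            raise NotImplementedError(type(node.op).__name__)
        self.last_state = result
        return result

def evaluate(expr, **arguments):
    visitor = AstEval(arguments)
    visitor.visit(ast.parse(expr))
    # The root BinOp is the last node to finish, so its value is the result.
    return visitor.last_state

print(evaluate("(2+3*b+33*(10/2)+1+3/3+a)/2", a=100, b=42))
```

Because the root operation is the last one to complete, `last_state` ends up holding the value of the whole expression, which is the same reason the JIT visitor can simply return its last state.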

## Invite: PyCon US 2011 - Genetic Programming in Python

Did you know you can create and evolve programs that find solutions to problems? This talk walks through how to use Genetic Algorithms and Genetic Programming as tools to discover solutions to hard problems, when to use GA/GP, setting up the GA/GP environment, and interpreting the results. Using pyevolve, we’ll walk through a real-world implementation creating a GP that predicts the weather.

(…)

GA/GP have been used in problem domains as diverse as scheduling, database index optimization, circuit board layout, mirror and lens design, game strategies, and robotic walking and swimming. They can also be a lot of fun, and have been used to evolve aesthetically pleasing artwork and melodies, and to approximate pictures or paintings using polygons.

GA/GP is fun to play with because often-times an unexpected solution will be created that will give new insight or knowledge. It might also present a novel solution to a problem, one that a human may never generate. Solutions may also be inscrutable, and determining why a solution works is interesting in itself.

## Successful pyevolve multiprocessing speedup for Genetic Programming

    from pyevolve import *
    import math

    rmse_accum = Util.ErrorAccumulator()

    def gp_add(a, b): return a+b
    def gp_sub(a, b): return a-b
    def gp_mul(a, b): return a*b
    def gp_sqrt(a):   return math.sqrt(abs(a))

    def eval_func(chromosome):
        global rmse_accum
        rmse_accum.reset()
        code_comp = chromosome.getCompiledCode()

        for a in xrange(0, 80):
            for b in xrange(0, 80):
                evaluated = eval(code_comp)
                target = math.sqrt((a*a)+(b*b))
                rmse_accum += (target, evaluated)
        return rmse_accum.getRMSE()

    def main_run():
        genome = GTree.GTreeGP()
        genome.setParams(max_depth=4, method="ramped")
        genome.evaluator += eval_func
        genome.mutator.set(Mutators.GTreeGPMutatorSubtree)

        ga = GSimpleGA.GSimpleGA(genome, seed=666)
        ga.setParams(gp_terminals = ['a', 'b'],
                     gp_function_prefix = "gp")

        ga.setMinimax(Consts.minimaxType["minimize"])
        ga.setGenerations(20)
        ga.setCrossoverRate(1.0)
        ga.setMutationRate(0.08)
        ga.setPopulationSize(800)
        ga.setMultiProcessing(True)

        ga(freq_stats=5)
        best = ga.bestIndividual()

    if __name__ == "__main__":
        main_run()
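The fitness function above evaluates every candidate on an 80 x 80 grid and accumulates the error against the Pythagorean target. The sketch below reproduces that loop in plain Python 3 without pyevolve, on the assumption that the accumulator computes an ordinary root-mean-square error. The helper `rmse_of` and the candidate strings are hypothetical and serve only to show how much work a single individual costs.

```python
import math

def gp_add(a, b): return a + b
def gp_sub(a, b): return a - b
def gp_mul(a, b): return a * b
def gp_sqrt(a):   return math.sqrt(abs(a))

# Functions visible to the candidate expressions.
GP_FUNCTIONS = {"gp_add": gp_add, "gp_sub": gp_sub,
                "gp_mul": gp_mul, "gp_sqrt": gp_sqrt}

def rmse_of(expression, size=80):
    """Return (RMSE, number of evaluations) for one candidate expression."""
    code = compile(expression, "<candidate>", "eval")
    squared_error = 0.0
    count = 0
    for a in range(size):
        for b in range(size):
            evaluated = eval(code, dict(GP_FUNCTIONS), {"a": a, "b": b})
            target = math.sqrt((a * a) + (b * b))
            squared_error += (target - evaluated) ** 2
            count += 1
    return math.sqrt(squared_error / count), count

exact, evaluations = rmse_of("gp_sqrt(gp_add(gp_mul(a, a), gp_mul(b, b)))")
rough, _ = rmse_of("gp_add(a, b)")
print(exact, rough, evaluations)
```

Each individual costs 6,400 expression evaluations here, and with a population of 800 that is repeated for every individual in every generation, which is the workload that multiprocessing spreads across cores.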

## Approximating the number Pi using Genetic Programming

    Best (0): 3.1577998365
        Error: 0.0162071829
    Best (10): 3.1417973679
        Error: 0.0002047143
    Best (20): 3.1417973679
        Error: 0.0002047143
    Best (30): 3.1417973679
        Error: 0.0002047143
    Best (40): 3.1416185511
        Error: 0.0000258975

    - GenomeBase
        Score: 0.000026
        Fitness: 15751.020831

        Params: {'max_depth': 8, 'method': 'ramped'}

        Slot [Evaluator] (Count: 1)
        Slot [Initializator] (Count: 1)
            Name: GTreeGPInitializator - Weight: 0.50
            Doc: This initializator accepts the follow parameters:

                *max_depth*
                    The max depth of the tree

                *method*
                    The method, accepts "grow" or "full"

                .. versionadded:: 0.6
                    The *GTreeGPInitializator* function.

        Slot [Mutator] (Count: 1)
            Name: GTreeGPMutatorSubtree - Weight: 0.50
            Doc: The mutator of GTreeGP, Subtree Mutator

                .. versionadded:: 0.6
                    The *GTreeGPMutatorSubtree* function

        Slot [Crossover] (Count: 1)
            Name: GTreeGPCrossoverSinglePoint - Weight: 0.50

    - GTree
        Height: 8
        Nodes: 21

        GTreeNodeBase [Childs=1] - [gp_sqrt]
          GTreeNodeBase [Childs=2] - [gp_div]
            GTreeNodeBase [Childs=2] - [gp_add]
              GTreeNodeBase [Childs=0] - [26]
              GTreeNodeBase [Childs=2] - [gp_div]
                GTreeNodeBase [Childs=2] - [gp_mul]
                  GTreeNodeBase [Childs=2] - [gp_add]
                    GTreeNodeBase [Childs=2] - [gp_sub]
                      GTreeNodeBase [Childs=0] - [34]
                      GTreeNodeBase [Childs=2] - [gp_sub]
                        GTreeNodeBase [Childs=0] - [44]
                        GTreeNodeBase [Childs=0] - [1]
                    GTreeNodeBase [Childs=2] - [gp_mul]
                      GTreeNodeBase [Childs=0] - [49]
                      GTreeNodeBase [Childs=0] - [43]
                  GTreeNodeBase [Childs=1] - [gp_sqrt]
                    GTreeNodeBase [Childs=0] - [18]
                GTreeNodeBase [Childs=0] - [16]
            GTreeNodeBase [Childs=2] - [gp_add]
              GTreeNodeBase [Childs=0] - [24]
              GTreeNodeBase [Childs=0] - [35]

    - GTreeGP
        Expression: gp_sqrt(gp_div(gp_add(26, gp_div(gp_mul(gp_add(gp_sub(34, gp_sub(44, 1)), gp_mul(49, 43)), gp_sqrt(18)), 16)), gp_add(24, 35)))

    from __future__ import division
    from pyevolve import *
    import math

    def gp_add(a, b):  return a+b
    def gp_sub(a, b):  return a-b
    def gp_div(a, b):  return 1 if b==0 else a/b
    def gp_mul(a, b):  return a*b
    def gp_sqrt(a):    return math.sqrt(abs(a))

    def eval_func(chromosome):
        code_comp = chromosome.getCompiledCode()
        ret = eval(code_comp)
        return abs(math.pi - ret)

    def step_callback(engine):
        gen = engine.getCurrentGeneration()
        if gen % 10 == 0:
            best = engine.bestIndividual()
            best_pi = eval(best.getCompiledCode())
            print "Best (%d): %.10f" % (gen, best_pi)
            print "\tError: %.10f" % (abs(math.pi - best_pi))
        return False

    def main_run():
        genome = GTree.GTreeGP()
        genome.setParams(max_depth=8, method="ramped")
        genome.evaluator += eval_func

        ga = GSimpleGA.GSimpleGA(genome)
        ga.setParams(gp_terminals = ['ephemeral:random.randint(1, 50)'],
                     gp_function_prefix = "gp")

        ga.setMinimax(Consts.minimaxType["minimize"])
        ga.setGenerations(50000)
        ga.setCrossoverRate(1.0)
        ga.setMutationRate(0.09)
        ga.setPopulationSize(1000)
        ga.stepCallback.set(step_callback)

        ga.evolve()
        best = ga.bestIndividual()
        best.writeDotImage("tree_pi.png")

        print best

    if __name__ == "__main__":
        main_run()
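The evolved expression reported in the output can be re-evaluated on its own as a sanity check. This sketch copies the `gp_*` primitives from the script and evaluates the expression string in plain Python 3, where `/` is already true division; the names `BEST_EXPRESSION`, `approx` and `error` are introduced here only for illustration.

```python
import math

def gp_add(a, b): return a + b
def gp_sub(a, b): return a - b
def gp_div(a, b): return 1 if b == 0 else a / b
def gp_mul(a, b): return a * b
def gp_sqrt(a):   return math.sqrt(abs(a))

# The expression of the best individual, copied from the run output.
BEST_EXPRESSION = (
    "gp_sqrt(gp_div(gp_add(26, gp_div(gp_mul(gp_add(gp_sub(34, gp_sub(44, 1)), "
    "gp_mul(49, 43)), gp_sqrt(18)), 16)), gp_add(24, 35)))"
)

namespace = {"gp_add": gp_add, "gp_sub": gp_sub, "gp_div": gp_div,
             "gp_mul": gp_mul, "gp_sqrt": gp_sqrt}

approx = eval(BEST_EXPRESSION, namespace)
error = abs(math.pi - approx)

print("%.10f" % approx)
print("%.10f" % error)
```

The value should agree with the best individual reported at generation 40, with an error a little under 0.00003.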

## Genetic Programming and Flex layouts

    import random
    from pyevolve import *

    def gp_hbox(x, y):  return "%s %s" % (x,y)
    def gp_vbox(x, y):  return "%s %s" % (x,y)
    def gp_panel(x, y): return "%s %s" % (x,y)

    def eval_func(chromosome):
        code_comp = chromosome.getCompiledCode()
        for a in xrange(0, 5):
            for b in xrange(0, 5):
                evaluated = eval(code_comp)
        return random.randint(1,100)

    def main_run():
        genome = GTree.GTreeGP()
        genome.setParams(max_depth=5, method="ramped")
        genome.evaluator += eval_func

        ga = GSimpleGA.GSimpleGA(genome)

        button = repr("")
        label = repr("")
        text_input = repr("")

        ga.setParams(gp_terminals = [button, label, text_input],
                     gp_function_prefix = "gp")

        ga.setMinimax(Consts.minimaxType["minimize"])
        ga.evolve(freq_stats=5)

        print ga.bestIndividual()

    if __name__ == "__main__":
        main_run()

As you can see, I’ve created the layout tags HBox, VBox and Panel as GP functions and Button, Label and TextInput as GP terminals. The result is very funny: it’s just a random layout, but you can use your imagination to create some nice and interesting fitness functions.
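As one possible direction, here is a purely illustrative fitness function that scores a layout string instead of returning a random number. Everything in it is an assumption: the tag strings are placeholders rather than the markup used in the run above, and the scoring rule (distance from a wanted number of widgets plus the nesting depth, lower being better) is only one arbitrary choice.

```python
import re

# Placeholder markup, assumed for this sketch only.
BUTTON = "<Button/>"
LABEL = "<Label/>"
TEXT_INPUT = "<TextInput/>"

def gp_hbox(x, y):  return "<HBox>%s %s</HBox>" % (x, y)
def gp_vbox(x, y):  return "<VBox>%s %s</VBox>" % (x, y)
def gp_panel(x, y): return "<Panel>%s %s</Panel>" % (x, y)

TAG = re.compile(r"<(/?)(\w+)(/?)>")

def max_depth(layout):
    """Deepest level of container nesting in the layout string."""
    depth = deepest = 0
    for closing, _name, self_closing in TAG.findall(layout):
        if self_closing:
            continue  # widgets do not add nesting
        depth += -1 if closing else 1
        deepest = max(deepest, depth)
    return deepest

def layout_fitness(layout, wanted_widgets=3):
    """Lower is better: few surplus or missing widgets, shallow nesting."""
    widgets = len(re.findall(r"<\w+/>", layout))
    return abs(widgets - wanted_widgets) + max_depth(layout)

simple = gp_vbox(gp_hbox(BUTTON, LABEL), TEXT_INPUT)
crowded = gp_panel(simple, gp_hbox(BUTTON, BUTTON))
print(layout_fitness(simple), layout_fitness(crowded))
```

A function like this would replace the `random.randint(1,100)` return value in `eval_func` and works with the minimizing setup, since smaller scores mean layouts closer to the stated preference.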