A small intro on the rationale
So I'm working on a Symbolic Regression Machine written in C/C++ called Shine, which is intended to be a JIT for Genetic Programming libraries (like Pyevolve, for instance). The main rationale behind Shine is that there is a lot of research today on speeding up Genetic Programming using GPUs (the GPU fever!) or other special hardware, but there are few papers on optimizing GP using state-of-the-art compiler optimizations like the ones we have in clang, gcc, etc.
The "hot spot", the component that consumes most of the CPU resources in Genetic Programming today, is the evaluation of each individual in order to calculate the fitness of the program tree. This evaluation is usually executed once for each set of parameters in the "training" set. Suppose you want to do a symbolic regression of a single expression like the Pythagorean theorem, and you have a linear space of parameters from 1.0 to 1000.0 with a step of 0.1: that is roughly 10,000 evaluations (9,991 points) for each individual (program tree) of your population!
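To make the cost concrete, here is a minimal sketch of how one individual would be scored over that training set. The names `candidate` and `fitness` are hypothetical, and feeding the same value to both parameters is an assumption made only to keep the example short; a real GP engine would evolve the tree and pick its own error measure.

```python
import math

# Training set: 1.0 to 1000.0 inclusive, step 0.1. Built from integers
# so the step does not accumulate floating-point error.
training_points = [i / 10.0 for i in range(10, 10001)]


def candidate(a, b):
    """Hypothetical individual; a GP engine would evolve this tree."""
    return math.sqrt(a * a + b * b)


def fitness(individual, points):
    """Sum of absolute errors against the target, plus the call count."""
    error = 0.0
    evaluations = 0
    for x in points:
        # Assumption: both parameters take the same value x.
        error += abs(individual(x, x) - math.hypot(x, x))
        evaluations += 1
    return error, evaluations


error, evaluations = fitness(candidate, training_points)
print(evaluations)  # 9991
```

Every individual in every generation pays this cost, which is why making a single evaluation cheaper matters so much.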
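What Shine does is described in the image below:
It takes the individual from the Genetic Programming engine and converts it to LLVM Intermediate Representation (the LLVM assembly language). After that it runs the LLVM transformation passes (this is where the real power of modern compilers enters the GP context), and then the LLVM JIT converts the optimized LLVM IR into native code for the specified target (x86, PowerPC, etc.).
You can see the Shine architecture below:
This architecture brings a lot of flexibility to Genetic Programming. You can, for instance, write functions to be used later by your individuals in any language supported by LLVM. What matters to Shine is the LLVM IR: you can use any language that LLVM supports and then use the IR generated by LLVM, so you can mix code from C, C++, Ada, Fortran, D, etc. and use your own functions as non-terminal nodes of your Genetic Programming trees.
Shine is still in early development. It looks like a simple idea, but I still have a lot of problems to solve, such as how to JIT the evaluation process itself instead of calling the JIT'ed trees from Python through ctypes bindings.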
Doing Genetic Programming on the Python AST itself
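During the development of Shine, an idea occurred to me: I could use a restricted Python Abstract Syntax Tree (AST) as the representation of the individuals in a Genetic Programming engine. The main advantages of doing so are the flexibility and the possibility of reusing a lot of things. Of course, a shared library written in C/C++ would be useful for a lot of Genetic Programming engines that don't use Python, but since my spare time to work on this is becoming more and more scarce, I started to rethink the approach and to use Python and the Python bindings for LLVM (llvmpy). I then discovered that it is pretty easy to JIT a restricted set of the Python AST to native code using LLVM, and that is what this post is going to show.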
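JIT'ing a restricted Python AST
The most amazing parts of LLVM are obviously the number of transformation passes, the JIT and, of course, the ability to use the entire framework through a simple API (ok, not so simple sometimes). To simplify this example, I'm going to use an arbitrary restricted subset of the Python AST that supports only subtraction (-), addition (+), multiplication (*) and division (/).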
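One way to enforce such a restriction is to reject any expression that contains a node outside the allowed set. The sketch below is not part of the original code: `ALLOWED_NODES` and `is_restricted` are hypothetical names, and it assumes Python 3.8 or newer, where numeric literals are parsed as `ast.Constant` nodes.

```python
import ast

# Node types allowed in the restricted subset (Python 3.8+ node names).
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.BinOp,
    ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Name, ast.Load, ast.Constant,
)


def is_restricted(source):
    """Return True if `source` only uses the restricted AST subset."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED_NODES):
            return False
        # Only integer literals are accepted; bool, float and str are not.
        if isinstance(node, ast.Constant) and type(node.value) is not int:
            return False
    return True


print(is_restricted("2 + 3 * b - a / 4"))  # True
print(is_restricted("a ** 2"))             # False
```

Checking up front keeps the JIT visitor simple, because it never has to deal with a node type it does not know how to translate.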
To understand the Python AST, you can use the Python parser, which converts source code into an AST:
    >>> import ast
    >>> astp = ast.parse("2*7")
    >>> ast.dump(astp)
    'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Mult(), right=Num(n=7)))])'
What the parse created is an Abstract Syntax Tree containing a BinOp (binary operation) with the number 2 as the left operand, the number 7 as the right operand and multiplication (Mult) as the operation itself, which is very easy to understand. To create the LLVM IR, we are going to write a visitor that visits each node of the tree. To do that, we can subclass the Python NodeVisitor class from the ast module. The NodeVisitor visits each node of the tree and calls the method 'visit_OPERATOR' if it exists. When the NodeVisitor visits the node for the BinOp, for example, it calls the method 'visit_BinOp', passing the BinOp node itself as the parameter.
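The dispatch mechanism can be seen in isolation with a small visitor that only records which binary operators it meets. This is an illustrative sketch using just the standard library; the class name `OpRecorder` is hypothetical. Note that `generic_visit` has to be called explicitly to keep descending into the children of a node that has its own `visit_` method.

```python
import ast


class OpRecorder(ast.NodeVisitor):
    """Records the operator of every BinOp node, outermost first."""

    def __init__(self):
        self.visited = []

    def visit_BinOp(self, node):
        # Called by NodeVisitor.visit() for every BinOp node.
        self.visited.append(type(node.op).__name__)
        # Keep walking into the left and right operands.
        self.generic_visit(node)


tree = ast.parse("2*7+1")
recorder = OpRecorder()
recorder.visit(tree)
print(recorder.visited)  # ['Add', 'Mult']
```

The addition is reported first because `2*7+1` parses as `(2*7)+1`, so the Add node is the root of the expression and the Mult node is its left child.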
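The structure of the JIT visitor class will look like the code below:
    # Import the ast module and the LLVM Python bindings
    import ast
    from llvm import *
    from llvm.core import *
    from llvm.ee import *
    import llvm.passes as lp

    class AstJit(ast.NodeVisitor):
        def __init__(self):
            pass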
What we need to do now is create an initialization method that keeps the last state of the JIT visitor. This is needed because we are going to JIT the content of the Python AST into a function, and the last instruction of that function needs to return the result of the last instruction visited by the JIT. We also need to receive an LLVM Module object in which our function will be created, as well as the closure type. For the sake of simplicity I'm not typing any object; I'm just assuming that all numbers in the expression are integers, so the closure type will be the LLVM integer type.
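    def __init__(self, module, parameters):
        self.last_state = None
        self.module = module
        # Parameters that will be created on the IR function
        self.parameters = parameters
        self.closure_type = Type.int()
        # An attribute to hold a link to the created function
        # so we can use it to JIT later
        self.func_obj = None
        self._create_builder()

    def _create_builder(self):
        # How many parameters of integer type
        params = [self.closure_type] * len(self.parameters)

        # The prototype of the function, returning an integer
        # and receiving the integer parameters
        ty_func = Type.function(self.closure_type, params)

        # Add the function to the module with the name 'func_ast_jit'
        self.func_obj = self.module.add_function(ty_func, 'func_ast_jit')

        # Name each argument of the function after the parameter specified
        for index, pname in enumerate(self.parameters):
            self.func_obj.args[index].name = pname

        # Create a basic block and the builder
        bb = self.func_obj.append_basic_block("entry")
        self.builder = Builder.new(bb)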
Now what we need to implement on our visitor are the 'visit_OPERATOR' methods for the BinOp node and for the Num and Name nodes. We will also implement the method that creates the return instruction, which returns the last state.
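    # A 'Name' is a node produced in the AST when you
    # access a variable: in '2+x+y', 'x' and 'y' are
    # the two Name nodes created in the AST for the expression.
    def visit_Name(self, node):
        # Which function argument is this variable?
        index = self.parameters.index(node.id)
        self.last_state = self.func_obj.args[index]
        return self.last_state

    # Here we create an LLVM IR integer constant using the
    # Num node. In the expression '2+3' you'll have two
    # Num nodes, Num(n=2) and Num(n=3).
    def visit_Num(self, node):
        self.last_state = Constant.int(self.closure_type, node.n)
        return self.last_state

    # The visitor for the binary operation
    def visit_BinOp(self, node):
        # Get the operation and the left and right arguments
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        op = node.op

        # Convert each operation (Sub, Add, Mult, Div) to its
        # LLVM IR integer instruction equivalent
        if isinstance(op, ast.Sub):
            op = self.builder.sub(lhs, rhs, 'sub_t')
        elif isinstance(op, ast.Add):
            op = self.builder.add(lhs, rhs, 'add_t')
        elif isinstance(op, ast.Mult):
            op = self.builder.mul(lhs, rhs, 'mul_t')
        elif isinstance(op, ast.Div):
            op = self.builder.sdiv(lhs, rhs, 'sdiv_t')

        self.last_state = op
        return self.last_state

    # Build the return (ret) statement with the last state
    def build_return(self):
        self.builder.ret(self.last_state)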
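One thing to note about the chain of isinstance checks in visit_BinOp: an operator outside the four supported ones falls through every branch, so the AST operator node itself ends up stored as the last state. A possible alternative, sketched here with hypothetical names (`BINOP_TABLE`, `emit_binop`), is a lookup table that fails loudly. `RecordingBuilder` is only a stand-in so the example runs without LLVM installed; it is not part of any LLVM binding.

```python
import ast

# AST operator type -> (builder method name, temporary name), mirroring
# the four branches of the visitor's if/elif chain.
BINOP_TABLE = {
    ast.Sub: ("sub", "sub_t"),
    ast.Add: ("add", "add_t"),
    ast.Mult: ("mul", "mul_t"),
    ast.Div: ("sdiv", "sdiv_t"),
}


def emit_binop(builder, op, lhs, rhs):
    """Emit the instruction for `op`, or raise if it is unsupported."""
    try:
        method_name, temp_name = BINOP_TABLE[type(op)]
    except KeyError:
        raise NotImplementedError(
            "unsupported operator: %s" % type(op).__name__)
    return getattr(builder, method_name)(lhs, rhs, temp_name)


class RecordingBuilder(object):
    """Hypothetical stand-in for the LLVM builder; it only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(lhs, rhs, temp_name):
            self.calls.append((name, lhs, rhs, temp_name))
            return temp_name
        return record


builder = RecordingBuilder()
emit_binop(builder, ast.Mult(), 3, "b")
print(builder.calls)  # [('mul', 3, 'b', 'mul_t')]
```

With the real builder, the body of visit_BinOp would reduce to a single call such as `emit_binop(self.builder, node.op, lhs, rhs)`.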
And that is it: our visitor is ready to convert a Python AST into LLVM IR assembly language. To run it, we'll first create an LLVM module and an expression:
    module = Module.new('ast_jit_module')
    # Note that I'm using two variables, 'a' and 'b'
    expr = "(2+3*b+33*(10/2)+1+3/3+a)/2"
    node = ast.parse(expr)
    print(ast.dump(node))
Which will output:
    Module(body=[Expr(value=BinOp(left=BinOp(left=BinOp(left=BinOp(left=BinOp(left=BinOp(left=Num(n=2), op=Add(), right=BinOp(left=Num(n=3), op=Mult(), right=Name(id='b', ctx=Load()))), op=Add(), right=BinOp(left=Num(n=33), op=Mult(), right=BinOp(left=Num(n=10), op=Div(), right=Num(n=2)))), op=Add(), right=Num(n=1)), op=Add(), right=BinOp(left=Num(n=3), op=Div(), right=Num(n=3))), op=Add(), right=Name(id='a', ctx=Load())), op=Div(), right=Num(n=2)))])
Now we can finally run our visitor on the generated AST and check the LLVM IR output:
    visitor = AstJit(module, ['a', 'b'])
    visitor.visit(node)
    visitor.build_return()
    print(module)
Which will output the LLVM IR:
    ; ModuleID = 'ast_jit_module'

    define i32 @func_ast_jit(i32 %a, i32 %b) {
    entry:
      %mul_t = mul i32 3, %b
      %add_t = add i32 2, %mul_t
      %add_t1 = add i32 %add_t, 165
      %add_t2 = add i32 %add_t1, 1
      %add_t3 = add i32 %add_t2, 1
      %add_t4 = add i32 %add_t3, %a
      %sdiv_t = sdiv i32 %add_t4, 2
      ret i32 %sdiv_t
    }
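Now is when the real fun begins. We want to run the LLVM optimization passes on our code at an optimization level equivalent to GCC's -O2. To do that, we create a PassManagerBuilder and a PassManager; the PassManagerBuilder is the component that adds the passes to the PassManager. You can also manually add arbitrary transformations such as dead code elimination, function inlining, etc.: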
    pmb = lp.PassManagerBuilder.new()
    # Optimization level
    pmb.opt_level = 2
    pm = lp.PassManager.new()
    pmb.populate(pm)
    # Run the passes on the module
    pm.run(module)
    print(module)
Which will output:
    ; ModuleID = 'ast_jit_module'

    define i32 @func_ast_jit(i32 %a, i32 %b) nounwind readnone {
    entry:
      %mul_t = mul i32 %b, 3
      %add_t3 = add i32 %a, 169
      %add_t4 = add i32 %add_t3, %mul_t
      %sdiv_t = sdiv i32 %add_t4, 2
      ret i32 %sdiv_t
    }
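The optimized function folds the separate constants 2, 165, 1 and 1 into a single 169 and drops three add instructions. As a quick sanity check, the two instruction sequences can be transcribed by hand into plain Python and compared. This is only a sketch: `sdiv` is a helper written for this example to mimic LLVM's signed division, which truncates toward zero, and Python integers do not wrap around like i32, so the comparison only holds for values that stay within 32 bits.

```python
def sdiv(lhs, rhs):
    """Integer division truncating toward zero, like LLVM's sdiv."""
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs < 0) == (rhs < 0) else -quotient


def unoptimized(a, b):
    # Hand transcription of the IR before the optimization passes.
    mul_t = 3 * b
    add_t = 2 + mul_t
    add_t1 = add_t + 165
    add_t2 = add_t1 + 1
    add_t3 = add_t2 + 1
    add_t4 = add_t3 + a
    return sdiv(add_t4, 2)


def optimized(a, b):
    # Hand transcription of the IR after the optimization passes.
    mul_t = b * 3
    add_t3 = a + 169
    add_t4 = add_t3 + mul_t
    return sdiv(add_t4, 2)


mismatches = [
    (a, b)
    for a in range(-50, 51)
    for b in range(-50, 51)
    if unoptimized(a, b) != optimized(a, b)
]
print(mismatches)  # []
```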
And here we have the optimized LLVM IR for the Python AST expression. The next step is to JIT that IR into native code and then execute it with some parameters:
    ee = ExecutionEngine.new(module)
    arg_a = GenericValue.int(Type.int(), 100)
    arg_b = GenericValue.int(Type.int(), 42)

    retval = ee.run_function(visitor.func_obj, [arg_a, arg_b])
    print("Return: %d" % retval.as_int())
Which will output:
    Return: 197
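The returned value can be cross-checked without LLVM by interpreting the same AST directly in Python. The sketch below is not part of the JIT: `AstInterpreter` and `sdiv` are hypothetical names, it assumes Python 3.8 or newer (where numbers are parsed as `ast.Constant` rather than `Num`), and it maps `/` to a division that truncates toward zero, as the visitor does with sdiv.

```python
import ast
import operator


def sdiv(lhs, rhs):
    """Integer division truncating toward zero, like LLVM's sdiv."""
    quotient = abs(lhs) // abs(rhs)
    return quotient if (lhs < 0) == (rhs < 0) else -quotient


class AstInterpreter(ast.NodeVisitor):
    """Evaluates the restricted AST with integer semantics."""

    OPS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: sdiv,
    }

    def __init__(self, env):
        self.env = env  # variable name -> integer value

    def visit_Module(self, node):
        return self.visit(node.body[0])

    def visit_Expr(self, node):
        return self.visit(node.value)

    def visit_Name(self, node):
        return self.env[node.id]

    def visit_Constant(self, node):
        return node.value

    def visit_BinOp(self, node):
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        return self.OPS[type(node.op)](lhs, rhs)


tree = ast.parse("(2+3*b+33*(10/2)+1+3/3+a)/2")
result = AstInterpreter({"a": 100, "b": 42}).visit(tree)
print("Return: %d" % result)  # Return: 197
```

The sum inside the parentheses is 2 + 126 + 165 + 1 + 1 + 100 = 395, and the integer division by 2 truncates 197.5 down to 197.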
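And that's it: you have created an AST -> LLVM IR converter, optimized the LLVM IR with the transformation passes and then converted it to native code using the LLVM execution engine. I hope you liked it =)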