Compiling C++ code dynamically at runtime under Linux

Mar 2021
C and C++ have no built-in facility to compile and run code dynamically. It can still be done, although the result is not portable. See also the Stack Overflow discussion "dynamic function creation in C++". The examples on this page work only under Linux, and some only on an AMD/Intel x86-64 CPU.

Write own machine code

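In principle this is straightforward: allocate memory that permits code execution, copy the machine code into it, and call it through a function pointer.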

A minimalist example

The example below calculates the square of an input number.
// sqr.cpp
#include <cstdlib>          // EXIT_FAILURE etc
#include <cstdio>           // printf(), fopen() etc
#include <cstring>          // memcpy()
#include <sys/mman.h>       // mmap()

int main(int argc, char** argv)
{
    // machine code
    unsigned char opcode[] = {
        0xf2, 0x0f, 0x59, 0xc0,         // mulsd  xmm0,xmm0
        0xc3                            // ret
    };
    // allocate memory which allows code execution
    // https://en.wikipedia.org/wiki/NX_bit
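    void* codelocation = mmap(NULL, sizeof(opcode),
                              PROT_READ|PROT_WRITE|PROT_EXEC,
                              MAP_PRIVATE|MAP_ANON, -1, 0);
    // mmap() reports failure with MAP_FAILED, not with NULL
    if(codelocation == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }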
    // copy machine code to executable memory location
    memcpy(codelocation, opcode, sizeof(opcode));
    // function pointer to point to that memory location
    double (*myfunc)(double);
    myfunc = reinterpret_cast<double(*)(double)>(codelocation);

    // read command line arguments and execute myfunc()
    double x = 0.0;
    if(argc>1)
        x = atof(argv[1]);
    double y = myfunc(x);
    printf("f(%f)=%f\n", x,y);
    return EXIT_SUCCESS;
}
Compile and execute as follows:
$ g++ -O2 -Wall sqr.cpp -o sqr
$ ./sqr 7
f(7.000000)=49.000000
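The example above maps its memory as readable, writable and executable at the same time. A slightly more defensive variant, sketched here under the assumption of Linux on x86-64, writes the code first and only then makes the pages executable with mprotect(). The helper name load_code and the typedef unary_fn are made up for this illustration.

```cpp
#include <cstddef>      // size_t
#include <cstring>      // memcpy()
#include <sys/mman.h>   // mmap(), mprotect(), munmap()

typedef double (*unary_fn)(double);

// Hypothetical helper: copies n bytes of machine code into a fresh
// mapping and returns it as a callable function, or NULL on failure.
unary_fn load_code(const unsigned char* code, size_t n)
{
    // step 1: memory that is writable but not yet executable
    void* mem = mmap(NULL, n, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANON, -1, 0);
    if(mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, n);
    // step 2: drop write permission and add execute permission
    if(mprotect(mem, n, PROT_READ|PROT_EXEC) != 0) {
        munmap(mem, n);
        return NULL;
    }
    return reinterpret_cast<unary_fn>(mem);
}
```

Passing the five opcode bytes from sqr.cpp to load_code() should give a function that behaves like myfunc() above. The mapping is never released in this sketch; a longer-running program would call munmap() once the code is no longer needed.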

Debugging

It is easy to make mistakes when writing machine code, so we need to be able to disassemble it. We can do this dynamically by writing the code buffer into an output file and calling objdump on it.
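void print_asm(const void* buf, size_t n)
{
    // dump the raw bytes into a temporary file
    FILE* fp = fopen("/tmp/opcode.bin", "wb");
    if(fp==NULL) {
        perror("fopen");
        return;
    }
    fwrite(buf, n, 1, fp);
    fclose(fp);
    // -b binary: the input is raw code without file headers
    // -mi386 decodes as 32-bit x86; -mi386:x86-64 selects 64-bit decoding
    system("objdump -D -M intel -b binary -mi386 /tmp/opcode.bin");
}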
The example harmon.cpp calculates the n-th partial sum of the harmonic series (∑ 1/i for i = 1..n) twice: with a function f() written in C++ and with a function myfunc() written in machine code. It can also print its own code at runtime. Compile and execute as follows:
$ g++ -Wall -O2 harmon.cpp -o harmon
$ ./harmon 100
     f(100)=5.187378
myfunc(100)=5.187378

$ ./harmon -d

Disassembly of myfunc()
-----------------------

/tmp/opcode.bin:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   85 ff                   test   edi,edi
   2:   7e 22                   jle    0x26
   4:   66 0f 57 c0             xorpd  xmm0,xmm0
   8:   b8 01 00 00 00          mov    eax,0x1
   d:   f2 0f 2a c8             cvtsi2sd xmm1,eax
  11:   f2 0f 2a d7             cvtsi2sd xmm2,edi
  15:   f2 0f 10 d9             movsd  xmm3,xmm1
  19:   f2 0f 5e da             divsd  xmm3,xmm2
  1d:   f2 0f 58 c3             addsd  xmm0,xmm3
  21:   83 ef 01                sub    edi,0x1
  24:   75 eb                   jne    0x11
  26:   f3 c3                   repz ret 

Disassembly of f()
------------------

/tmp/opcode.bin:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   85 ff                   test   edi,edi
   2:   66 0f 57 c0             xorpd  xmm0,xmm0
   6:   7e 1f                   jle    0x27
   8:   f2 0f 10 15 08 03 00    movsd  xmm2,QWORD PTR ds:0x308
   f:   00 
  10:   f2 0f 2a cf             cvtsi2sd xmm1,edi
  14:   66 0f 28 da             movapd xmm3,xmm2
  18:   83 ef 01                sub    edi,0x1
  1b:   f2 0f 5e d9             divsd  xmm3,xmm1
  1f:   f2 0f 58 c3             addsd  xmm0,xmm3
  23:   75 eb                   jne    0x10
  25:   f3 c3                   repz ret 
  27:   f3 c3                   repz ret 
  29:   0f 1f 80 00 00 00 00    nop    DWORD PTR [eax+0x0]
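harmon.cpp itself is not reproduced on this page. Judging from the disassembly of f() above, which counts edi down from n and accumulates 1.0/i in xmm0, the C++ function is presumably close to the following sketch. The exact source is an assumption.

```cpp
// Partial sum of the harmonic series, added from i = n down to 1
// to mirror the loop direction seen in the disassembly.
double f(int n)
{
    double sum = 0.0;
    for(int i = n; i > 0; i--) {
        sum += 1.0 / static_cast<double>(i);
    }
    return sum;
}
```

Comparing such a function against the hand-written myfunc() for a few inputs is a cheap way to catch encoding mistakes in the opcode array.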

How do I learn to write machine code?

The simplest answer is: learn from your compiler.
Since the compiler knows "everything" about converting the high-level language into machine code, it is easiest to tap into that knowledge. Online services can also translate C/C++ code into assembly/machine code.
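One way to do this locally, assuming g++ and objdump are installed, is to compile a function small enough to read by eye and look at what the compiler produced. The file name sqr_fn.cpp and the function name sqr_fn are placeholders chosen for this sketch; the commands are given as comments.

```cpp
// sqr_fn.cpp
//
// Assembly listing in Intel syntax, written to sqr_fn.s:
//   $ g++ -O2 -S -masm=intel sqr_fn.cpp -o sqr_fn.s
// Opcode bytes next to the mnemonics:
//   $ g++ -O2 -c sqr_fn.cpp -o sqr_fn.o
//   $ objdump -d -M intel sqr_fn.o
//
// extern "C" keeps the symbol name unmangled, so it is easy to find.
extern "C" double sqr_fn(double x)
{
    return x * x;
}
```

On x86-64 with -O2 the listing for this function is expected to contain the mulsd xmm0,xmm0 and ret instructions used in the minimalist example, possibly preceded by extra instructions such as endbr64, depending on the compiler defaults.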

Language parser

Parsing a language yourself is more complex, but tools exist for the job, e.g. Lex and Yacc or their free counterparts Flex and Bison.

Let the compiler write machine code

Although writing machine code is good fun, generating it dynamically from a string of instructions in some language is incredibly complex. It is considerably simpler to let the C/C++ compiler do that job for us:

Simplest example

Below is a simple example without error handling:
// sqr.cpp 
#include <cstdlib>      // system(), EXIT_SUCCESS 
#include <dlfcn.h>      // dynamic library loading
#include <string>
#include <iostream>
#include <fstream>
int main(int argc, char** argv)
{
    std::string code = "extern \"C\" double myfunc(double x) { return x*x; }";
    // temporary output files
    std::string cppfile="/tmp/runtimecode.cpp";
    std::string libfile="/tmp/runtimecode.so";
    std::string logfile="/tmp/runtimecode.log";
    std::ofstream out(cppfile.c_str(), std::ofstream::out);
    out << code;
    out.close();
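    // invoke external compiler; system() runs /bin/sh, which need not
    // understand the bash-only "&>", so redirect with "> file 2>&1"
    std::string cmd = "g++ -Wall " + cppfile + " -o " + libfile
                      + " -O2 -shared -fPIC > " + logfile + " 2>&1";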
    system(cmd.c_str());
    // dynamic library loading
    void* dynlib = dlopen (libfile.c_str(), RTLD_LAZY);
    // function pointer to symbol "myfunc" exported by the shared .so library
    // the pointer type must match the compiled signature double myfunc(double)
    double (*myfunc)(double);
    myfunc = (double(*)(double)) dlsym(dynlib, "myfunc");
    // execute
    double x=0.0;
    if(argc>1)
        x = atof(argv[1]);
    double y=(*myfunc)(x);
    std::cout << "myfunc(" << x << ") = " << y << std::endl;
    return EXIT_SUCCESS;
}
Compile and execute as follows:
$ g++ -Wall -O2 sqr.cpp -o sqr -ldl
$ ./sqr 1.5
myfunc(1.5) = 2.25
We can inspect the dynamically created shared library:
$ nm /tmp/runtimecode.so | grep myfunc
0000000000000610 T myfunc        # T = in text section and global (exported)
$ objdump -d -M intel /tmp/runtimecode.so
...
0000000000000610 <myfunc>:
 610:   f2 0f 59 c0             mulsd  xmm0,xmm0
 614:   c3                      ret    
...

Simple example with error handling

With added error handling, the program parse.cpp reads code from standard input, compiles it dynamically and executes it. Compile and run the parser as follows:
$ g++ -Wall parse.cpp -o parse -ldl
$ ./parse 100 < myfunc.txt 
compiling ...
running ...
myfunc(100) = 5.18738
Where myfunc.txt contains the calculation of the harmonic series:
$ cat myfunc.txt 
double sum=0.0;
for(int i=n; i>0; i--) {    // n is the input to this function
        sum+=1.0/(double)i;
}
return sum;
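parse.cpp is not listed here. The following is a minimal sketch of how such a program could be organised, not the author's code: wrap_body(), compile_and_load() and the typedef func_t are hypothetical names, and the signature double myfunc(int n) is inferred from the comment in myfunc.txt. It checks every step that the simplest example skipped.

```cpp
#include <cstdlib>      // system()
#include <dlfcn.h>      // dlopen(), dlsym(), dlerror(), dlclose()
#include <fstream>
#include <string>

typedef double (*func_t)(int);

// Wraps a bare function body (as in myfunc.txt) into a complete function
// with C linkage, so that dlsym() can find it under its plain name.
std::string wrap_body(const std::string& body)
{
    return "extern \"C\" double myfunc(int n)\n{\n" + body + "\n}\n";
}

// Writes the source, compiles it into a shared library and loads "myfunc".
// Returns NULL and fills err on any failure.
func_t compile_and_load(const std::string& body, std::string& err)
{
    const std::string cppfile = "/tmp/runtimecode.cpp";
    const std::string libfile = "/tmp/runtimecode.so";
    const std::string logfile = "/tmp/runtimecode.log";

    std::ofstream out(cppfile.c_str());
    out << wrap_body(body);
    out.close();
    if(!out) {
        err = "cannot write " + cppfile;
        return NULL;
    }

    // compiler messages end up in the log file
    const std::string cmd = "g++ -Wall -O2 -shared -fPIC " + cppfile
                          + " -o " + libfile + " > " + logfile + " 2>&1";
    if(system(cmd.c_str()) != 0) {
        err = "compilation failed, see " + logfile;
        return NULL;
    }

    void* dynlib = dlopen(libfile.c_str(), RTLD_NOW);
    if(dynlib == NULL) {
        const char* msg = dlerror();
        err = msg ? msg : "dlopen failed";
        return NULL;
    }
    dlerror();                              // clear any stale error
    void* sym = dlsym(dynlib, "myfunc");
    const char* msg = dlerror();
    if(msg != NULL || sym == NULL) {
        err = msg ? msg : "symbol myfunc not found";
        dlclose(dynlib);
        return NULL;
    }
    return reinterpret_cast<func_t>(sym);
}
```

Calling compile_and_load() with the contents of myfunc.txt and then invoking the returned function with 100 should reproduce the 5.18738 shown above. Note that dlopen() typically hands back the already loaded library when the same path is opened twice in one process, so recompiling under the same file name needs a dlclose() or a fresh file name first.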

Parsing classes

Classes can also be compiled and used dynamically as described above, except that the shared .so library must export a function that creates the class instance:
// parse_class.cpp
#include "base.h"

    [...]

    // add necessary class maker-function
    code = code + "\n"
           + "extern \"C\" base* make_class() {\n"
           + "    return (base*) new myclass();\n"
           + "}" ;

    [...]

    // loading symbol from library and assign to function pointer
    base* (*make_class)();          // function pointer
    make_class = reinterpret_cast<base*(*)()>(dlsym(dynlib, "make_class"));

    [...]

    // execute function
    std::shared_ptr<base> f = std::shared_ptr<base>(make_class());
    double y=(*f)(x);
// base.h
#ifndef BASE_H
#define BASE_H
class base
{
public:
    // virtual destructor: objects are deleted through a base pointer
    virtual ~base() {}
    virtual double operator()(double) const = 0;
};
#endif /* BASE_H */
$ g++ -Wall -std=c++11 parse_class.cpp -o parse_class -ldl
$ ./parse_class 2.5 < myclass.txt
compiling ...
running ...
myclass(2.5) = 8.25
Where myclass.txt is
$ cat myclass.txt
#include "base.h"
#include<cstdio>
class myclass : public base
{
public:
    double operator()(double x) const
    {
        return x*x + 2.0*x - 3.0;
    }
};
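The fragments of parse_class.cpp above skip error checks and never unload the library. As a rough sketch (load_object() is a hypothetical helper, not part of parse_class.cpp), loading can be wrapped so that the object is destroyed before the library containing its code is closed. The base class is repeated here only to keep the sketch self-contained.

```cpp
#include <dlfcn.h>      // dlopen(), dlsym(), dlclose(), dlerror()
#include <memory>       // std::shared_ptr
#include <string>

// Same interface as base.h; the virtual destructor is needed because
// the object is deleted through a base pointer.
class base
{
public:
    virtual ~base() {}
    virtual double operator()(double) const = 0;
};

// Loads libfile, calls its exported make_class() and returns the object.
// On failure an empty pointer is returned and err describes the problem.
std::shared_ptr<base> load_object(const std::string& libfile, std::string& err)
{
    void* dynlib = dlopen(libfile.c_str(), RTLD_NOW);
    if(dynlib == NULL) {
        const char* msg = dlerror();
        err = msg ? msg : "dlopen failed";
        return std::shared_ptr<base>();
    }
    void* sym = dlsym(dynlib, "make_class");
    if(sym == NULL) {
        const char* msg = dlerror();
        err = msg ? msg : "make_class not found";
        dlclose(dynlib);
        return std::shared_ptr<base>();
    }
    base* (*make_class)() = reinterpret_cast<base*(*)()>(sym);
    // custom deleter: destroy the object first, then unload its code
    return std::shared_ptr<base>(make_class(),
                                 [dynlib](base* p) { delete p; dlclose(dynlib); });
}
```

The order inside the deleter matters: the destructor of myclass lives in the shared library, so calling dlclose() before delete would leave the program jumping into unmapped memory.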