# Using Python as a C++ Preprocessor

The other day I was thinking about C++ and how it sometimes annoys me, then I thought that maybe a more powerful preprocessor could help with writing more flexible source code. The built-in C/C++ preprocessor is extremely limited in what it can do, so it cannot really help programmers to expand the language as they need it.

For the fun of it I sat down and spent a few minutes writing a simple Python script as a proof-of-concept sort of test. I have mixed feelings about it. The power of it could help make a lot of things a lot easier, but it could also be horribly exploited if used improperly.

Some things it might be good for:

* Adding keyword and optional arguments to function calling, which would make it easier to add more arguments to functions in the future without having to change all function calls.
* Creating a high-level interface to lower-level functionality. For example, someone could write a cross-platform GUI module that generates light-weight platform-specific code.
* Reducing code bloat by only having functionality that is really needed at run-time instead of also being designed to help the programmer.

I was wondering what others might think about using a scripting language as a preprocessor? Like I said, I have mixed feelings about it.

Here is a sample source file I used for testing:
{?
import math
import time

funcs_num = 0
include_list = []

# NOTE: the "def" line of this function is missing from the post;
# the name write_includes is a guess.
def write_includes():
    global include_list
    for i in include_list:
        echo("#include <%s>\n" % i)

# NOTE: the "def" line of this function is missing as well; the name
# include is a guess (the body only shows that it takes a filename).
def include(filename):
    global include_list
    global funcs_num
    funcs_num += 1
    if not filename in include_list:
        include_list.append(filename)

def py_func(a):
    global funcs_num
    funcs_num += 1
    echo(a)
    return a

def recurse(a):
    global funcs_num
    funcs_num += 1
    if a <= 0:
        return
    echo("\tcout << \"" + "+" * a + "\" << endl;\n")
    recurse(a-1)
    echo("\tcout << \"" + "-" * a + "\" << endl;\n")

def rgb(red, green, blue):
    global funcs_num
    funcs_num += 1
    return (blue << 16) + (green << 8) + red

def wave(a):
    global funcs_num
    funcs_num += 1
    for i in range(a):
        echo("\t//"+ (21 + int(math.sin(i*0.2) * 20)) * " " + "*\n")

def some_func(a=0, b=1, c=2):
    global funcs_num
    funcs_num += 1
    # I wonder if there is a more reusable generic way to do this.
    echo("some_func(%i, %i, %i)" % (a,b,c))

def number_of_function_calls():
    global funcs_num
    funcs_num += 1
    # Emit a new block so the count is evaluated on the next pass.
    code("echo(funcs_num)")

def stl_container_help(type="int", key="std::string", description=""):
    global funcs_num
    funcs_num += 1

    desc = [i.strip().lower() for i in description.split(",")]

    use_key = False

    if "ordered" in desc:
        if "last in first out" in desc:
            return "std::stack< %s >" % type

        if "first in first out" in desc:
            return "std::queue< %s >" % type

        if "largest element first out" in desc:
            return "std::priority_queue< %s >" % type

        if "sorted by key" in desc:
            use_key = True
        else:
            if "insert and erase in middle" in desc:
                return "std::list< %s >" % type

            if "insert and erase at front" in desc:
                if "need to merge collections" in desc:
                    return "std::list< %s >" % type
                else:
                    return "std::deque< %s >" % type
            else:
                if "need to find nth element" in desc:
                    if "size will vary wildly" in desc:
                        return "std::deque< %s >" % type
                    else:
                        return "std::vector< %s >" % type
                else:
                    if "need to merge collections" in desc:
                        return "std::list< %s >" % type
                    else:
                        if "size will vary wildly" in desc:
                            return "std::deque< %s >" % type
                        else:
                            return "std::vector< %s >" % type
    else:
        if "need to find element by key" in desc:
            use_key = True
        else:
            if "need to merge collections" in desc:
                return "std::list< %s >" % type
            else:
                if "size will vary wildly" in desc:
                    return "std::deque< %s >" % type
                else:
                    return "std::vector< %s >" % type

    if use_key:
        # The standard names are multimap/multiset (no underscore), and
        # the set types hold only the element, which is its own key.
        if "allow duplicates" in desc:
            if "store key separate to value" in desc:
                return "std::multimap< %s , %s >" % (key, type)
            else:
                return "std::multiset< %s >" % type
        else:
            if "store key separate to value" in desc:
                return "std::map< %s , %s >" % (key, type)
            else:
                return "std::set< %s >" % type

?}

/*
{?echo("*" * 15)?}
Last build on: {?echo(time.asctime())?}
*/

#include <iostream>
#include <string>
// Automatically add needed include files

// Defining defined values
#define BLACK {?echo(rgb(0,0,0))?}
#define RED {?echo(rgb(255,0,0))?}
#define GREEN {?echo(rgb(0,255,0))?}
#define BLUE {?echo(rgb(0,0,255))?}
#define YELLOW {?echo(rgb(255,255,0))?}
#define CYAN {?echo(rgb(0,255,255))?}
#define MAGENTA {?echo(rgb(255,0,255))?}
#define WHITE {?echo(rgb(255,255,255))?}

using namespace std;

// This will later be used for keyword arguments
int some_func(int a, int b, int c)
{
return a + b + c;
}

int main(int argc, char** argv)
{
// This file calls roughly {?number_of_function_calls()?} python functions

// Container class selection based on programmer's needs
// (Honestly, this might be a terrible idea!)
{?tmp="insert and erase at front, need to merge collections"?}
{?container = stl_container_help(type = "std::string", description = tmp)?}
{?echo(container)?} my_container;
{?echo(container)?}::iterator my_iter;
{?tmp="need to find element by key, store key separate to value"?}
{?other_container = stl_container_help(type = "unsigned int", key = "std::string", description = tmp)?}
{?echo(other_container)?} my_other_container;
{?echo(other_container)?}::iterator my_other_iter;

// Calling simple python function
cout << {?py_func("Hello")?} << endl;

// Loop unrolling
{?
for i in range(5):
    echo("\tcout << %i << endl;\n" % i)
?}

// Unrolling a recursive function
{?recurse(5)?}

// Basic operations
cout << {?echo(2**16)?} << endl;

// Generating code from a list
cout << "The Three Stooges are:" << endl;
{?
stooges = ["Larry", "Curly", "Moe"]
for stooge in stooges:
    echo("\tcout << \"%s\" << endl;\n" % stooge)
?}

// Almost like a macro?
cout << "The integer representation of white is " << {?echo(rgb(255,255,255))?} << endl;

{?wave(50)?}

// Keyword arguments!
{?some_func(b=15)?}
{?some_func()?}
{?some_func(c=9, a=12)?}

return 0;
}


Here is the output after running the script:
/*
***************
Last build on: Thu Jan 22 17:31:11 2009
*/

#include <iostream>
#include <string>
// Automatically add needed include files
#include <list>
#include <map>

// Defining defined values
#define BLACK 0
#define RED 255
#define GREEN 65280
#define BLUE 16711680
#define YELLOW 65535
#define CYAN 16776960
#define MAGENTA 16711935
#define WHITE 16777215

using namespace std;

// This will later be used for keyword arguments
int some_func(int a, int b, int c)
{
return a + b + c;
}

int main(int argc, char** argv)
{
// This file calls roughly 25 python functions

// Container class selection based on programmer's needs
// (Honestly, this might be a terrible idea!)

std::list< std::string > my_container;
std::list< std::string >::iterator my_iter;

std::map< std::string , unsigned int > my_other_container;
std::map< std::string , unsigned int >::iterator my_other_iter;

// Calling simple python function
cout << Hello << endl;

// Loop unrolling
cout << 0 << endl;
cout << 1 << endl;
cout << 2 << endl;
cout << 3 << endl;
cout << 4 << endl;

// Unrolling a recursive function
cout << "+++++" << endl;
cout << "++++" << endl;
cout << "+++" << endl;
cout << "++" << endl;
cout << "+" << endl;
cout << "-" << endl;
cout << "--" << endl;
cout << "---" << endl;
cout << "----" << endl;
cout << "-----" << endl;

// Basic operations
cout << 65536 << endl;

// Generating code from a list
cout << "The Three Stooges are:" << endl;
cout << "Larry" << endl;
cout << "Curly" << endl;
cout << "Moe" << endl;

// Almost like a macro?
cout << "The integer representation of white is " << 16777215 << endl;

//                     *
//                        *
//                            *
//                                *
//                                   *
//                                     *
//                                       *
//                                        *
//                                        *
//                                        *
//                                       *
//                                     *
//                                  *
//                               *
//                           *
//                       *
//                    *
//                *
//             *
//         *
//      *
//    *
//  *
//  *
//  *
//  *
//    *
//      *
//         *
//            *
//                *
//                    *
//                       *
//                           *
//                              *
//                                  *
//                                    *
//                                      *
//                                        *
//                                        *
//                                        *
//                                       *
//                                      *
//                                   *
//                                *
//                             *
//                         *
//                     *
//                  *
//              *

// Keyword arguments!
some_func(0, 15, 2)
some_func(0, 1, 2)
some_func(12, 1, 9)

return 0;
}


Here's the script -- which is obviously not anywhere near a real usable state.
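#!/usr/bin/env python

# Text accumulated by echo() while the current {? ... ?} block runs.
__script_out = ""

def echo(s):
    global __script_out
    __script_out += str(s)

def code(s):
    # Emit a new block; it is evaluated on the next pass.
    echo("{?" + s + "?}")

def __main():
    global __script_out

    # input_source (the template text) is not defined anywhere in the
    # posted script; it has to be set before __main() runs.
    output = input_source

    script = ""

    # Keep making passes until a pass finds no more blocks.
    another_pass = True
    while another_pass:
        another_pass = False
        code = output
        output = ""
        cursor = 0
        start = 0
        end = 0

        done = False
        while not done:
            rstart = code.find("{?", cursor)
            rend = code.find("?}", cursor)

            if (rstart != -1) and (rend != -1):
                another_pass = True
                start = rstart
                end = rend
                script = code[start+2:end]
                output += code[cursor:start]
                cursor = end + 2
                __script_out = ""
                exec(script, globals(), globals())
                output += __script_out
            else:
                # Copy the rest of the text. The original used
                # code[end+2:], which drops the first two characters
                # on a pass that finds no blocks (end is still 0).
                output += code[cursor:]
                done = True

        #print(output)
        #print("-----")

    print(output)

    output_file = open("output.cpp", "wt")
    output_file.write(output)
    output_file.close()

    return 0

if __name__ == '__main__': __main()

#print(dir())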
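
For comparison, here is a more compact sketch of the same multi-pass `{? ... ?}` expansion, written for Python 3. It is not the script above: the regular expression, the `expand` name and the `max_passes` guard are assumptions added for illustration, and like the original it simply `exec`s whatever is in the template, so it should only ever see trusted input.

```python
import re

# A block is everything between "{?" and the nearest following "?}".
BLOCK = re.compile(r"\{\?(.*?)\?\}", re.DOTALL)

def expand(template, max_passes=10):
    """Expand {? ... ?} blocks until none remain or max_passes is reached."""
    out = []  # chunks echoed by the block that is currently running

    def echo(s):
        out.append(str(s))

    def code(s):
        # Emit a new block so that it is evaluated on the next pass.
        echo("{?" + s + "?}")

    env = {"echo": echo, "code": code}  # namespace shared by all blocks

    def run(match):
        del out[:]
        exec(match.group(1), env)
        return "".join(out)

    for _ in range(max_passes):
        if not BLOCK.search(template):
            break
        template = BLOCK.sub(run, template)
    return template

src = "{?n = 3?}int squares[] = { {?echo(', '.join(str(i * i) for i in range(n)))?} };"
print(expand(src))  # int squares[] = { 0, 1, 4 };
```

Because the replacement is a function, `re.sub` inserts its return value literally, so backslashes in the generated C++ are left alone.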


So what are your thoughts on the matter?

##### Share on other sites
Interesting, certainly, although I think preprocessor is the wrong word - you've essentially written a C++ generator in Python.

My problem is... What's the point? :)

##### Share on other sites
Indeed, this ventures more into the realm of code generation. Generating code is a powerful tool. I occasionally employ Python scripts to generate C++ code, although mostly data-related. I did the same a while ago for some Flash games, where I generated haXe code from text files (levels) and XML files from folder structures, which were fed into a .swf generator. That saved me a lot of time and made things much more reliable. A friend of mine is generating database interfacing code for various languages and platforms with a single button-press.
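
As a small illustration of data-related generation of this kind, here is a sketch that renders the colour `#define`s from the sample file out of a plain Python table. The `COLORS` table and the `color_header` name are made up for the example; `rgb` uses the same packing as the helper in the original post.

```python
def rgb(red, green, blue):
    # Same packing as the rgb() helper in the sample file above.
    return (blue << 16) + (green << 8) + red

# Hypothetical data table; in practice this might be loaded from a file.
COLORS = [
    ("BLACK", (0, 0, 0)),
    ("RED", (255, 0, 0)),
    ("GREEN", (0, 255, 0)),
    ("BLUE", (0, 0, 255)),
    ("WHITE", (255, 255, 255)),
]

def color_header(colors):
    """Render one #define per (name, (r, g, b)) entry."""
    lines = ["// Generated file, do not edit by hand."]
    for name, components in colors:
        lines.append("#define %s %d" % (name, rgb(*components)))
    return "\n".join(lines) + "\n"

print(color_header(COLORS))
```

The generator stays a plain script that writes a header, so the C++ sources themselves contain no embedded Python.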

You are probably dealing with different situations than I, though, so you'll have to compare the advantages to the disadvantages for your specific situation. Your system looks sufficiently complex to introduce additional maintenance (fixing bugs in the generator, taking care of corner-cases) and you'll need to ensure the output is correct (both compile-time and run-time), which is more difficult because you didn't write it directly (which is why C++ template error messages are so hard to read by default). If that's worth the increased productivity and flexibility, then that's great.
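
One cheap way to catch compile-time problems in generated output is to run it through the compiler's syntax check as part of the generation step. The sketch below assumes Python 3.7 or later and a GCC- or Clang-compatible `g++` on the PATH; the `syntax_check` name is made up for the example. A line such as `cout << Hello << endl;` from the output above is the kind of thing it would flag.

```python
import shutil
import subprocess

def syntax_check(cpp_source, compiler="g++"):
    """Return (ok, diagnostics), or None if the compiler is not installed."""
    if shutil.which(compiler) is None:
        return None
    # -fsyntax-only: parse and type-check without producing output files.
    # -x c++ -     : treat standard input as C++ source.
    proc = subprocess.run(
        [compiler, "-fsyntax-only", "-x", "c++", "-"],
        input=cpp_source,
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0, proc.stderr

generated = "#include <iostream>\nint main() { std::cout << Hello << std::endl; }\n"
result = syntax_check(generated)
if result is None:
    print("no compiler found, skipping check")
else:
    ok, diagnostics = result
    print("ok" if ok else diagnostics)
```

This only covers the compile-time half; run-time correctness of the generated code still needs its own tests.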

At some point however, you're probably better off integrating a Python interpreter into your program, or calling C++ code from Python. If your code fluctuates that heavily, then C++ is probably not the right language for you - or you need to adapt your development style more to C++.

Having said that, I would keep things practical if I were you. Identify the problem cases and make a solution that works for those. Don't try to over-generalize at first, just make something that solves some actual problems and annoyances. You can always expand from there. Besides, you'll gain some practical, tangible experience.

##### Share on other sites
It's unlikely that you'll be able to create a generator that produces optimal code in many circumstances. I realize however that comments like that can be highly motivational in an I'll-show-you kinda way :)

This sub-par efficiency might not bother you, but that means you've lost perhaps the only tangible benefit that C++ provides over other languages: speed. But if you're willing to sacrifice a little speed, why not use a more expressive language in the first place? Lisp and Haskell come to mind, particularly because higher order functions would solve a lot of your problems.

If you don't know them already, you'll have to learn a new language of course, but that perhaps is less of an issue than that of creating, maintaining and debugging the code generator. And will you understand either the generation or generated code a month or two down the line?

##### Share on other sites
Thanks for the replies. I suppose 'preprocessor' would be the wrong word.

This script was really a mild amusement -- a just-for-fun sort of thing. After playing around with it for a little I did realize that the potential problems it introduces could very easily outweigh the benefits. It was kind of fun though.

I don't know how many might have looked at the code, but the 'stl_container_help' function is definitely ridiculous :D. It would cause more harm than help, since the programmer never really knows exactly what container he's working with!

As for C++ annoying me, it usually only happens when I want to change the arguments passed to functions. I suppose with good class design and a liberal use of typedefs most of my troubles might go away.
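
On the keyword-argument point, and the comment in the sample file wondering about a more reusable way to do it, one possible sketch is a small factory that turns an ordered parameter list into a call renderer. The `cpp_call` name and its interface are made up for the example and are not part of the script in the first post.

```python
def cpp_call(name, params):
    """Return a function that renders a positional C++ call to `name`.

    `params` is an ordered list of (parameter_name, default) pairs.
    """
    names = [p for p, _ in params]

    def render(**kwargs):
        # Reject misspelled keywords instead of silently ignoring them.
        unknown = set(kwargs) - set(names)
        if unknown:
            raise TypeError("%s() got unexpected argument(s): %s"
                            % (name, ", ".join(sorted(unknown))))
        # Fill in defaults and emit the arguments in declaration order.
        args = [str(kwargs.get(p, default)) for p, default in params]
        return "%s(%s)" % (name, ", ".join(args))

    return render

some_func = cpp_call("some_func", [("a", 0), ("b", 1), ("c", 2)])
print(some_func(b=15))       # some_func(0, 15, 2)
print(some_func(c=9, a=12))  # some_func(12, 1, 9)
```

Adding a parameter to the C++ function then means changing the one `params` list rather than every call site in the template.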

##### Share on other sites
Just for fun is always a cool thing, especially such meta coding [smile]

The only criticism from my side: you said the C preprocessor (cpp) is very limited, but forgot to mention C++ templates. How would your approach compare to them w.r.t. limitations?

##### Share on other sites
Just a small addition: this topic reminded me of pygccxml, a Python library that can parse C++ code and generate a heap of information on it.

##### Share on other sites
You can already generate C++ code from C++ expressions using expression templates.

You can write any Domain-Specific Embedded Language in C++. A parser generator for example, Boost.Spirit being the best known one.

##### Share on other sites
Quote:
 Original post by loufoque: You can already generate C++ code from C++ expressions using expression templates. You can write any Domain-Specific Embedded Language in C++. A parser generator for example, Boost.Spirit being the best known one.

You may not be able to write just any embedded language, but you are able to solve any problem at compile time, though the syntax can get really obscure.

Btw, there is an implementation of IEEE floating point math by Thierry Berger-Perrin, who even wrote a toy compile time ray tracer: metafloat, compile-time ray tracing.