Using Python as a C++ Preprocessor

7 comments, last by phresnel 15 years, 3 months ago
The other day I was thinking about C++ and how it sometimes annoys me, then I thought that maybe a more powerful preprocessor could help with writing more flexible source code. The built-in C/C++ preprocessor is extremely limited in what it can do, so it cannot really help programmers to expand the language as they need it. For the fun of it I sat down and spent a few minutes writing a simple Python script as a proof-of-concept sort of test.

I have mixed feelings about it. The power of it could help make a lot of things a lot easier, but it could also be horribly abused if used improperly. Some things it might be good for:

* Adding keyword and optional arguments to function calls, which would make it easier to add more arguments to functions in the future without having to change all function calls.
* Creating a high level interface to lower level functionality. For example, someone could write a cross-platform GUI module that generates light-weight platform specific code.
* Reducing code bloat by keeping only the functionality that is really needed at run-time, rather than code that exists just to help the programmer.

I was wondering what others might think about using a scripting language as a preprocessor? Like I said, I have mixed feelings about it. Here is a sample source file I used for testing:

{?
import math
import time

funcs_num = 0
include_list = []

def additional_includes():
	global include_list
	for i in include_list:
		echo("#include <%s>\n" % i)
	
	
def add_include(filename):
	global include_list
	global funcs_num
	funcs_num += 1
	if filename not in include_list:
		include_list.append(filename)

def py_func(a):
	global funcs_num
	funcs_num += 1
	echo(a)
	return a
	
	
def recurse(a):
	global funcs_num
	funcs_num += 1
	if a <= 0:
		return
	echo("\tcout << \"" + "+" * a + "\" << endl;\n")
	recurse(a-1)
	echo("\tcout << \"" + "-" * a + "\" << endl;\n")
	
	
def rgb(red, green, blue):
	global funcs_num
	funcs_num += 1
	return (blue << 16) + (green << 8) + red
	
	
def wave(a):
	global funcs_num
	funcs_num += 1
	for i in range(a):
		echo("\t//"+ (21 + int(math.sin(i*0.2) * 20)) * " " + "*\n")
		
		
def some_func(a=0, b=1, c=2):
	global funcs_num
	funcs_num += 1
	# I wonder if there is a more reusable generic way to do this.
	echo("some_func(%i, %i, %i)" % (a,b,c))
	
	
def number_of_function_calls():
	global funcs_num
	funcs_num += 1
	code("echo(funcs_num)")
	
	
def stl_container_help(type="int", key="std::string", description=""):
	global funcs_num
	funcs_num += 1
	
	desc = [i.strip().lower() for i in description.split(",")]
	
	use_key = False
	
	if "ordered" in desc:
		if "last in first out" in desc:
			add_include("stack")
			return "std::stack< %s >" % type
			
		if "first in first out" in desc:
			add_include("queue")
			return "std::queue< %s >" % type
			
		if "largest element first out" in desc:
			# std::priority_queue is declared in <queue>
			add_include("queue")
			return "std::priority_queue< %s >" % type
			
		if "sorted by key" in desc:
			use_key = True
		else:
			if "insert and erase in middle" in desc:
				add_include("list")
				return "std::list< %s >" % type
			
			if "insert and erase at front" in desc:
				if "need to merge collections" in desc:
					add_include("list")
					return "std::list< %s >" % type
				else:
					add_include("deque")
					return "std::deque< %s >" % type
			else:
				if "need to find nth element" in desc:
					if "size will vary wildly" in desc:
						add_include("deque")
						return "std::deque< %s >" % type
					else:
						add_include("vector")
						return "std::vector< %s >" % type
				else:
					if "need to merge collections" in desc:
						add_include("list")
						return "std::list< %s >" % type
					else:
						if "size will vary wildly" in desc:
							add_include("deque")
							return "std::deque< %s >" % type
						else:
							add_include("vector")
							return "std::vector< %s >" % type
	else:
		if "need to find element by key" in desc:
			use_key = True
		else:
			if "need to merge collections" in desc:
				add_include("list")
				return "std::list< %s >" % type
			else:
				if "size will vary wildly" in desc:
					add_include("deque")
					return "std::deque< %s >" % type
				else:
					add_include("vector")
					return "std::vector< %s >" % type
		
	if use_key:
		# std::multimap lives in <map> and std::multiset in <set>; the
		# set containers take only the element type, which is its own key.
		if "allow duplicates" in desc:
			if "store key separate to value" in desc:
				add_include("map")
				return "std::multimap< %s , %s >" % (key, type)
			else:
				add_include("set")
				return "std::multiset< %s >" % type
		else:
			if "store key separate to value" in desc:
				add_include("map")
				return "std::map< %s , %s >" % (key, type)
			else:
				add_include("set")
				return "std::set< %s >" % type
			
?}

/*
	{?echo("*" * 15)?}
	Last build on: {?echo(time.asctime())?}
*/

#include <iostream>
#include <string>
// Automatically add needed include files
{?code("additional_includes()")?}

// Defining defined values
#define BLACK {?echo(rgb(0,0,0))?}
#define RED {?echo(rgb(255,0,0))?}
#define GREEN {?echo(rgb(0,255,0))?}
#define BLUE {?echo(rgb(0,0,255))?}
#define YELLOW {?echo(rgb(255,255,0))?}
#define CYAN {?echo(rgb(0,255,255))?}
#define MAGENTA {?echo(rgb(255,0,255))?}
#define WHITE {?echo(rgb(255,255,255))?}

using namespace std;

// This will later be used for keyword arguments
int some_func(int a, int b, int c)
{
	return a + b + c;
}

int main(int argc, char** argv)
{
	// This file calls roughly {?number_of_function_calls()?} python functions
	
	// Container class selection based on programmer's needs
	// (Honestly, this might be a terrible idea!)
	{?tmp="insert and erase at front, need to merge collections"?}
	{?container = stl_container_help(type = "std::string", description = tmp)?}
	{?echo(container)?} my_container;
	{?echo(container)?}::iterator my_iter;
	{?tmp="need to find element by key, store key separate to value"?}
	{?other_container = stl_container_help(type = "unsigned int", key = "std::string", description = tmp)?}
	{?echo(other_container)?} my_other_container;
	{?echo(other_container)?}::iterator my_other_iter;
	
	// Calling simple python function
	cout << {?py_func("Hello")?} << endl;
	
	// Loop unrolling
{?
for i in range(5):
	echo("\tcout << %i << endl;\n" % i)
?}
	
	// Unrolling a recursive function
{?recurse(5)?}
	
	// Basic operations
	cout << {?echo(2**16)?} << endl;
	
	// Generating code from a list
	cout << "The Three Stooges are:" << endl;
{?
stooges = ["Larry", "Curly", "Moe"]
for stooge in stooges:
	echo("\tcout << \"%s\" << endl;\n" % stooge)
?}
	
	// Almost like a macro?
	cout << "The integer representation of white is " << {?echo(rgb(255,255,255))?} << endl;
	
{?wave(50)?}

	// Keyword arguments!
	{?some_func(b=15)?}
	{?some_func()?}
	{?some_func(c=9, a=12)?}
	
	return 0;
}


Here is the output after running the script:

/*
	***************
	Last build on: Thu Jan 22 17:31:11 2009
*/

#include <iostream>
#include <string>
// Automatically add needed include files
#include <list>
#include <map>


// Defining defined values
#define BLACK 0
#define RED 255
#define GREEN 65280
#define BLUE 16711680
#define YELLOW 65535
#define CYAN 16776960
#define MAGENTA 16711935
#define WHITE 16777215

using namespace std;

// This will later be used for keyword arguments
int some_func(int a, int b, int c)
{
	return a + b + c;
}

int main(int argc, char** argv)
{
	// This file calls roughly 25 python functions
	
	// Container class selection based on programmer's needs
	// (Honestly, this might be a terrible idea!)
	
	std::list< std::string > my_container;
	std::list< std::string >::iterator my_iter;
	
	std::map< std::string , unsigned int > my_other_container;
	std::map< std::string , unsigned int >::iterator my_other_iter;
	
	// Calling simple python function
	cout << Hello << endl;
	
	// Loop unrolling
	cout << 0 << endl;
	cout << 1 << endl;
	cout << 2 << endl;
	cout << 3 << endl;
	cout << 4 << endl;

	
	// Unrolling a recursive function
	cout << "+++++" << endl;
	cout << "++++" << endl;
	cout << "+++" << endl;
	cout << "++" << endl;
	cout << "+" << endl;
	cout << "-" << endl;
	cout << "--" << endl;
	cout << "---" << endl;
	cout << "----" << endl;
	cout << "-----" << endl;

	
	// Basic operations
	cout << 65536 << endl;
	
	// Generating code from a list
	cout << "The Three Stooges are:" << endl;
	cout << "Larry" << endl;
	cout << "Curly" << endl;
	cout << "Moe" << endl;

	
	// Almost like a macro?
	cout << "The integer representation of white is " << 16777215 << endl;
	
	//                     *
	//                        *
	//                            *
	//                                *
	//                                   *
	//                                     *
	//                                       *
	//                                        *
	//                                        *
	//                                        *
	//                                       *
	//                                     *
	//                                  *
	//                               *
	//                           *
	//                       *
	//                    *
	//                *
	//             *
	//         *
	//      *
	//    *
	//  *
	//  *
	//  *
	//  *
	//    *
	//      *
	//         *
	//            *
	//                *
	//                    *
	//                       *
	//                           *
	//                              *
	//                                  *
	//                                    *
	//                                      *
	//                                        *
	//                                        *
	//                                        *
	//                                       *
	//                                      *
	//                                   *
	//                                *
	//                             *
	//                         *
	//                     *
	//                  *
	//              *


	// Keyword arguments!
	some_func(0, 15, 2)
	some_func(0, 1, 2)
	some_func(12, 1, 9)
	
	return 0;
}


Here's the script -- which is obviously not anywhere near a real usable state.

#!/usr/bin/env python

__script_out = ""

def echo(s):
	global __script_out
	__script_out += str(s)
	
def code(s):
	echo("{?" + s + "?}")
	

def __main():
	global __script_out
	
	# Read the whole template; "with" closes the file promptly.
	with open("/home/nathan/temp/test.cpp") as input_file:
		output = input_file.read()
	
	# Keep making passes until a pass finds no {? ... ?} block, so that
	# blocks emitted through code() are expanded by a later pass.
	another_pass = True
	while another_pass:
		another_pass = False
		source = output  # not named "code", which would shadow code()
		output = ""
		cursor = 0
		
		while True:
			start = source.find("{?", cursor)
			end = source.find("?}", start + 2) if start != -1 else -1
			
			if end == -1:
				# No complete block left: copy the remaining text.
				# (Copying from end+2 instead dropped the first two
				# characters on a pass that found no block.)
				output += source[cursor:]
				break
			
			another_pass = True
			output += source[cursor:start]
			cursor = end + 2
			__script_out = ""
			exec(source[start+2:end], globals(), globals())
			output += __script_out
	
	print(output)
	
	with open("output.cpp", "wt") as output_file:
		output_file.write(output)
	
	return 0

if __name__ == '__main__': __main()

#print(dir())
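
For comparison, the same multi-pass expansion can be sketched more compactly with a regular expression. This is only a sketch under a few assumptions: the names expand, BLOCK and max_passes are made up for illustration, the template is passed in as a string rather than read from a file, and a pass limit is added so that a block which keeps re-emitting itself through code() cannot loop forever.

```python
import re

# Non-greedy match of one {? ... ?} block; DOTALL lets a block span lines.
BLOCK = re.compile(r"\{\?(.*?)\?\}", re.DOTALL)


def expand(template, max_passes=10):
    """Expand {? ... ?} blocks repeatedly until a pass finds none."""
    out = []  # text echoed by the block that is currently running
    env = {"echo": lambda s: out.append(str(s))}
    # code() defers a snippet to the next pass by re-wrapping it.
    env["code"] = lambda s: env["echo"]("{?" + s + "?}")

    def run(match):
        del out[:]
        exec(match.group(1), env)  # all blocks share one namespace
        return "".join(out)

    for _ in range(max_passes):
        template, count = BLOCK.subn(run, template)
        if count == 0:
            break
    return template


print(expand('{?x = 5?}{?code("echo(x * 2)")?}'))
```

The usage line needs two passes: the first stores x and emits a deferred block, the second evaluates it and prints 10.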


So what are your thoughts on the matter?
Interesting, certainly, although I think preprocessor is the wrong word - you've essentially written a C++ generator in Python.

My problem is... What's the point? :)
Indeed, this ventures more into the realm of code generation. Generating code is a powerful tool. I occasionally employ Python scripts to generate C++ code, although mostly data-related. I did the same a while ago for some Flash games, where I generated haXe code from text files (levels) and XML files from folder structures, which were fed into a .swf generator. Saved me a lot of time and made things much more reliable. A friend of mine is generating database interfacing code for various languages and platforms with a single button-press.
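
To give an idea of what data-related generation can look like, here is a minimal sketch that turns a text level into a C++ array definition. The level format (one string per row, all rows the same width) and the name level_to_cpp are assumptions made for the example, not the scripts mentioned above.

```python
def level_to_cpp(name, rows):
    """Return a C++ char-array definition for a rectangular text level."""
    if not rows:
        raise ValueError("level has no rows")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")

    # width + 1 leaves room for the terminating NUL of each string literal.
    lines = ["static const char %s[%d][%d] = {" % (name, len(rows), width + 1)]
    for row in rows:
        # Escape backslashes first, then double quotes.
        escaped = row.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('\t"%s",' % escaped)
    lines.append("};")
    return "\n".join(lines) + "\n"


print(level_to_cpp("level1", ["#.#", "#@#"]))
```

Because the generator validates the input before emitting anything, a malformed level fails at build time instead of turning into a broken array at run time.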


You are probably dealing with different situations than I, though, so you'll have to compare the advantages to the disadvantages for your specific situation. Your system looks sufficiently complex to introduce additional maintenance (fixing bugs in the generator, taking care of corner-cases) and you'll need to ensure the output is correct (both compile-time and run-time), which is more difficult because you didn't write it directly (which is why C++ template error messages are so hard to read by default). If that's worth the increased productivity and flexibility, then that's great.
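
One cheap way to check the compile-time half is to run the generated file through the compiler's syntax-only mode right after generation. The sketch below assumes g++ (or another compiler that accepts -fsyntax-only, as GCC and Clang do) is on the PATH; the function name syntax_check is made up for the example.

```python
import shutil
import subprocess


def syntax_check(path, compiler="g++"):
    """Parse a generated source file without producing an object file.

    Returns None if the compiler is not installed, otherwise a tuple
    (ok, diagnostics) with the compiler's error output as text.
    """
    if shutil.which(compiler) is None:
        return None
    result = subprocess.run(
        [compiler, "-fsyntax-only", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode == 0, result.stderr
```

A check like this would presumably flag lines such as `cout << Hello << endl;` in the generated output posted above, where the string lost its quotes. It says nothing about run-time correctness, which still needs ordinary tests.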

At some point however, you're probably better off integrating a Python interpreter into your program, or calling C++ code from Python. If your code fluctuates that heavily, then C++ is probably not the right language for you - or you need to adapt your development style more to C++.


Having said that, I would keep things practical if I were you. Identify the problem cases and make a solution that works for those. Don't try to over-generalize at first, just make something that solves some actual problems and annoyances. You can always expand from there. Besides, you'll gain some practical, tangible experience.
Create-ivity - a game development blog Mouseover for more information.
It's unlikely that you'll be able to create a generator that produces optimal code in many circumstances. I realize however that comments like that can be highly motivational in an I'll-show-you kinda way :)

This sub-par efficiency might not bother you, but that means you've lost perhaps the only tangible benefit that C++ provides over other languages: speed. But if you're willing to sacrifice a little speed, why not use a more expressive language in the first place? Lisp and Haskell come to mind, particularly because higher order functions would solve a lot of your problems.

If you don't know them already, you'll have to learn a new language of course, but that perhaps is less of an issue than that of creating, maintaining and debugging the code generator. And will you understand either the generation or generated code a month or two down the line?
Thanks for the replies. I suppose 'preprocessor' would be the wrong word.

This script was really a mild amusement -- a just-for-fun sort of thing. After playing around with it for a little while I did realize that the potential problems it introduces could very easily outweigh the benefits. It was kind of fun though.

I don't know how many might have looked at the code, but the 'stl_container_help' function is definitely ridiculous :D. It would cause more harm than good, since the programmer never really knows exactly what container he's working with!

As far as C++ annoying me, usually it only happens when I want to change the arguments passed to functions. I suppose with good class design and a liberal use of typedefs most of my troubles might go away.
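
On the comment in some_func wondering about a more reusable way to do keyword arguments: one possibility is a small factory that takes the C++ function name and its parameters in positional order and returns the Python-side wrapper. This is just a sketch; keyword_call and emit are made-up names, and the result is returned as a string so the template would wrap it in echo().

```python
def keyword_call(name, params):
    """Build a wrapper that turns keyword arguments into a positional call.

    params is an ordered list of (parameter_name, default) pairs matching
    the C++ function's signature.
    """
    known = [param for param, _ in params]

    def emit(**kwargs):
        unknown = sorted(set(kwargs) - set(known))
        if unknown:
            raise TypeError("%s() got unexpected argument(s): %s"
                            % (name, ", ".join(unknown)))
        # Fill each position from the keyword if given, else the default.
        args = [str(kwargs.get(param, default)) for param, default in params]
        return "%s(%s)" % (name, ", ".join(args))

    return emit


some_func = keyword_call("some_func", [("a", 0), ("b", 1), ("c", 2)])
print(some_func(b=15))
print(some_func(c=9, a=12))
```

Adding a parameter to the C++ function then means changing one list instead of every call site, which is the annoyance described above.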
Just for fun is always a cool thing, especially such meta coding [smile]

The only criticism from my side is: you said the cpp (the C preprocessor) is very limited, but forgot to mention C++ templates. How would your approach compare to them w.r.t. limitations?
Just a small addition: this topic reminded me of pygccxml, a Python library that can parse C++ code and generate a heap of information on it.
You can already generate C++ code from C++ expressions using expression templates.

You can write any Domain-Specific Embedded Language in C++. A parser generator for example, Boost.Spirit being the best known one.
Quote: Original post by loufoque
You can already generate C++ code from C++ expressions using expression templates.

You can write any Domain-Specific Embedded Language in C++. A parser generator for example, Boost.Spirit being the best known one.


You may not be able to write just any embedded language, but you can solve any problem at compile time, though the syntax can get really obscure.

Btw, there is an implementation of IEEE floating point math by Thierry Berger-Perrin, who even wrote a toy compile time ray tracer: metafloat, compile-time ray tracing.

This topic is closed to new replies.
