C++ Template Metaprogramming Problem

Started by
17 comments, last by Extrarius 18 years, 5 months ago
I'm working on implementing a rather complicated metaprogramming system in C++ using a system quite a bit like that used in boost (with metafunction classes that have a templated Apply function that defines a Result type), but I'm not sure how to implement something using that methodology: Basically, I want a metafunction class BuildString which takes a character as a template parameter. This BuildString<char Character> then defines a templated Apply function which defines two things: It defines a Result type equil to it's template parameter ('identity function'), and it defines a static member of type std::string that is initialized to std::string(1, Character). So far, so good. I have exactly that:
template <char Character>
struct BuildString
{
	template<class Param1>
	struct Apply
	{
		typedef Param1 Result;
		static const std::string Output;
	};
};
template <char Character>
template <class Param1>
const std::string BuildString<Character>::Apply<Param1>::Output = std::string(1, Character);


It works perfect, as far as I can tell (BuildString<'A'>::Apply<void>::Output is "A"). Unfortunately, that's not all I need it do to. I also want to specialize the definition of Output such that if Apply's Param1 is a BuildString<NestedChar>::Apply<Anything>::Result, instead of being a single character, Output will be std::string(1, Character) + BuildString<NestedChar>::Apply<Anything>::OutputString so I can nest them to build up a string into the Output of the outermost BuildString. Is it at all possible to do this? I've tried quite a few different things and I can't get any of them to work (some compile but don't have expected result - it always just uses a single character for Output). If there is another way to build up strings a character at a time that will fit with this ::Apply<> metafunction framework, I wouldn't mind having to use something completely different (though something easy to use would be preferred), as long as I can build up strings a character at a time with the character being a template function. Also, just as an aside, does the standard allow limits on things like this? I know there can be a recursion limit, but what about when you're not actually iterating, such as BuildString<'A'>::Apply<BuildString<'B'>::Apply<BuildString<'C'>::Apply<Q>::Result>::Result>::Output etc replacing Q with more buildstring metacalls?
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Advertisement
Eh? You want to concatenate literals at compile-time and have the result behave like a std::string? You won't be able to avoid a run-time std::string ctor call as far as I can figure, anyway. Why not just do the usual things for concatenation and wrap the whole thing in the ctor call? I.e. const char * bar = "bar"; std::string("foo" bar "baz");
The reason I need this is that I need to be able to integrate it with a rather large library I'm working on (for fun =-) to do a LOT of stuff using template metaprogramming. The library can already do lots of awesome stuff (well, theoretically), but there isn't a way to output the result all the processing it does. If I had this, I could make everything start with "" for output and 'bubble' output up through the whole system to the final result.

I don't care how it actually gets the result to the top (As long as it's not through a single global, because the library will need different instantiations to create different strings), but I have a whole system of classes fitting the pattern of BuildString (normal classes with a templated Apply that defines a type Result), and the idea is that putting a BuildString<'C'> into the chain somewhere it should add 'C' to the growing string so that at the end, the string will contain the proper characters in the proper order.

The only reason I'm using std::string is because I figured it'd be easiest to use it's simple operator+ to do the concatenation instead of defining some kind of crazy StringCat metafunction that would work with fixed-sized character arrays.

If you have any idea at all how to acomplish this goal (have a template metaprogram return a string that is modified by multiple metafunctions), feel free to mention any alternatives =-)

If you want to know more about the kind of model I'm using, you can look at Boost::MPL, especially the tutorial. Basically, the main difference is that instead of apply I use Apply and instead of ::type I use ::Result
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote:Original post by Extrarius
I also want to specialize the definition of Output such that if Apply's Param1 is a BuildString<NestedChar>::Apply<Anything>::Result, instead of being a single character, Output will be std::string(1, Character) + BuildString<NestedChar>::Apply<Anything>::OutputString so I can nest them to build up a string into the Output of the outermost BuildString.

Don't use class-static const strings here, since the order of construction between them is not fully defined. Prefer static const objects in member functions or just locals returned by value. Even better, logically concatenate them at compile-time and put them into a char array of appropriate length, initialized with an initializer list.

As well, watch out for nested template instantiation depth. The standard recommends that compilers have at least a depth of 17, though most compilers do go much deeper than that. Still, keeping it low can have a noticeable impact on compile-times.

Really though, you're much better off using preprocessor metaprogramming to generate a variadic-style template parameter list and then pass all of your characters to that (i.e. instantiated like string< 'a', 'b', 'c', 'd' >). Then define metafunctions for concatenating two strings, etc. However, why build that from scratch when you can use a Boost.MPL vector_c of characters inside the implementation of your compile-time string type? Even if you are just doing it for the sake of learning, I really wouldn't recommend just working entirely from scratch as it's all pretty simple stuff, just tedious to implement, becoming more of a project in preprocessor metaprogramming if you want to do it correctly rather than template metaprogramming.
Quote:Original post by Polymorphic OOP
[...]As well, watch out for nested template instantiation depth. The standard recommends that compilers have at least a depth of 17, though most compilers do go much deeper than that. Still, keeping it low can have a noticeable impact on compile-times.[...]

By this, do you mean that with a class like so:
struct I{	template<class Param1>	struct Apply	{		typedef Param1 Result;	};};
(a simple identity metafunction) that there is a maximum number of times I can do the following:
typedef X Type1;typedef I::Apply<Type1>::Result Type2;typedef I::Apply<Type3>::Result Type3;typedef I::Apply<Type4>::Result Type4;//... etc ... for TypeN
So that I couldn't (depending on the compiler) have a Type65536 declared in such a way (presuming the compiler doesn't 'optimize' the iteration out and just assign them all type X)? I know that a recursion limit is allowed, but I was really hoping I could do the above as many times as I want and still be assured that every standards compliant compiler will correctly compile the code.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
I wouldn't think you'd have much luck using std::string for compile time concatenation, unfortunately. It's hard to imagine any substitute for this either.

Though I probably don't know any more than you do in this area. There's my 2c anyway.

Edit: Wow believe it or not, I had this window open so long (behind others), that at the time I posted, no one else had. It's good to know that I was on the right track anyway.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
In case you're wondering what I'm trying to do, I'm trying to implement a turing complete language entirely via templates (which shouldn't be possible since the template language is supposedly not turing complete). The language I'm trying to implement is a minimalist version of Unlamda[1], leaving out the D and C functions since I'm not sure how to implement them via templates (or if it's possible). I do, however, have the SKI combinator trio implemented, and for the most part they work - I can construct an expression and get a (usually correct) result by using typeid().name() to output the final value. This is an unsatisfactory output method, though, because it is very difficult to read (end up with things like "struct S::Apply(struct I)::Result::Apply(struct I)::Result" for (S I I) because of currying) and also because it's entirely compiler-dependant (and might just return an empty string iirc) so it can't actually be used to calculate something at compile time that is later used at run time (the recursion limit eliminates the possibility of walking the answer like a tree/list to convert types into simple strings - I thought it didn't apply to the nested approach but I'm not sure now since Polymorphic OOP hasn't replied to my question).

If, however, I could get a build-string-char-by-char function working, then I could use it to build usable output in the same way that Unlambda[1]s dot-char function works (in fact, I could then translate many unlambda program straight to template metacode).

[1] http://en.wikipedia.org/wiki/Unlambda

-Extrarius
Quote:Original post by Extrarius
By this, do you mean that with a class like so:

It means that the standard recommends that compilers allow you to recursively instantiate a metafunction at least 17 times (only a recommendation, most go much beyond that, but still something to consider). Anyway, the main point is, don't rely on instantiating a template 50 times recursively just to make a 50 character long string, regardless of what your compiler can handle. You're just going to end up with an extremely complex instantiated type and, depending on how you further work with the type, you're probably going to have high complexity for many operations. If you just use preprocessor metaprogramming to generate a template parameter list of a fairly large size and use that, then make other intrinsic metafunctions for dealing with the string instantiations you'll get a much easier template to work with and one which will most-likely compile much faster.

But really, if you went that route, you might as well just use Boost.MPL.

Quote:Original post by Extrarius
In case you're wondering what I'm trying to do, I'm trying to implement a turing complete language entirely via templates (which shouldn't be possible since the template language is supposedly not turing complete).

Actually, templates are turing complete.

Quote:Original post by Extrarius
The language I'm trying to implement is a minimalist version of Unlamda

In that case have fun, and continue on with whatever choices are appropriate to reach the paradigm you want. Just keep in mind that template facilities aren't best suited for this style of programming so you'll probably encounter long compile-times and possibly reach compiler limits along the way.

Quote:Original post by Extrarius
I can construct an expression and get a (usually correct) result by using typeid().name() to output the final value.

Yeah, definately don't do that. Perform everything logically at compile-time and then instantiate one type to represent the results of arbitrarily complex operations performed at compile-time when requested. Don't use, for instance, runtime string concatenation facilities as you described in your previous post, just build one single runtime string when requested, initialized strictly with data already concatenated at compile-time, as the information is all known at compile-time anyway.

As a side note, if you are just using the string at runtime and are using runtime string concatenation facilities anyway, then there is no reason for you to be using template metaprogramming to begin with, as expression templates would generally provide a superior alternative. Use template metaprogramming if the result is needed at compile-time, such as to calculate a value needed to instantiate another template or to create a complex type from data which needs to be known at compile-time. Here it just sounds like you want all of the results at runtime.

Quote:Original post by Extrarius
If, however, I could get a build-string-char-by-char function working, then I could use it to build usable output in the same way that Unlambda[1]s dot-char function works (in fact, I could then translate many unlambda program straight to template metacode).

You can do that, an by all means, don't let the recommended compiler limit stop you (again, most compilers go well beyond it). I wouldn't wory about the limit much as I doubt you'll be using it much for extremely complex data, but its still something to consider and shows that templates generalyl aren't best-suited for this particular type of metaprogramming. Anyway, it's more compile-times and ease of use that I'd worry about since the syntax is going to start looking hairy pretty quickly. It's not exactly a style of programming that I'd want to use with templates.
Quote:Original post by Polymorphic OOP
[...]Actually, templates are turing complete.[...]
Since a recursion limit is allowed, templates CAN be turing complete (if your specific compiler has no recursion limit), but they don't HAVE to be. Thus, by the standard, you can't count on templates being turing complete.

Quote:[...]just build one single runtime string when requested, initialized strictly with data already concatenated at compile-time, as the information is all known at compile-time anyway.[...]
I'm not sure how to do it this way any more than any other way. I would greatly appreciate a more detailed explanation. Realize, though, that whatever method I use must be embeddable within a chain of combinators (and it should act like the Identity Combinator 'I' that I posted above, as far as how it defines it's Result type).

Quote:[...]the syntax is going to start looking hairy pretty quickly. It's not exactly a style of programming that I'd want to use with templates.
Not really:
(S I I (S I I)) becomes:#define Eval(F,A) F::Apply<A>::Resulttypedef Eval(Eval(S, I), I) SII;typedef Eval(SII, SII) SII_SII;
Not pretty, but not really that bad as far as template metaprogramming goes. Of course, it won't be pretty for functions that do any real computation, but neither is the actual combinator language or it's base, lambda calculus. It's not part of the specification to be pretty =-)

If it always gave the correct answer, it'd be pretty impressive (IMO anyways).
(S I I (S I I)) should evaluate to (S I I (S I I)) - it can't be simplified and after a few evaluations you end up with the same thing you started with - but for some reason it comes back as (S I I) despite the fact that many other test cases return the proper answers. I expected it to fail to compile (recursion limit), but the incorrect result is extremely puzzling:

(S I I (S I I)) => {(S x y z) means ((x z) (y z))}
(I (S I I) (I (S I I)) => Outer=(S I I), Inner=(S I I) {Verified up to here}
(Outer Inner) ?? => ?? (S I I)
I'm not sure what causes the last step to evaluate to (S I I)
Is this caused by a recursion limit? I thought that when the limit was reached, the compiler would emit an error of some kind. If incorrect behavior is allowed, it's quite puzzling.

Should I post my definition of metafunction classes S and K ('Combinator I' was posted earlier)? Anybody interested in helping me debug some very simple template code that would be turing complete without recursion limits?

[Edited by - Extrarius on November 4, 2005 5:12:57 PM]
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote:Original post by Extrarius
Since a recursion limit is allowed, templates CAN be turing complete (if your specific compiler has no recursion limit), but they don't HAVE to be. Thus, by the standard, you can't count on templates being turing complete.

Not at all. By standard you can consider the turing complete. There are always going to be limits for tons of features in any language implementation, not just templates in C++; that doesn't make them any less turing complete, acknowledging them is just being realistic. The C++ standard, for instance, has dozens of recommendations for minimum compiler limits in other areas of the language -- that doesn't mean that going beyond them is nonstandard code, they are just recommendations and don't imply compliance on any level.

Templates are turing complete, it's just that a compiler is generally going to have limits, just like there are often limits for many other features of any language. They can still theoretically be recursively instantiated forever, whether your compiler can handle it or not is another story. The standard merely acknowledges that fact and provides reasonable guidelines for an implementor to follow as recommendations.

A compiler for any functional language may have a limit to how deep recursion goes, that doesn't make the language any less turing complete, again, it's just being realistic.

Quote:Original post by Extrarius
I'm not sure how to do it this way any more than any other way. I would greatly appreciate a more detailed explanation. Realize, though, that whatever method I use must be embeddable within a chain of combinators (and it should act like the Identity Combinator 'I' that I posted above, as far as how it defines it's Result type).

The most easily optimized way, but having limitation based on a size you would personally decide, would be to create a static const char array large enough to hold the entire string, and then just use an initializer list in the initialization of the char array, with each element being a compile-time constant taken from the template instantiations, thereby making the array initialized prior to any constructors for any types, therefore avoiding problems with construction order.

Short of that, I'd recommend using a static const char array reference inside of a function which is initialized to a static const char array created inside of another function. Inside of the function which directly has the static char array, use template metaprogramming to unravel a simple loop which just copies the elements of the compile-time string into the static char array. Since the local array reference in the other function is also static, the function doing the initialization is only called once, and since the initialization is based off of all compile-time constants and since you are just interfacing with it through a reference, there's going to be minimal overhead. As well, since you are using static arrays inside of functions, you also avoid any construction order problems.

If you want the result to be a ::std::string, just initialize the string with the char array in the above examples. Still, you shouldn't ever be resorting to fully runtime concatenations considering its based around template metaprogramming.

Quote:Original post by Extrarius
I thought that when the limit was reached, the compiler would emit an error of some kind.

It will emit an error if it's reached. I didn't look through your logic, but if you are having problems it's most-likely do to a mistake in code.

This topic is closed to new replies.

Advertisement