How to store data for reflection in preprocessor stage

11 comments, last by Randy Gaul 9 years, 1 month ago
Olof Hedman has the right idea. C++ lets you approach reflection in a unique way: do (almost) everything at compile time.
I've experimented a lot with that in a couple of pet projects, as well as in some production code at my previous company. The solution I ended up liking the most is to describe everything at compile time using template specialization, much like type traits. Judging from the various proposals for compile-time reflection in the C++ standard that I've seen, other people have used similar approaches, since what they propose works in a broadly similar way.
I'm going to oversimplify this, but this approach is very extensible. I've used it to automatically generate language bindings completely at compilation time, to serialize graphs of objects, and a few other things.

Basically, I have a reflection::traits template that I specialize to describe every item I want my reflection system to describe. It is defined like so:


namespace reflection
{
    template< typename T > struct traits {};
}

I then have a specialization of it for each thing I want to describe, containing static methods, typedefs, etc., depending on what kind of thing I'm describing.

For instance if I have a class named Foo, I'll have a specialization of the reflection template that looks like this:

template<> struct reflection::traits< Foo >
{
    static const char* Name() { return "Foo"; }
};
At runtime, I can now get the name of class Foo by calling this: reflection::traits< Foo >::Name()
Of course, just getting the name isn't really useful. What I really want to do is enumerate all the members. I do it using a compile-time visitor pattern. It would probably be possible to use a more functional programming style, using tuples or something similar, but I find the visitor approach less daunting.
In my previous company we only used this to serialize things, so I was doing something like this to describe member variables:

template<> struct reflection::traits< Foo >
{
  static const char* Name() { return "Foo"; }

  // Calls visitor.memberVar once per member variable, passing its name
  // and a pointer to member.
  template< typename V > static void accept( V& visitor )
  {
    visitor.memberVar( "Blah", &Foo::m_Blah );
    visitor.memberVar( "Thing", &Foo::m_Thing );
  }
};

It was all done using a bunch of macros so it looked something like this:


REFLECTION_START( Foo )
    CLASS_MEMBER_VAR( Blah )
    CLASS_MEMBER_VAR( Thing )
REFLECTION_END

The reflection::traits specialization had to be declared a friend of the class, so I had another macro for that. It is the only thing I need to insert into the definition of my classes. Other than that, this approach is non-intrusive: all the reflection stuff lives completely on the side.
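For readers wondering what such macros could look like, here is a minimal sketch. The post does not show the actual definitions, so the macro bodies, the REFLECTION_FRIEND name, the m_ prefix convention and the DumpVisitor helper are all assumptions made for illustration. The macros reopen namespace reflection, so in this sketch they have to be used at global scope.

```cpp
#include <sstream>
#include <string>

namespace reflection
{
    template< typename T > struct traits {};
}

// Hypothetical macro bodies (not from the original post).
#define REFLECTION_START( ClassName )                              \
    namespace reflection {                                         \
    template<> struct traits< ClassName >                          \
    {                                                              \
        typedef ClassName Class;                                   \
        static const char* Name() { return #ClassName; }           \
        template< typename V > static void accept( V& visitor )    \
        {

// Assumes members are named m_<VarName>, as in the post's example.
#define CLASS_MEMBER_VAR( VarName )                                \
            visitor.memberVar( #VarName, &Class::m_##VarName );

#define REFLECTION_END                                             \
        }                                                          \
    };                                                             \
    }

// Hypothetical name for the "other macro": goes inside the class so the
// traits specialization can reach private members.
#define REFLECTION_FRIEND( ClassName )                             \
    friend struct reflection::traits< ClassName >;

class Foo
{
    REFLECTION_FRIEND( Foo )
public:
    Foo( int blah, float thing ) : m_Blah( blah ), m_Thing( thing ) {}
private:
    int   m_Blah;
    float m_Thing;
};

REFLECTION_START( Foo )
    CLASS_MEMBER_VAR( Blah )
    CLASS_MEMBER_VAR( Thing )
REFLECTION_END

// Small visitor used only to exercise the macros: collects "name=value;".
struct DumpVisitor
{
    explicit DumpVisitor( const Foo& object ) : m_object( object ) {}

    template< typename T > void memberVar( const char* pName, T Foo::* mpVar )
    {
        m_text << pName << "=" << m_object.*mpVar << ";";
    }

    const Foo& m_object;
    std::ostringstream m_text;
};

std::string DumpFoo( const Foo& foo )
{
    DumpVisitor v( foo );
    reflection::traits< Foo >::accept( v );
    return v.m_text.str();
}
```

Calling DumpFoo( Foo( 3, 1.5f ) ) walks both private members through the friend declaration, without Foo itself containing any other reflection code.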

It is possible to do much more complicated things, though: just create a dummy type for each member variable / property and create specific specializations of reflection::traits for those, where you can then put whatever you need (like pointers to setter/getter member functions).
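As a rough illustration of the dummy-type idea, the sketch below describes one property through its own traits specialization. The Player class, the Player_Health_tag type, the Getter/Setter functions and the property<>() visitor method are hypothetical names chosen for this example, not something taken from the post.

```cpp
namespace reflection
{
    template< typename T > struct traits {};
}

class Player
{
public:
    Player() : m_health( 100 ) {}
    int  GetHealth() const      { return m_health; }
    void SetHealth( int value ) { m_health = value; }
private:
    int m_health;
};

// Hypothetical dummy type: it exists only to identify the "Health" property.
struct Player_Health_tag {};

namespace reflection
{
    // Everything known about the property hangs off the tag's traits.
    template<> struct traits< Player_Health_tag >
    {
        typedef int  ( Player::*GetterPtr )() const;
        typedef void ( Player::*SetterPtr )( int );

        static const char* Name()   { return "Health"; }
        static GetterPtr   Getter() { return &Player::GetHealth; }
        static SetterPtr   Setter() { return &Player::SetHealth; }
    };

    template<> struct traits< Player >
    {
        static const char* Name() { return "Player"; }

        // The visitor receives the tag type and looks up its traits itself.
        template< typename V > static void accept( V& visitor )
        {
            visitor.template property< Player_Health_tag >();
        }
    };
}

// Example visitor: assigns the same value to every property it is shown.
struct SetAllVisitor
{
    SetAllVisitor( Player& object, int value ) :
        m_object( object ),
        m_value( value )
    {}

    template< typename Tag > void property()
    {
        typedef reflection::traits< Tag > P;
        ( m_object.*P::Setter() )( m_value );
    }

    Player& m_object;
    int     m_value;
};

int SetAndReadHealth( int value )
{
    Player p;
    SetAllVisitor v( p, value );
    reflection::traits< Player >::accept( v );
    return ( p.*reflection::traits< Player_Health_tag >::Getter() )();
}
```

Because the visitor only ever sees the tag type, new kinds of information can be added to the property's traits without changing the accept function.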

Likewise, macros are just one way to go about it. On my pet project I do more complicated things, so I have a generator that generates all those templates from descriptions written in a simple language (I just don't like to insert annotations in the C++ code itself; I think it's both more complicated and less neat).

Then I can for instance print the member variables of any class described using this system by doing something like this:

#include <iostream>

template< class C >
class PrintObjectVisitor
{
public:
    PrintObjectVisitor( const C& object, std::ostream& output ) :
        m_object( object ),
        m_output( output )
    {}

    // Called once per member variable by reflection::traits< C >::accept.
    template< typename T > void memberVar( const char* pName, T C::* mpVar )
    {
        m_output << "  " << pName << ": " << m_object.*mpVar << "\n";
    }

private:
    const C& m_object;
    std::ostream& m_output;
};


template< typename C >
void PrintObject( const C& obj )
{
    PrintObjectVisitor< C > v( obj, std::cout );
    reflection::traits< C >::accept( v );
}
The visitor can have methods besides "memberVar" to describe other aspects of the class, using template functions to pass along the required type (so that the visitor can then use reflection on that type in the same way and so on). For instance, you could have a method to tell the visitor about the superclasses of the class. It can then recursively visit them to print their member variables too.
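The superclass case could be sketched roughly as follows. The superClass<>() method name, the Base/Derived types and the Describe helper are assumptions made for this example. Note that memberVar is templated on the owning class as well, so the same visitor accepts pointers to members of a base class.

```cpp
#include <sstream>
#include <string>

namespace reflection
{
    template< typename T > struct traits {};
}

struct Base           { int   m_Id; };
struct Derived : Base { float m_Speed; };

namespace reflection
{
    template<> struct traits< Base >
    {
        template< typename V > static void accept( V& visitor )
        {
            visitor.memberVar( "Id", &Base::m_Id );
        }
    };

    template<> struct traits< Derived >
    {
        template< typename V > static void accept( V& visitor )
        {
            // Hypothetical method: tells the visitor about a superclass.
            visitor.template superClass< Base >();
            visitor.memberVar( "Speed", &Derived::m_Speed );
        }
    };
}

template< class C >
class PrintVisitor
{
public:
    PrintVisitor( const C& object, std::ostream& output ) :
        m_object( object ),
        m_output( output )
    {}

    // Owner may be C or any of its base classes.
    template< typename T, typename Owner >
    void memberVar( const char* pName, T Owner::* mpVar )
    {
        m_output << pName << ": " << m_object.*mpVar << "\n";
    }

    // Recurse into the superclass description with the same visitor.
    template< typename B > void superClass()
    {
        reflection::traits< B >::accept( *this );
    }

private:
    const C& m_object;
    std::ostream& m_output;
};

template< typename C >
std::string Describe( const C& obj )
{
    std::ostringstream out;
    PrintVisitor< C > v( obj, out );
    reflection::traits< C >::accept( v );
    return out.str();
}
```

Describing a Derived object lists the inherited m_Id first and then m_Speed, because accept visits the superclass before the class's own members.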
You can use this to attach reflection information and implement visitor patterns for things other than classes. For namespaces, I just create a dummy type in the namespace:

namespace Bar
{
    struct reflection_tag {};
}
Then specialize reflection::traits for "Bar::reflection_tag" to describe reflection information about that namespace, including a function that goes through all the classes and nested namespaces it contains using a compile-time visitor like the one above.
Likewise, I create dummy structs in the class reflection traits to identify methods, properties and so on, and then specialize reflection::traits for those to describe everything I need to know about them, depending on what I need to do.
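A minimal sketch of the namespace case, assuming a visitor method called classType<>() and two placeholder classes (Apple and Pear); none of these names come from the post.

```cpp
#include <string>
#include <vector>

namespace reflection
{
    template< typename T > struct traits {};
}

namespace Bar
{
    struct reflection_tag {};   // dummy type standing in for the namespace
    class Apple {};
    class Pear {};
}

namespace reflection
{
    template<> struct traits< Bar::Apple >
    {
        static const char* Name() { return "Apple"; }
    };

    template<> struct traits< Bar::Pear >
    {
        static const char* Name() { return "Pear"; }
    };

    // Describes namespace Bar itself through its tag type.
    template<> struct traits< Bar::reflection_tag >
    {
        static const char* Name() { return "Bar"; }

        template< typename V > static void accept( V& visitor )
        {
            visitor.template classType< Bar::Apple >();
            visitor.template classType< Bar::Pear >();
        }
    };
}

// Example visitor: collects the name of every class it is shown.
struct NameCollector
{
    template< typename C > void classType()
    {
        m_names.push_back( reflection::traits< C >::Name() );
    }

    std::vector< std::string > m_names;
};

std::vector< std::string > ClassesInBar()
{
    NameCollector v;
    reflection::traits< Bar::reflection_tag >::accept( v );
    return v.m_names;
}
```

A nested namespace could be handled the same way, with a second visitor method that receives the nested namespace's own tag type.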
The nice thing is that for most things, you pay no runtime cost for all that. The PrintObject function above, for instance, compiles into completely straightforward code that just prints each variable of the object, without any container lookup at runtime. Furthermore, you don't get a bunch of data you don't need compiled into your binaries. If you only need to serialize data into a binary blob, you don't need the class and variable names as long as you can identify the format version. I did that by using the same system to generate a CRC of all the class descriptions. That was enough for us since we used this only for network communication, and it let us make sure that both the client and the server could serialize and unserialize the exact same types. By the way, in a modern C++ compiler, computing CRCs like this could also be done completely at compile time using constexpr.
Another plus of this method is that you don't need to add stuff into the classes themselves; it lives completely on the side. You can make macros to build all those template specializations, which is what I did at my previous company. However, in my personal projects I'm doing more sophisticated things with this approach than just serialization (like scripting language bindings), so I wrote a tool that generates them. I didn't want to use a Qt- or Unreal-like approach of inserting special annotations throughout my C++ classes, though. It's just a matter of taste, but I find that messy. Instead, I have simple interface descriptions living in their own files, written in a simple description language that resembles a very simplified subset of C++, where I describe namespaces, classes, methods and properties. Then I have a tool that spits out a bunch of header files containing all those reflection::traits specializations, and from that I can generate serialization, language bindings and such entirely through a bunch of templates.
It's also possible to use all this to construct a more classical system that allows introspection at runtime, but I'd only use that for things like property editor UIs and the like.
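To give an idea of the constexpr remark, here is a small sketch that hashes a class description at compile time. It uses 32-bit FNV-1a as a stand-in for a CRC because it is much shorter to write. The Description() function and its string format are assumptions for illustration, and the loop inside a constexpr function needs C++14 or later.

```cpp
#include <cstdint>

// 32-bit FNV-1a, used here in place of a real CRC.
constexpr std::uint32_t Fnv1a( const char* text,
                               std::uint32_t hash = 2166136261u )
{
    while( *text != '\0' )
    {
        hash ^= static_cast< std::uint8_t >( *text++ );
        hash *= 16777619u;
    }
    return hash;
}

namespace reflection
{
    template< typename T > struct traits {};
}

struct Foo
{
    int   m_Blah;
    float m_Thing;
};

namespace reflection
{
    template<> struct traits< Foo >
    {
        // Hypothetical: a textual summary of the layout that both peers
        // must agree on. A generator or macro would normally produce it.
        static constexpr const char* Description()
        {
            return "Foo{Blah:int,Thing:float}";
        }
    };
}

// Format identifier for T, computable entirely at compile time.
template< typename T >
constexpr std::uint32_t FormatHash()
{
    return Fnv1a( reflection::traits< T >::Description() );
}

// These checks run in the compiler, not at runtime.
static_assert( Fnv1a( "" ) == 2166136261u,
               "empty input yields the FNV offset basis" );
static_assert( Fnv1a( "a" ) == 0xE40C292Cu,
               "known FNV-1a test vector" );
static_assert( FormatHash< Foo >() == Fnv1a( "Foo{Blah:int,Thing:float}" ),
               "hash of the description is a constant expression" );
```

A client and a server built from the same descriptions would get the same FormatHash value, so exchanging it at connection time is one way to detect a format mismatch.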
Great post, Zlodo, thank you.

I wouldn't bother doing a lot at compile time. I prefer to just load everything by explicitly calling functions. If you really don't like this for some reason you can instead use the constructor of an object at file scope to run data type registration code.

Example:

REGISTER_TYPE( int );

This would create an instance of a class right there at file scope. The constructor could be given the string "int", sizeof( int ), and anything else you want. When the program starts up, this information can be stored and used later.
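One way such a macro could be written is sketched below. The TypeInfo, TypeRegistry and TypeRegistrar names are hypothetical, since the post only shows the REGISTER_TYPE( int ) call. The registry sits behind a function-local static so that it exists before any registrar constructor runs, whatever the initialization order of the file-scope objects.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical record of what is known about a registered type.
struct TypeInfo
{
    std::string name;
    std::size_t size;
};

// Function-local static: constructed on first use, so it is safe to call
// from the constructors of file-scope objects in any translation unit.
std::map< std::string, TypeInfo >& TypeRegistry()
{
    static std::map< std::string, TypeInfo > registry;
    return registry;
}

// The constructor runs during static initialization, before main().
struct TypeRegistrar
{
    TypeRegistrar( const char* name, std::size_t size )
    {
        TypeInfo info = { name, size };
        TypeRegistry()[ name ] = info;
    }
};

// Two-level concatenation so that __LINE__ is expanded before pasting,
// giving each registrar object a distinct name within a file.
#define REGISTER_CONCAT_( a, b ) a##b
#define REGISTER_CONCAT( a, b ) REGISTER_CONCAT_( a, b )
#define REGISTER_TYPE( T ) \
    static TypeRegistrar REGISTER_CONCAT( s_typeRegistrar_, __LINE__ )( #T, sizeof( T ) )

REGISTER_TYPE( int );
REGISTER_TYPE( double );
```

By the time main() starts, TypeRegistry() holds an entry for each registered type, keyed by the name that was spelled in the macro call.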
