Maintenance-free Enum to String in Pure C++ with "Better Enums"

Published November 20, 2015 by Anton Bachin, posted by antron

Background

Enums are used in game programming to represent many different things, for example the states of a character or the possible directions of motion:

    enum State {Idle, Fidget, Walk, Scan, Attack};
    enum Direction {North, South, East, West};

During debugging, it would be useful to see "State: Fidget" printed in the debug console instead of a number, as in "State: 1". You might also need to serialize enums to JSON, YAML, or another format, and might prefer strings to numbers. Besides making the output more readable to humans, using strings in the serialization format makes it resistant to changes in the numeric values of the enum constants. Ideally, "Fidget" should still map to Fidget, even if new constants are declared and Fidget ends up having a different value than 1.

Unfortunately, C++ enums don't provide an easy way of converting their values to (and from) strings. So, developers have had to resort to solutions that are either difficult to maintain, such as hard-coded conversions, or that have restrictive and unappealing syntax, such as X macros.

Sometimes, developers have also chosen to use additional build tools to generate the necessary conversions automatically. Of course, this complicates the build process. Enums meant for input to these build tools usually have their own syntax and live in their own input files. The build tools require special handling in the Makefile or project files.
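To make the maintenance problem concrete, here is a minimal sketch of the hard-coded approach for the State enum above. The function names state_to_string and state_from_string are hypothetical, chosen only for this illustration. Every constant appears three times (in the enum, in the switch, and in the values list), and all three must be kept in sync by hand.

```cpp
#include <cassert>
#include <cstring>

enum State {Idle, Fidget, Walk, Scan, Attack};

// Hand-written conversion: each constant is repeated here and must be
// updated manually whenever the enum changes.
const char* state_to_string(State state)
{
    switch (state) {
        case Idle:   return "Idle";
        case Fidget: return "Fidget";
        case Walk:   return "Walk";
        case Scan:   return "Scan";
        case Attack: return "Attack";
    }
    return "(unknown)";
}

// The reverse direction needs yet another hand-maintained list of constants.
// Returns false and leaves result untouched if the name is not recognized.
bool state_from_string(const char* name, State& result)
{
    assert(name != nullptr);

    static const State values[] = {Idle, Fidget, Walk, Scan, Attack};
    for (State candidate : values) {
        if (std::strcmp(state_to_string(candidate), name) == 0) {
            result = candidate;
            return true;
        }
    }
    return false;
}
```

Adding a sixth state means touching the enum, the switch, and the values array. Forgetting any one of them compiles without an error and only shows up at run time.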

Pure C++ solution

It turns out to be possible to avoid all the above complications and generate fully reflective enums in pure C++. The declarations look like this:

    BETTER_ENUM(State, int, Idle, Fidget, Walk, Scan, Attack)
    BETTER_ENUM(Direction, int, North, South, East, West)

And can be used as:

    State state = State::Fidget;

    state._to_string();                     // "Fidget"
    std::cout << "state: " << state;        // Writes "state: Fidget"

    state = State::_from_string("Scan");    // State::Scan (3)

    // Usable in switch like a normal enum.
    switch (state) {
        case State::Idle:
            // ...
            break;

        // ...
    }

This is done using a few preprocessor and template tricks, which will be sketched out in the last part of the article.

Besides string conversions and stream I/O, it is also possible to iterate over the generated enums:

    for (Direction direction : Direction::_values())
        character.try_moving_in_direction(direction);

You can generate enums with sparse ranges and then easily count them:

    BETTER_ENUM(Flags, char, Allocated = 1, InUse = 2, Visited = 4, Unreachable = 8)

    Flags::_size();     // 4

If you are using C++11, you can even generate code based on the enums, because all the conversions and loops can be run at compile time using constexpr functions. It is easy, for example, to write a constexpr function that will compute the maximum value of an enum and make it available at compile time, even if the constants have arbitrary values and are not declared in increasing order.

I have packed the implementation of the macro into a library called Better Enums, which is available on GitHub. It is distributed under the BSD license, so you can do pretty much anything you want with it for free. The implementation consists of a single header file, so using it is as simple as adding enum.h to your project directory. Try it out and see if it solves your enum needs.
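The compile-time maximum mentioned above can be sketched as follows. To stay self-contained, this example does not include enum.h. It uses a plain enum and a hand-written array, priority_values, as a stand-in for the list of values a reflective enum would provide. The names Priority, priority_values, max_from, and max_value are all hypothetical. The scan is written recursively because C++11 constexpr functions are limited to a single return statement.

```cpp
#include <cstddef>

// Hypothetical enum with arbitrary values, not declared in increasing order.
enum Priority {Normal = 10, Urgent = 40, Low = 5, High = 20};

// Stand-in for the value list that a reflective enum would expose.
constexpr int priority_values[] = {Normal, Urgent, Low, High};

// Recursive helper: carries the best value seen so far through the array.
template <std::size_t N>
constexpr int max_from(const int (&values)[N], std::size_t index, int best)
{
    return index == N
        ? best
        : max_from(values, index + 1,
                   values[index] > best ? values[index] : best);
}

// Entry point: starts the scan with the first element as the initial best.
template <std::size_t N>
constexpr int max_value(const int (&values)[N])
{
    return max_from(values, 1, values[0]);
}

// The result is a constant expression, so it can be checked at compile time
// or used as an array bound.
static_assert(max_value(priority_values) == 40, "Urgent has the largest value");
int slots_by_priority[max_value(priority_values) + 1];
```

Because max_value is evaluated by the compiler, slots_by_priority is an ordinary fixed-size array with 41 entries, and the static_assert fails the build if the expected maximum ever changes.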

How it works

To convert between enum values and strings, it is necessary to generate a mapping between them. Better Enums does this by generating two arrays at compile time. For example, if you have this declaration:

    BETTER_ENUM(Direction, int, North = 1, South = 2, East = 4, West = 8)

The macro will expand to something like this:

    struct Direction {
        enum _Enum : int {North = 1, South = 2, East = 4, West = 8};

        static const int _values[] = {1, 2, 4, 8};
        static const char * const _names[] = {"North", "South", "East", "West"};

        int _value;

        // ...functions using the above declarations...
    };

Then, it's straightforward to do the conversions: look up the index of the value or string in _values or _names, and return the corresponding value or string in the other array. So, the question is how to generate the arrays.
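The lookup described above can be sketched with two parallel arrays and a linear search. This is only an illustration of the idea, not the library's actual code. The names direction_values, direction_names, direction_to_string, and direction_from_string are hypothetical, and the arrays are written out by hand here rather than generated by a macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum Direction : int {North = 1, South = 2, East = 4, West = 8};

// Parallel arrays: entry i of one array corresponds to entry i of the other.
const int direction_values[] = {1, 2, 4, 8};
const char* const direction_names[] = {"North", "South", "East", "West"};
const std::size_t direction_count =
    sizeof(direction_values) / sizeof(direction_values[0]);

// Value to string: find the index of the value, return the name at that index.
// Returns nullptr if the value is not one of the declared constants.
const char* direction_to_string(Direction direction)
{
    for (std::size_t i = 0; i < direction_count; ++i) {
        if (direction_values[i] == direction)
            return direction_names[i];
    }
    return nullptr;
}

// String to value: find the index of the name, return the value at that index.
// Returns false and leaves result untouched if the name is not recognized.
bool direction_from_string(const char* name, Direction& result)
{
    assert(name != nullptr);

    for (std::size_t i = 0; i < direction_count; ++i) {
        if (std::strcmp(direction_names[i], name) == 0) {
            result = static_cast<Direction>(direction_values[i]);
            return true;
        }
    }
    return false;
}
```

A linear search is enough for this sketch, since enums rarely have more than a few dozen constants, and it works equally well for sparse values such as 1, 2, 4, 8.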

The values array

The _values array is generated by referring to the constants of the internal enum _Enum. That part of the macro looks like this:

    static const int _values[] = {__VA_ARGS__};

which expands to

    static const int _values[] = {North = 1, South = 2, East = 4, West = 8};

This is almost a valid array declaration. The problem is the extra initializers such as "= 1". To deal with these, Better Enums defines a helper type whose purpose is to have an assignment operator, but ignore the value being assigned:

    template <typename T>
    struct _eat {
        T   _value;

        template <typename Any>
        _eat& operator =(Any value) { return *this; }   // Ignores its argument.

        explicit _eat(T value) : _value(value) { }      // Convert from T.
        operator T() const { return _value; }           // Convert to T.
    };

It is then possible to turn the initializers "= 1" into assignment expressions that have no effect:

    static const int _values[] =
        {(_eat<_Enum>)North = 1,
         (_eat<_Enum>)South = 2,
         (_eat<_Enum>)East = 4,
         (_eat<_Enum>)West = 8};
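Here is a compilable sketch of the same trick, written out by hand rather than produced by a macro. It differs from the pseudocode above in a few ways that are choices of this example. The helper is named eat and the enum is at namespace scope, because identifiers with leading underscores are reserved in some contexts. The members are also constexpr, with a const assignment operator, so that the array can be checked at compile time under C++11 rules.

```cpp
// Helper whose assignment operator accepts anything and discards it.
template <typename T>
struct eat {
    T value;

    // Convert from T. Explicit, so it is only used by the casts below.
    explicit constexpr eat(T v) : value(v) { }

    // Ignores its argument and yields the object unchanged. It is const so
    // that it can be constexpr in C++11 and can be called on a temporary.
    template <typename Any>
    constexpr const eat& operator =(Any) const { return *this; }

    // Convert back to T.
    constexpr operator T() const { return value; }
};

enum Direction : int {North = 1, South = 2, East = 4, West = 8};

// Each element is "cast the constant to eat, then assign a number to it".
// The cast binds tighter than "=", so the assignment goes to the eat object
// and is thrown away. What remains converts back to the constant's value.
constexpr int direction_values[] = {
    (eat<Direction>)North = 1,
    (eat<Direction>)South = 2,
    (eat<Direction>)East = 4,
    (eat<Direction>)West = 8
};

static_assert(direction_values[0] == 1, "North");
static_assert(direction_values[3] == 8, "West");

// The assigned number really is ignored: the result is North's own value.
static_assert(static_cast<int>((eat<Direction>)North = 99) == 1,
              "assignment has no effect");
```

The last static_assert shows why the trick is safe. The value in the array always comes from the enum constant itself, so the text after "=" only needs to be something that compiles as the right-hand side of an assignment.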

The strings array

For the strings array, Better Enums uses the preprocessor stringization operator (#), which expands __VA_ARGS__ to something like this:

    static const char * const _names[] =
        {"North = 1", "South = 2", "East = 4", "West = 8"};

We almost have the constant names as strings - we just need to trim off the initializers. Better Enums doesn't actually do that, however. It simply treats the whitespace characters and the equals sign as additional string terminators when doing comparisons against strings in the _names array. So, when looking at "North = 1", Better Enums sees only "North".
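The comparison described above can be sketched as follows. The macro STRINGIZE and the functions ends_name and names_match are hypothetical names for this illustration, and the exact set of characters the library treats as terminators is an assumption here (space, tab, newline, and the equals sign). STRINGIZE is applied to one declaration at a time.

```cpp
#include <cassert>
#include <cstddef>

// Stringize a single argument: STRINGIZE(North = 1) yields "North = 1".
#define STRINGIZE(argument) #argument

// True for characters that end the name part of a stringized declaration:
// the real terminator, whitespace, or the start of an initializer.
inline bool ends_name(char c)
{
    return c == '\0' || c == '=' || c == ' ' || c == '\t' || c == '\n';
}

// Compares a stringized declaration such as "North = 1" against a plain,
// null-terminated name such as "North", without trimming the declaration.
bool names_match(const char* declaration, const char* name)
{
    assert(declaration != nullptr && name != nullptr);

    std::size_t i = 0;
    while (!ends_name(declaration[i]) && name[i] != '\0') {
        if (declaration[i] != name[i])
            return false;
        ++i;
    }

    // Both must end at the same position, otherwise one is only a prefix
    // of the other (for example "Nort" or "Northern" against "North = 1").
    return ends_name(declaration[i]) && name[i] == '\0';
}
```

With this, names_match(STRINGIZE(North = 1), "North") is true, while "Nort" and "Northern" are rejected. The strings stored in the array never have to be modified or copied.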

Is it possible to do without a macro?

I don't believe so, for the reason that stringization (#) is the only way to convert a source code token to a string in pure C++. One top-level macro is therefore the minimum amount of macro overhead for any reflective enum library that generates conversions automatically.

Other considerations

The full macro implementation is, of course, somewhat more tedious and complicated than what is sketched out in this article. The complications arise mostly from supporting constexpr usage, dealing with static arrays, accounting for the quirks of various compilers, and factoring as much of the macro as possible out into a template for better compilation speed (templates don't need to be re-parsed when instantiated, but macro expansions do).

Nov. 22 2015: Clarified generated struct pseudocode to show that the size of the enum is set.

Comments

Hodgman
Neat!
November 20, 2015 12:30 AM
Servant of the Lord

Clever trick with the 'eat' struct; thanks for sharing!

November 20, 2015 01:39 AM
Aardvajk
I likey, thumbs up!
November 20, 2015 08:00 AM
ongamex92

Looks cool, but what about the compile times?

November 20, 2015 08:42 AM
rnlf_in_space

Hopefully we will get reflection in a future C++ standard; then this proposal would really make this all possible without macros. Let's hope for the best :-)

November 20, 2015 10:22 AM
antron

Thanks, all :)

@imoogiBG: The compile times are pretty good. For the library, I run comparisons, where I compare including iostream (and not using it) with including enum.h and declaring lots of enums using the macro. Depending on the compiler, you need to generate about 20-30 enums to take up as much time as handling iostream does. Clang is the fastest at the moment. So, while the macro is *much* slower than built-in enums, it is still much faster than parts of the standard library. You can see details here.

@rnlf: I do hope some combination of reflection proposals makes it into C++17.

For now, I "implemented" the enums portion of N4428, but the implementation still requires that the enum be declared with the macro. It's mostly just an exercise, and not very practical – but details are here.

November 20, 2015 01:40 PM
tookie

"it is also possible to iterate over the generated enums"
Great! Best library feature EVER!

Also, this can be a good solution to share enums with Lua

November 21, 2015 05:00 AM
swiftcoder
What software did you use to make the GIF on the GitHub page? I'm oddly fascinated by the typing-animation - it's attention grabbing fo sure.
November 21, 2015 06:25 PM
antron

@TookieToon: Thanks. I'd be curious to know how it turns out if you try it with Lua.

@swiftcoder: I used a couple scripts and LICEcap to capture the actual GIF, but I'm sure any capture program would work. I first wrote a little local web page with a single text area and JavaScript that listens to input events and converts them into a series of time index, letter pairs – this records my typing, and then outputs it as an OCaml list. I then wrote a little OCaml script to play those lists back, together with output, in the terminal. After that, I used LICEcap to make the actual GIF. I guess it's working :)

November 21, 2015 11:20 PM
SmkViper
One downside I see to this solution is it does not use the newer stronger enum system, which means you have the same issues the strong enum system was designed to prevent: lack of type-safety, unable to set the size of enum values, and so on.

That being said, the only way I can think of off the top of my head to fix this is to define the enum outside the struct, and give the struct a separate (but related) name. This gives you the advantages of a strong enum, with the downside being that your utility functions no longer reside "inside" the "enum" when trying to call them.
November 22, 2015 08:11 PM
antron

@SmkViper: TL;DR: The generated struct can be made to use enum class, and you can set the size either way. So, it doesn't have those issues :) The long version follows:

The generated struct is actually almost as type-safe as enum class by default. Wrapping in a struct already guarantees a lot of type safety on its own. The only "hole" is implicit conversions to int, which is a side effect of how I enable direct usage in switch. I address this hole in two ways:

First, this is a much lesser evil than being able to convert from int to enum. Even with this hole, you are still guaranteed the same constraints as enum class on how enum values are introduced. You just can't prevent their usage in contexts that don't expect an enum.

Second, in case that argument is not acceptable, there are some simple instructions you can follow to cause the macro to use an enum class internally instead. It then becomes equally as type-safe as enum class, but at the cost of having to write + characters in switch cases.

On top of that, I would argue that, in another way, this macro is more type-safe than either enum or enum class, since the generated struct does not have a default constructor unless you enable it. This prevents unintended zero-initialization or "initialization" with arbitrary values. It gives you more control over how and where enum values are introduced than built-in enums can provide, so you can be even more sure your enums are valid and invariants are being maintained.

In general, I take type safety very seriously. I decided these choices were the most balanced defaults, but of course, these are just my opinions, hence the ability to toggle them :) I'm also happy to hear feedback. If people tend to disagree, I will change the defaults.

And, you can set the size of the enums. That's what the second argument to the macro is doing :) In fact, you can't not set the size of the enum. The limitation is that you always have to choose it :) But I figured it's easy enough for people to just type "int" if they have no particular need for anything else.

The solution you propose with the side-by-side enum and struct is a good and valid one. I've been referring to it as the type traits approach, where the macro generates a regular C++ enum class, and next to it an instance of a template struct containing arrays and a bunch of functions. Instead of being related by name, however, they are related by the enum being the template argument of the struct. So, if you have an enum class State (declared through the macro), you can list its values by accessing enum_traits<State>::values(), etc. You don't lose much by not being able to call functions on instances of the enum – most of the generated functions are static either way.

I have some discussion in the docs as to why I chose (so far) not to do it that way, and there is also an old branch on GitHub that has a demonstration of what the traits implementation might look like. Note that this branch is really outdated relative to master in terms of overall quality. I still think that the question of wrapping in a struct vs. generating a struct alongside is an open one, though, and I'd be glad to hear any comments you have on it.

Both approaches, however, are able to provide a strict superset of the features of enum class, including in the area of type safety. In fact, if you are limited to C++98, the wrapping approach actually brings several enum class features.

I realized that the pseudocode struct in the article was a little misleading on the matter of size, so I updated it.

November 22, 2015 08:53 PM
Promit

I have my own crazy version, built off Boost.Preprocessor for compilers that can't do variadic macros. This version is cleaner in usage, maybe I'll tweak mine to work similarly.

November 23, 2015 04:47 AM
majo33

But iostream is not a great example of proper header file. It's 17k lines of code just for cin/cout/cerr (with static initialization). The iostream should be included in cpp file but not in header (headers should use iosfwd, which contains 1k lines of code).

November 24, 2015 09:14 AM

Converting between enums and strings and iterating over enums without unpleasant syntax or external tools.
