C++ enum names as strings
#1 Members - Reputation: 181
Posted 03 March 2007 - 06:31 AM
#2 Members - Reputation: 2407
Posted 03 March 2007 - 06:52 AM
You are correct that you can't do this in a built in way with C++.
When I needed something like this a couple of years ago, I wrote a little program that took a text file as input, for example say we had fruits.txt:
apples
oranges
pears
and generated a .h file like:
#ifndef fruits_H
#define fruits_H
enum fruits { apples,oranges,pears };
const char *fruits_str[]={ "apples","oranges","pears" };
#endif
Obviously this was a pretty trivial program to implement.
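A generator along those lines might look like the sketch below. This is not the original program; the function name generate_enum_header and its stream-based interface are made up for illustration, and it assumes one enumerant name per line with no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one enumerant name per line from `in` and writes
// a header containing `enum <name>` plus a parallel string table to `out`.
void generate_enum_header(const std::string& name, std::istream& in, std::ostream& out)
{
    // Collect the non-empty lines; each one becomes an enumerant.
    std::vector<std::string> items;
    std::string line;
    while (std::getline(in, line))
        if (!line.empty())
            items.push_back(line);

    out << "#ifndef " << name << "_H\n#define " << name << "_H\n";

    // The enum itself.
    out << "enum " << name << " { ";
    for (std::size_t i = 0; i < items.size(); ++i)
        out << (i ? "," : "") << items[i];
    out << " };\n";

    // The string table, in the same order as the enumerants.
    out << "const char *" << name << "_str[]={ ";
    for (std::size_t i = 0; i < items.size(); ++i)
        out << (i ? "," : "") << '"' << items[i] << '"';
    out << " };\n#endif\n";
}
```

Feeding it the three lines of fruits.txt with the name "fruits" reproduces the header shown above; in a real build you would pass it file streams instead of string streams.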
#4 Members - Reputation: 1008
Posted 03 March 2007 - 06:59 AM
Another approach is to overload operator<<
enum fruits
{
apples,
oranges,
pears
};
std::ostream& operator<<( std::ostream& os, const fruits& fruit )
{
switch( fruit )
{
case apples: os << "apples"; break;
case oranges: os << "oranges"; break;
case pears: os << "pears"; break;
}
return os;
}
I suppose you could write a program to generate this as well.
#5 Moderators - Reputation: 1340
Posted 03 March 2007 - 08:07 AM
Alternatives exist at the IDE-level. Using vim, Emacs, or Visual Studio you can use a script/macro that generates a string table from an enumeration automatically. Vim already has such a script here. It should be trivial to write one for Visual Studio using C#/VB.NET.
#6 Members - Reputation: 1900
Posted 03 March 2007 - 08:12 AM
std::map<Uint16,std::string> formats = boost::assign::map_list_of
(AUDIO_U8, "AUDIO_U8" )
(AUDIO_S8, "AUDIO_S8" )
(AUDIO_U16LSB, "AUDIO_U16LSB")
(AUDIO_S16LSB, "AUDIO_S16LSB")
(AUDIO_U16MSB, "AUDIO_U16MSB")
(AUDIO_S16MSB, "AUDIO_S16MSB")
(AUDIO_U16, "AUDIO_U16" )
(AUDIO_S16, "AUDIO_S16" )
(AUDIO_U16SYS, "AUDIO_U16SYS")
(AUDIO_S16SYS, "AUDIO_S16SYS");
Advantages of this approach include:
1. Easier to maintain than a switch statement
2. More flexible than overloading operator<<()
3. Enumerants can have arbitrary values (as opposed to the array method)
4. 'Invalid' enumerants won't lead to UB (as opposed to the array method)
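For readers without Boost, roughly the same thing can be done with plain std::map calls. The sketch below is only an illustration: the Format enum, its values, and the function names are invented stand-ins rather than anything from SDL. It shows points 3 and 4 in particular: the values are not contiguous, and an unknown value yields a fallback string rather than an out-of-bounds array read.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical enum with non-contiguous values, standing in for a third-party one.
enum Format { FormatA = 8, FormatB = 0x8008, FormatC = 16 };

// Built once on first use with plain insert() calls.
const std::map<int, std::string>& format_names()
{
    static std::map<int, std::string> names;
    if (names.empty()) {
        names.insert(std::make_pair(static_cast<int>(FormatA), std::string("FormatA")));
        names.insert(std::make_pair(static_cast<int>(FormatB), std::string("FormatB")));
        names.insert(std::make_pair(static_cast<int>(FormatC), std::string("FormatC")));
    }
    return names;
}

// Values that are not in the table get a fallback string.
std::string format_to_string(int value)
{
    const std::map<int, std::string>& names = format_names();
    std::map<int, std::string>::const_iterator it = names.find(value);
    return it == names.end() ? std::string("<unknown>") : it->second;
}
```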
#7 Members - Reputation: 3325
Posted 03 March 2007 - 05:58 PM
jyk's suggestion though is clearly superior in the face of enums which aren't simply 0->MAX.
#8 Members - Reputation: 1554
Posted 03 March 2007 - 06:58 PM
Quote:
Original post by EasilyConfused
You are correct that you can't do this in a built in way with C++.
When I needed something like this a couple of years ago, I wrote a little program that took a text file as input, for example say we had fruits.txt:
apples
oranges
pears
and generated a .h file like:
#ifndef fruits_H
#define fruits_H
enum fruits { apples,oranges,pears };
const char *fruits_str[]={ "apples","oranges","pears" };
#endif
Obviously this was a pretty trivial program to implement.
Here's a way to avoid putting more tools into your build chain (but it uses boost, and worse than templates --- it uses macros):
#include <boost/preprocessor.hpp>
#define SEQ (apples)(oranges)(pears)
#define TO_STR(unused,data,elem) BOOST_PP_STRINGIZE(elem) ,
enum fruits { BOOST_PP_SEQ_ENUM(SEQ) };
const char * fruit_strings[]={ BOOST_PP_SEQ_FOR_EACH(TO_STR,~,SEQ) };
#undef SEQ
#undef TO_STR
EDIT: Enterprisey-fied:
#include <boost/preprocessor.hpp>
#define PROJECT_PREFIX_DO_EVIL_TO_STR(unused,data,elem) BOOST_PP_STRINGIZE(elem) ,
#define PROJECT_PREFIX_DO_EVIL(enum_,strings,elements) \
enum enum_ { BOOST_PP_SEQ_ENUM(elements) }; \
const char * strings[]={ BOOST_PP_SEQ_FOR_EACH(PROJECT_PREFIX_DO_EVIL_TO_STR,~,elements) };
//-----------------------
PROJECT_PREFIX_DO_EVIL(fruit,fruit_strs,(apple)(pear)(pineapple))
PROJECT_PREFIX_DO_EVIL(animal,animal_strs,(dog)(cat)(panda)(monkey))
(edit/note: trailing whitespaces to preserve \s inserted)
#9 Members - Reputation: 468
Posted 03 March 2007 - 07:51 PM
Quote:
Original post by Simian Man
Another approach is to overload operator<<
enum fruits
{
apples,
oranges,
pears
};
std::ostream& operator<<( std::ostream& os, const fruits& fruit )
{
switch( fruit )
{
case apples: os << "apples"; break;
case oranges: os << "oranges"; break;
case pears: os << "pears"; break;
}
return os;
}
I suppose you could write a program to generate this as well.
Try that and include this header from several different cpp's... Now
take a hex-editor and search the generated exe for "apples" - the
entire lookup table will be there as often as you included it.
Regards
#10 Members - Reputation: 1360
Posted 03 March 2007 - 09:47 PM
Quote:
Original post by Muhammad Haggag
This is rather clever, even though a bit ugly.
Here is a simplified version:
#define DAYS_OF_THE_WEEK \
ENUM_OR_STRING( Sunday ), \
ENUM_OR_STRING( Monday ), \
ENUM_OR_STRING( Tuesday ), \
ENUM_OR_STRING( Wednesday ), \
ENUM_OR_STRING( Thursday ), \
ENUM_OR_STRING( Friday ), \
ENUM_OR_STRING( Saturday )
// Enum
#undef ENUM_OR_STRING
#define ENUM_OR_STRING( x ) x
enum DaysOfTheWeek
{
DAYS_OF_THE_WEEK
};
// Strings
#undef ENUM_OR_STRING
#define ENUM_OR_STRING( x ) #x
const char * DaysOfTheWeekStrings[] =
{
DAYS_OF_THE_WEEK
};
You could also put the elements in an include file instead of a macro:
---- DaysOfTheWeek.h
ENUM_OR_STRING( Sunday ),
ENUM_OR_STRING( Monday ),
ENUM_OR_STRING( Tuesday ),
ENUM_OR_STRING( Wednesday ),
ENUM_OR_STRING( Thursday ),
ENUM_OR_STRING( Friday ),
ENUM_OR_STRING( Saturday )
---- source file
// Enum
#undef ENUM_OR_STRING
#define ENUM_OR_STRING( x ) x
enum DaysOfTheWeek
{
#include "DaysOfTheWeek.h"
};
// Strings
#undef ENUM_OR_STRING
#define ENUM_OR_STRING( x ) #x
const char * DaysOfTheWeekStrings[] =
{
#include "DaysOfTheWeek.h"
};
John Bolton, Locomotive Games (THQ). Current Project: Destroy All Humans (Wii). IN STORES NOW!
#11 Members - Reputation: 2407
Posted 03 March 2007 - 10:59 PM
Quote:
Original post by Kitt3n
Try that and include this header from several different cpp's... Now
take a hex-editor and search the generated exe for "apples" - the
entire lookup table will be there as often as you included it.
Thinking about it, that would be equally true of the example I provided. I must have used that approach back in the early days when I used to just preprocess my entire program into one translation unit. [smile]
I guess to use my autogenerated sourcefiles approach properly, the program would have to generate a .h and .cpp file. Given the trouble and complexity this starts creating, I'd probably go more with something like jyk or JohnBolton or others have suggested.
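For what it's worth, the split itself is small. Here is a sketch of what the two generated files could contain, shown as one listing for brevity; the file names in the comments are only illustrative, and in a real project fruits.cpp would start by including fruits.h.

```cpp
#include <cassert>

// ---- fruits.h (generated): declarations only, safe to include from any
// number of translation units.
enum fruits { apples, oranges, pears };
extern const char* const fruits_str[];

// ---- fruits.cpp (generated): the one and only definition of the table.
// The extern declaration above must be visible here so that the const
// array gets external linkage.
const char* const fruits_str[] = { "apples", "oranges", "pears" };
```

With this layout the strings are defined in exactly one object file, no matter how many sources include the header.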
#12 Members - Reputation: 1900
Posted 04 March 2007 - 09:15 AM
std::map<Uint16,std::string> formats = boost::assign::map_list_of
(AUDIO_U8, "AUDIO_U8" )
(AUDIO_S8, "AUDIO_S8" )
(AUDIO_U16LSB, "AUDIO_U16LSB")
(AUDIO_S16LSB, "AUDIO_S16LSB")
(AUDIO_U16MSB, "AUDIO_U16MSB")
(AUDIO_S16MSB, "AUDIO_S16MSB")
(AUDIO_U16, "AUDIO_U16" )
(AUDIO_S16, "AUDIO_S16" )
(AUDIO_U16SYS, "AUDIO_U16SYS")
(AUDIO_S16SYS, "AUDIO_S16SYS");
Oops! Some of these enumerants are aliases for each other. The *SYS variants are aliases for the *SB variants as determined by the endianness of the system, while *16 aliases to *16LSB.
Furthermore, I didn't notice the error until I compared this version of the code with a version that inserted the map elements manually. Because map_list_of inserts the elements in a different order than the corresponding 'manual' code, the end effect of the aliasing was different in each case. Subtle :-|
Now, unless I'm missing something else obvious, this doesn't negate any of the advantages of this method (over arrays or switch statements) mentioned earlier. However, although using a map means you don't have to worry about the particular values of the enumerants, you do have to know if any of them alias each other. This is particularly important when dealing with a third-party API rather than your own code (I had to check the corresponding SDL header to know that *16 aliases *16LSB, which isn't necessarily intuitively obvious).
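To illustrate how aliased keys can make two seemingly equivalent ways of filling a std::map disagree, here is a small sketch. It is not a reconstruction of the code above, and BASE and ALIAS are made-up names: it only shows that insert() leaves an existing key alone, while assignment through operator[] overwrites it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical values: ALIAS deliberately shares the value of BASE.
enum { BASE = 16, ALIAS = BASE };

typedef std::map<int, std::string> NameMap;

// insert() does nothing if the key is already present, so the first name wins.
NameMap fill_with_insert()
{
    NameMap m;
    m.insert(std::make_pair(int(BASE), std::string("BASE")));
    m.insert(std::make_pair(int(ALIAS), std::string("ALIAS"))); // key exists: ignored
    return m;
}

// operator[] followed by assignment replaces the value, so the last name wins.
NameMap fill_with_subscript()
{
    NameMap m;
    m[BASE] = "BASE";
    m[ALIAS] = "ALIAS"; // overwrites "BASE"
    return m;
}
```

Both maps end up with a single entry for 16, but with different strings, which is exactly the kind of discrepancy that is easy to miss.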
#13 Moderators - Reputation: 1666
Posted 04 March 2007 - 07:56 PM
Quote:
Original post by jyk
However, although using a map means you don't have to worry about the particular values of the enumerants, you do have to know if any of them alias each other.
Which is why I would generate code rather than using tools like boost::assign. :)
#14 Members - Reputation: 1900
Posted 05 March 2007 - 04:45 AM
Quote:
Original post by Zahlman
Which is why I would generate code rather than using tools like boost::assign. :)
Sure, although I'd be interested to know how you'd apply this solution in the particular case that I presented.
Unless I've missed something, in both of the 'generated code' examples presented earlier in the thread the enums themselves are generated, not just the associated strings. Therefore the values are known, and furthermore the values are known to be sequential and zero-based. The corresponding strings are then stored in an array.
In the case I presented the enumerants come from a third-party library. Although one can of course examine the appropriate header file, let's say for the sake of argument that the values are not known. How would one automate the generation of associated strings in this case?
Replacing map with multimap in my previous example solves the problem and makes the solution robust in the presence of arbitrary values for the enumerants, including those that are aliases of each other. However, this comes at the cost of added complexity elsewhere in the code.
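A rough sketch of the multimap variant, with made-up enumerants (BASE, ALIAS, OTHER) standing in for the SDL ones. The added complexity mentioned above shows up in the lookup, which now returns every name registered for a value instead of a single string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical values: ALIAS is an alias of BASE.
enum { BASE = 16, ALIAS = BASE, OTHER = 32 };

typedef std::multimap<int, std::string> NameMultimap;

NameMultimap make_names()
{
    NameMultimap names;
    names.insert(std::make_pair(int(BASE), std::string("BASE")));
    names.insert(std::make_pair(int(ALIAS), std::string("ALIAS"))); // kept, not dropped
    names.insert(std::make_pair(int(OTHER), std::string("OTHER")));
    return names;
}

// Returns all names registered for `value` (possibly none).
std::vector<std::string> names_for(const NameMultimap& names, int value)
{
    std::vector<std::string> result;
    std::pair<NameMultimap::const_iterator, NameMultimap::const_iterator> range =
        names.equal_range(value);
    for (NameMultimap::const_iterator it = range.first; it != range.second; ++it)
        result.push_back(it->second);
    return result;
}
```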
So although I'm sure you're right, I'm not quite clear on how your suggestion to use code generation rather than 'naive' application of a map specifically addresses the problem I presented earlier. I'd certainly be interested in seeing a concrete example - you always seem to have interesting tricks up your sleeve :-)
#15 Members - Reputation: 1583
Posted 05 March 2007 - 06:10 AM
A generation tool can read the enumeration description from a file and output the correct display function. For instance:
// some defines in an "enum-helper" file
#define GEN_OUTPUT(X)
#define GEN_ALT_OUTPUT(X)
#define GEN_IGNORE
// ======================================================
// An enumeration definition, which is tagged
// to generate an output function
GEN_OUTPUT(generated.foo.hpp)
enum foo {
// use the default name and value
apples,
// set the numeric value ourselves
bananas = 35,
// set the displayed text ourselves
GEN_ALT_OUTPUT(green lemon) lime = 34,
// we don't want this value to be used
GEN_IGNORE ignored = 35
};
#include "generated.foo.hpp"
// ======================================================
// An example of generated file:
const char _foo_apples[] = "apples";
const char _foo_bananas[] = "bananas";
const char _foo_lime[] = "green lemon";
STATIC_ASSERT(apples != bananas);
STATIC_ASSERT(bananas != lime);
STATIC_ASSERT(apples != lime);
std::ostream& operator<<(std::ostream& out, const foo& f)
{
switch(f)
{
case apples: return out << _foo_apples;
case bananas: return out << _foo_bananas;
case lime: return out << _foo_lime;
default: assert(false); return out;
}
}
Optimizations would include transforming the switch statement into an if-else tree based on the likelihood of each enumeration value.
If enumeration values overlap, the generated operator would fail to compile. It's also possible to generate a set of static assertions, as illustrated above, which would cause failure in a cleaner way.
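As a sketch of what those assertions could look like with a newer compiler: STATIC_ASSERT above is presumably a project or Boost-style macro, whereas the version below assumes C++11, where static_assert is built in. The to_string function is a hypothetical stand-in for the generated operator.

```cpp
#include <cassert>
#include <string>

enum foo { apples, bananas = 35, lime = 34 };

// One assertion per pair of enumerants that must stay distinct. If someone
// later writes `lime = 35`, the build stops here with a readable message
// instead of a "duplicate case value" error inside the switch.
static_assert(apples != bananas, "apples and bananas share a value");
static_assert(bananas != lime, "bananas and lime share a value");
static_assert(apples != lime, "apples and lime share a value");

// Hypothetical generated function; unknown values get a fallback string.
std::string to_string(foo f)
{
    switch (f) {
    case apples:  return "apples";
    case bananas: return "bananas";
    case lime:    return "green lemon";
    }
    return "<unknown>";
}
```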
#16 Members - Reputation: 1900
Posted 05 March 2007 - 08:08 AM
Quote:
Original post by ToohrVyk
A generation tool can read the enumeration description from a file and output the correct display function. For instance:
*** Source Snippet Removed ***
Optimizations would include transforming the switch statement into an if-else tree based on the likelihood of each enumeration value.
If enumeration values overlap, the generated operator would fail to compile. It's also possible to generate a set of static assertions, as illustrated above, which would cause failure in a cleaner way.
Let me describe more clearly the particular case from which my example was drawn.
SDL uses a number of macros to identify various audio sample formats. Some are aliases for others, and some are switched based on the endianness of the platform. I'll 'paraphrase' the relevant portion of the header file here:
#define AUDIO_U8 < unique integer value >
#define AUDIO_S8 < unique integer value >
#define AUDIO_U16LSB < unique integer value >
#define AUDIO_S16LSB < unique integer value >
#define AUDIO_U16MSB < unique integer value >
#define AUDIO_S16MSB < unique integer value >
#define AUDIO_U16 AUDIO_U16LSB
#define AUDIO_S16 AUDIO_S16LSB
#if ENDIANNESS == LITTLE_ENDIAN
#define AUDIO_U16SYS AUDIO_U16LSB
#define AUDIO_S16SYS AUDIO_S16LSB
#else
#define AUDIO_U16SYS AUDIO_U16MSB
#define AUDIO_S16SYS AUDIO_S16MSB
#endif
#define DEFAULT_FORMAT AUDIO_S16SYS
The purpose of the code I posted is relatively simple: to associate with these values human-readable strings for output to a logging system. The log records, among other things, whether the requested specs match the queried specs, and what the requested format translated to on the system in question (e.g. DEFAULT_FORMAT becomes AUDIO_S16MSB on a PowerPC Mac).
The question then is how best to automate the generation of string representations for these values, and whether it's worth the trouble. It seems to me that none of the examples presented thus far, including the example you posted above, are directly applicable in this case without significant modification (the fact that the enumerants are macros is incidental - the same would be true were they elements of an enum).
I'm happy to concede the point based solely on your and Zahlman's considerable expertise, but I would still be interested in seeing how the proposed solutions could be applied here without undue difficulty.
#17 Moderators - Reputation: 1666
Posted 05 March 2007 - 09:34 PM
OK, let's say we have a header file with an enum (I'd rather not touch the problem of converting stupid #define usage into enumeration usage [wink] ):
#ifndef AUDIO_H
#define AUDIO_H
enum AUDIO {
AUDIO_U8 = < unique integer value >,
AUDIO_S8 = < unique integer value >,
AUDIO_U16LSB = < unique integer value >,
AUDIO_S16LSB = < unique integer value >,
AUDIO_U16MSB = < unique integer value >,
AUDIO_S16MSB = < unique integer value >,
AUDIO_U16 = AUDIO_U16LSB,
AUDIO_S16 = AUDIO_S16LSB,
#if ENDIANNESS == LITTLE_ENDIAN
AUDIO_U16SYS = AUDIO_U16LSB,
AUDIO_S16SYS = AUDIO_S16LSB,
#else
AUDIO_U16SYS = AUDIO_U16MSB,
AUDIO_S16SYS = AUDIO_S16MSB,
#endif
DEFAULT_FORMAT = AUDIO_S16SYS
};
#endif
Now, the aliasing problem is one that *can't* be resolved perfectly, for the simple reason that information is lost - a value of type AUDIO with value AUDIO_U16 is identical to a value of type AUDIO with value AUDIO_U16LSB, so there is no way to know which symbol was used in the source code. (After all, we can also create a variable of type AUDIO by reading in an int from a file and doing an explicit cast). However, let's say arbitrarily that we will resolve these problems by always stringizing an enum value according to the *first* enumerant in the enum with the appropriate value. Thus, in effect, the way to deal with aliased values is simply to *ignore* them ;)
(That is, a multimap doesn't help: there is no way to determine which value to select. So pragmatically, we have to just pick one, which takes us back to using a plain map. Incidentally, the scheme I propose automatically causes 'DEFAULT_FORMAT to become "AUDIO_S16MSB" on a PowerPC Mac', since that's the first enumerant with that value, so that's what will be used for the stringization. I suspect this simple heuristic will be best, really.)
We then write our script as follows:
- Invoke the preprocessor on the header file, i.e. ask the compiler what the
system endianness is ;)
- From the preprocessor's output, parse out the enum declaration.
- Initialize an empty associative array from integer values to strings.
- For each enumerant:
- Determine the int value.
- If it is not found in the associative array, add it, associating it with
the stringized version of the enumerant.
- Output code which initializes a std::map<int, const char*> with the
contents of our associative array, by iterating over our AA's keys and
generating corresponding map.insert() statements. (Better yet, write code
which wraps the whole thing up in a class. We can use a single class and
create a global static instance of it for each enum, and do the initialization
by clever use of operator chaining.)
Since implicit conversion does happen from the enumeration *to* an int, we can use enumeration values to look up the name in the std::map just fine.
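The middle steps of that script (build the first-wins table, then emit the insert statements) could be sketched as follows. The script could be written in any language; C++ is used here only to match the rest of the thread. The preprocessing and parsing steps are assumed to have already produced a list of (name, value) pairs, and emit_inserts and the Enumerant typedef are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// (name, value) as parsed from the preprocessed enum declaration.
typedef std::pair<std::string, int> Enumerant;

// Emits one insert() statement per distinct value, using the name of the
// first enumerant that had that value. Later aliases are ignored.
std::string emit_inserts(const std::string& map_name,
                         const std::vector<Enumerant>& enumerants)
{
    std::map<int, std::string> first_name;
    for (std::size_t i = 0; i < enumerants.size(); ++i)
        // map::insert is a no-op when the key already exists: first one wins.
        first_name.insert(std::make_pair(enumerants[i].second, enumerants[i].first));

    std::ostringstream out;
    for (std::map<int, std::string>::const_iterator it = first_name.begin();
         it != first_name.end(); ++it)
        out << map_name << ".insert(std::make_pair(" << it->first
            << ", \"" << it->second << "\"));\n";
    return out.str();
}
```

The emitted lines are meant to be compiled against a std::map<int, const char*>, as in the step list above.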
The class might look something like this - all completely off the top of my head at 4:30 AM, but damned if it doesn't look good to me right now ;)
template <typename E>
class Enumeration {
// Map values must always be string literals! We will never do any memory
// management here; we freely copy pointers, and at the end, the static
// section of the executable data is cleaned up in one go, and there are no
// leaks or double-deletes.
typedef const char* symbol;
typedef std::map<int, symbol> table_t;
table_t table;
static symbol nil;
int nextkey;
public:
// Public, because the free operator<< below needs to reach it.
static Enumeration instance;
// We will use operator chaining in order to initialize a single static
// instance that's accessible to preprocessor magics. :) No need to make this
// a Singleton; this class only gets instantiated by our auto-generated code.
// Callers don't need to know it exists to use it ;)
Enumeration(symbol name, int value = 0) : nextkey(0) { (*this)(name, value); }
// A default argument can't refer to a non-static member such as nextkey, so
// the "use the next value" form is a separate overload.
Enumeration& operator()(symbol name) { return (*this)(name, nextkey); }
Enumeration& operator()(symbol name, int value) {
assert(value >= nextkey);
nextkey = value + 1;
table[value] = name;
return *this;
}
symbol operator()(E value) {
table_t::iterator it = table.find(value);
return (it == table.end()) ? nil : it->second;
}
};
template <typename E>
typename Enumeration<E>::symbol Enumeration<E>::nil = "";
// If we don't trust the compiler to share that string constant, we could force
// that by making a separate "" constant instead and having nil alias it...
// Now, the templating so far just looks like it gives us type-safety for the
// operator()(E). But actually it will let us implement some obscene syntactic
// sugar as well, if I'm thinking clearly ;)
template <typename E>
typename std::enable_if<std::is_enum<E>::value, std::ostream&>::type
operator<<(std::ostream& os, const E& e) {
return os << Enumeration<E>::instance(e);
}
// And thus we accomplish the claimed goal of users not needing to know about
// the Enumeration class. An unconstrained template here would match every
// type, because SFINAE only considers the signature and not the body, so the
// return type is guarded with enable_if/is_enum (C++11 <type_traits>; Boost
// has equivalents). That keeps it from interfering with non-enumeration types.
And we just write that code once; our auto-generated code just has to initialize Enumeration<E>::instance for each typename E that is appropriate (i.e., each enumeration in the program). We just emit something like:
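template <>
Enumeration<AUDIO> Enumeration<AUDIO>::instance =
Enumeration<AUDIO>("AUDIO_U8")("AUDIO_S8")("AUDIO_U16LSB")
("AUDIO_S16LSB")("AUDIO_U16MSB")("AUDIO_S16MSB");
And that works even at top level, because we're just initializing the variable; no procedural code here, nope, no sir ;) The explicit "= Enumeration<AUDIO>(etc....)" form is needed because a plain initializer can't be followed by further calls, and the "template <>" is needed because instance is a static member of a class template. When the enumerants aren't simply 0, 1, 2, ..., the generator would also pass each value as the second argument, in ascending order (operator() asserts that).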
#18 Anonymous Poster_Anonymous Poster_* Guests - Reputation:
Posted 06 March 2007 - 06:33 AM
Example:
.h:
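#include <map>
#include <string>
class MyEnumeration {
public:
static const MyEnumeration enum1;
static const MyEnumeration enum2;
static const MyEnumeration enum3;
...
// Maps each name to its enumeration object.
static std::map<std::string, const MyEnumeration*>& stringMap();
int ordinal() const;
const std::string& name() const;
private:
int ordinal_;
std::string name_;
// Private: only the static instances above can be created.
MyEnumeration(int ordinal, const std::string& name);
};
.cpp:
// Function-local static, so the map exists before any instance registers itself.
std::map<std::string, const MyEnumeration*>& MyEnumeration::stringMap() {
static std::map<std::string, const MyEnumeration*> map;
return map;
}
const MyEnumeration MyEnumeration::enum1(0, "enum1");
const MyEnumeration MyEnumeration::enum2(1, "enum2");
const MyEnumeration MyEnumeration::enum3(2, "enum3");
MyEnumeration::MyEnumeration(int ordinal, const std::string& name)
: ordinal_(ordinal), name_(name) {
stringMap()[name] = this;
}
int MyEnumeration::ordinal() const {
return ordinal_;
}
const std::string& MyEnumeration::name() const {
return name_;
}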
If you need to map back from ordinals to enumerations, you just add another map and modify the constructor. If you need an iterator, you can provide one, etc.
#19 Members - Reputation: 168
Posted 06 March 2007 - 06:42 AM
You could also avoid passing the ordinal to the constructor by using a static ordinal field that gets incremented every time the constructor is called. Also, you could implement next() and prev() functions or even use the STL's iterator mechanism.
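A minimal sketch of the static-counter idea, using a made-up Color class instead of MyEnumeration. One assumption to keep in mind: the ordinals follow the order in which the static instances are defined, which is only well defined when they all live in the same source file.

```cpp
#include <cassert>
#include <string>

class Color {
public:
    static const Color red;
    static const Color green;
    static const Color blue;

    int ordinal() const { return ordinal_; }
    const std::string& name() const { return name_; }

private:
    // Each construction takes the next ordinal; no value is passed in.
    explicit Color(const std::string& name)
        : ordinal_(next_ordinal()++), name_(name) {}

    // Function-local static, so the counter is initialized before first use
    // even though the instances below are themselves statics.
    static int& next_ordinal() { static int next = 0; return next; }

    int ordinal_;
    std::string name_;
};

// Definition order decides the ordinals: red = 0, green = 1, blue = 2.
const Color Color::red("red");
const Color Color::green("green");
const Color Color::blue("blue");
```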
#20 Crossbones+ - Reputation: 491
Posted 06 March 2007 - 12:57 PM
Quote:
Original post by JohnBolton
Quote:
Original post by Muhammad Haggag
This is rather clever, even though a bit ugly.
Simplified cleverness removed.
Belmont, CA, huh? Did you by any chance work at Oracle at some point? I did, and their code was full of this stuff. I believe there was one header file that got included 7 times into a source file doing cleverness like the above. Ultimately I think this path ends up being more confusing than helpful; at least it confused the hell out of me! Having experienced that, I'd probably be more likely to go with matching defines and strings, and some sort of asserts to be sure things stay in sync.
By no means am I denigrating what you've suggested; it's probably the most "correct" way to do something like this in C/C++. I just want to issue this cautionary tale.
Thanks,
Geoff
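One possible reading of "matching defines and strings, plus asserts" is sketched below. The names are invented, and it assumes C++11 for static_assert (an older compiler would need a macro such as BOOST_STATIC_ASSERT). A trailing count enumerant lets the compiler check that the string table has one entry per value.

```cpp
#include <cassert>
#include <cstring>

// Hand-written enum; DayCount must stay last.
enum Day { Sunday, Monday, Tuesday, DayCount };

// Hand-written strings, kept in the same order as the enum.
const char* const day_names[] = { "Sunday", "Monday", "Tuesday" };

// Fails to compile if an enumerant is added without a matching string
// (or the other way round).
static_assert(sizeof(day_names) / sizeof(day_names[0]) == DayCount,
              "day_names is out of sync with enum Day");

const char* day_to_string(Day d)
{
    assert(d >= 0 && d < DayCount); // catches out-of-range values in debug builds
    return day_names[d];
}
```

This catches a wrong count but not a wrong order, so it is a safety net rather than a guarantee.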






