Compiler Optimizations

Started by
7 comments, last by Zahlman 17 years, 10 months ago
Suppose one had a utility class that hashed a string parameter, something like this:

    Hash myhash("hash this string");

And by hash I mean it generated a unique (enough) id for it, using crc or something perhaps. Is there a way to get the compiler to resolve that at compile time into the integer hash code, since the string is static and known at compile time? If not, would it be useful to perhaps do something like this:

    static const Hash myhash("hash this string");

So it only happens once and then it can be passed and re-used in the same function. As a more practical example:

    void Character::TakeDamage(int _dam)
    {
        g_SoundSystem->PlaySound(Hash("Player.TakeDamage"));
        // hash it once, pass the hash for more efficient lookups over a string lookup, or generating the hash every call.

        // OR
        static const Hash takeDamage("Player.TakeDamage");
        g_SoundSystem->PlaySound(takeDamage);

        // INSTEAD OF
        g_SoundSystem->PlaySound("Player.TakeDamage"); // where it must hash it every call.
    }

Seems to me perhaps that the static method might be best to ensure it happens once and then can be re-used. My main question is whether there is a way the compiler would essentially optimize the creation of the Hash instance down to just initializing the numerical result of the hash, since the input is known at compile time.
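For reference, here is a minimal sketch of the "hash once, then re-use" idea. It assumes a hypothetical Hash class built on 32-bit FNV-1a (the post only says "crc or something"), and take_damage_id is an illustrative name, not part of any real sound system.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Hash utility class described in the post.
// Assumption: 32-bit FNV-1a is used as the hashing algorithm.
class Hash {
public:
    explicit Hash(const char* text) : value_(fnv1a(text)) {}

    std::uint32_t value() const { return value_; }

private:
    static std::uint32_t fnv1a(const char* text) {
        std::uint32_t h = 2166136261u;               // FNV offset basis
        for (; *text != '\0'; ++text) {
            h ^= static_cast<unsigned char>(*text);  // mix in one byte
            h *= 16777619u;                          // FNV prime
        }
        return h;
    }

    std::uint32_t value_;
};

// The function-local static is constructed the first time the function
// runs; every later call returns the same, already-hashed object.
const Hash& take_damage_id() {
    static const Hash id("Player.TakeDamage");
    return id;
}
```

The hash is still computed at run time here, just only once. Each later call pays for a check that the static has been initialised, which is normally far cheaper than re-hashing the string.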
I thought I recalled seeing something about this in one of the Game Programming Gems books, but I am unable to find it at the moment. Apparently it is not as easy to "look inside the book" at Amazon as it once was.

Though it does not exactly generate compile time constants, this article provides an example of a crc hash algorithm using template meta-programming that will allow the process to be completely un-rolled during compile time, making it very fast to execute during run-time.
I'm pretty sure that there is no way to manipulate strings at compile time.
If all strings can be resolved at compile time, why don't you just use enums to index an array ?

Anyways, when using a hash function, you can precalculate the hash values as you suggested. Some compilers may be able to replace the result of the hash function by a single value (in case the code of the hash function is available in the same compilation unit and inlined), but in general I doubt that most compilers will do this. But you can easily check this by yourself using a dummy function (compile it and have a look at the object code).
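If a C++11 or later compiler is available (an assumption, nothing in the thread says so), the "replace the hash call with a single value" behaviour can be requested explicitly instead of hoped for. The sketch below uses a constexpr FNV-1a function; the names fnv1a and kTakeDamage are illustrative, and FNV-1a is a swapped-in algorithm, not the poster's.

```cpp
#include <cstdint>

// Recursive form so that it is a valid constexpr function in C++11.
// Each step mixes in one byte and multiplies by the FNV prime.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
        ? h
        : fnv1a(s + 1, (h ^ static_cast<unsigned char>(*s)) * 16777619u);
}

// A constexpr variable must be initialised by a constant expression,
// so this hash is evaluated by the compiler.
constexpr std::uint32_t kTakeDamage = fnv1a("Player.TakeDamage");

// static_assert only accepts constant expressions, which confirms the
// function really can be evaluated at compile time.
static_assert(fnv1a("") == 2166136261u,
              "the empty string hashes to the FNV offset basis");

// The same function still works on strings that are only known at run time.
inline bool is_take_damage(const char* name) {
    return fnv1a(name) == kTakeDamage;
}
```

Checking the object code, as suggested above, is still the way to confirm what a particular compiler does with calls that are not forced into a constant expression.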

Not all of them, just some. The static ones are probably the simplest way to hash once and re-use the result in those cases.
Quote:Original post by Nitage
I'm pretty sure that there is no way to manipulate strings at compile time.


That depends on the kind of string. C++ doesn't have a native string type (it has basic_string<>, but that's a library class). Strings (const char[]) can be passed as template parameters, but unfortunately operator[] can't be used on them at compile time. Most compilers have a very hard time optimizing anything using strings (both char* and std::basic_string<>). If you created a compile-time string which allowed access at compile time, it would be easy to reimplement a hashing algorithm using template meta-programming.

Here is a simple example (which is not using Boost). There is a little bug I didn't bother to fix: if your string length is n and n%20==0 and n>0, you will get a compile-time error. You specify a string by using ct_str and separating every character with a ','. When you reach character n where n%20==0 and n>0, you are supposed to start a new ct_str (ct_str<'a','b' ... 't', ct_str<'u','v' ... 'z'> >). Of course this is quite ugly, but it will be perfectly optimized (the number is hardcoded in the assembly). I don't use a standard hashing algorithm; instead I just add every character to the hash code, but most algorithms could easily be implemented the same way.
#include <iostream>

typedef unsigned char char_t;

// Compile Time STRing: up to 20 characters per block, chained through 'next'
template<
    char_t ch0='\0',  char_t ch1='\0',  char_t ch2='\0',  char_t ch3='\0',  char_t ch4='\0',
    char_t ch5='\0',  char_t ch6='\0',  char_t ch7='\0',  char_t ch8='\0',  char_t ch9='\0',
    char_t ch10='\0', char_t ch11='\0', char_t ch12='\0', char_t ch13='\0', char_t ch14='\0',
    char_t ch15='\0', char_t ch16='\0', char_t ch17='\0', char_t ch18='\0', char_t ch19='\0',
    typename next = void>
struct ct_str
{
    // Indices 20 and above are forwarded to the next block.
    template<unsigned int index>
    struct access_ch
    {
        const static char_t value = next::template access_ch<index-20>::value;
    };

    // I use macros here to avoid copy-paste, the lesser of two evils
    // Note: explicit specialization at class scope relies on a compiler
    // extension (MSVC accepts it); other compilers may reject it.
#define CAT2(x,y) x##y
#define CAT(x,y) CAT2(x,y)
#define CREATE_ACCESS_STRUCT(index) \
    template<> \
    struct access_ch<index> \
    { \
        const static char_t value = CAT(ch,index); \
    }

    CREATE_ACCESS_STRUCT(0);  CREATE_ACCESS_STRUCT(1);  CREATE_ACCESS_STRUCT(2);  CREATE_ACCESS_STRUCT(3);
    CREATE_ACCESS_STRUCT(4);  CREATE_ACCESS_STRUCT(5);  CREATE_ACCESS_STRUCT(6);  CREATE_ACCESS_STRUCT(7);
    CREATE_ACCESS_STRUCT(8);  CREATE_ACCESS_STRUCT(9);  CREATE_ACCESS_STRUCT(10); CREATE_ACCESS_STRUCT(11);
    CREATE_ACCESS_STRUCT(12); CREATE_ACCESS_STRUCT(13); CREATE_ACCESS_STRUCT(14); CREATE_ACCESS_STRUCT(15);
    CREATE_ACCESS_STRUCT(16); CREATE_ACCESS_STRUCT(17); CREATE_ACCESS_STRUCT(18); CREATE_ACCESS_STRUCT(19);

    // Others shouldn't use these
#undef CREATE_ACCESS_STRUCT
#undef CAT2
#undef CAT
};

template<typename T>
struct to_hash
{
    // Adds the character at 'index' and recurses until the next character is '\0'.
    template<unsigned int index, bool done>
    struct to_hash_impl
    {
        const static unsigned int value =
            to_hash_impl<index+1, ('\0' == T::template access_ch<index+1>::value) >::value
            + T::template access_ch<index>::value;
    };

    // End of the string: contributes nothing.
    template<unsigned int index>
    struct to_hash_impl<index,true>
    {
        const static unsigned int value = 0;
    };

    const static unsigned int value = to_hash_impl<0,false>::value;
};

int main(int,char**)
{
    std::cout << to_hash< ct_str<'h','e','l','l','o',',','w','o','r','l','d','!'> >::value << std::endl;
    std::cin.get();
    return 0;
}


I don't believe anything like this is possible with standard strings (char const[], or basic_string<>).
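The same character-by-character idea can be written more compactly with variadic templates, assuming a C++11 or later compiler is available. This is only a sketch: char_pack and hello_world are hypothetical names, and the "hash" is the same plain sum of characters used in the example above, not a real hashing algorithm.

```cpp
// Primary template: a compile-time list of characters.
template<char... Chars>
struct char_pack;

// Empty list: contributes nothing to the sum.
template<>
struct char_pack<> {
    static constexpr unsigned int hash = 0;
};

// Non-empty list: add the first character to the hash of the rest.
template<char Head, char... Tail>
struct char_pack<Head, Tail...> {
    static constexpr unsigned int hash =
        static_cast<unsigned char>(Head) + char_pack<Tail...>::hash;
};

typedef char_pack<'h','e','l','l','o',',','w','o','r','l','d','!'> hello_world;

// The character codes of "hello,world!" add up to 1161 in ASCII.
static_assert(hello_world::hash == 1161u, "sum of the character codes");
```

There is no 20-character block limit in this form, although the string still has to be spelled out one character at a time.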
If you're feeling brave, you could probably do it with some template metaprogramming... :)
I have occasionally written special preprocessors to do this sort of thing. You run the source file through the preprocessor, it massages the source appropriately, then feeds the result through the compiler.

This method has its drawbacks. Depending on how robust and non-intrusive you want your preprocessor to be, writing it can be tricky. Newbies to the project can get confused because they don't know this is going on. The preprocessor also adds an extra level of dependencies that need to be fully built before your source gets built, and so on. None of these are particularly huge problems, but they can add up if not managed carefully.
-Mike
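As a rough illustration of the preprocessor approach, the sketch below rewrites every HASH("...") marker in a source string into the numeric hash of its literal. Everything here is an assumption made for the example: the HASH marker, the function names, and the use of 32-bit FNV-1a. It also ignores escaped quotes, comments, and string literals that merely contain the marker, which is the kind of robustness work the post warns about.

```cpp
#include <cstddef>
#include <cstdint>
#include <regex>
#include <string>

// 32-bit FNV-1a over a std::string (assumed hashing algorithm).
std::uint32_t fnv1a(const std::string& text) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < text.size(); ++i) {
        h ^= static_cast<unsigned char>(text[i]);
        h *= 16777619u;
    }
    return h;
}

// Replaces each HASH("literal") in 'source' with an unsigned integer literal.
std::string rewrite_hashes(const std::string& source) {
    // Matches HASH("...") and captures the text between the quotes.
    static const std::regex pattern("HASH\\(\"([^\"]*)\"\\)");

    std::string out;
    std::size_t last = 0;  // end of the previous match in 'source'
    const std::sregex_iterator end;
    for (std::sregex_iterator it(source.begin(), source.end(), pattern);
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(source, last, pos - last);            // text before the match
        out += std::to_string(fnv1a(m[1].str())) + "u";  // the hash value
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(source, last, std::string::npos);         // text after the last match
    return out;
}
```

A build step would read each source file, pass its contents through rewrite_hashes, and hand the result to the compiler.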
If you're thinking of what I think you're thinking about, then you probably just want to implement symbols. The idea is that you use the pointer value for a string literal itself as the hash, i.e. make use of the property of "object identity" (and never mind that char*'s are a long way from being proper objects). Something like:

#include <cstring>
#include <set>
#include <string>
#include <vector>

// Orders C strings by content, so lookups by text find the stored literal.
struct ltstr {
  bool operator()(const char* s1, const char* s2) const {
    return std::strcmp(s1, s2) < 0;
  }
};

class SymbolSet {
  std::set<const char*, ltstr> symbols;
  std::vector<const char*> table;

public:
  // XXX XXX XXX Only pass string literals to the ctor!
  // We are working with the pointer addresses so we don't want any
  // duplicate strings, and we're not going to be doing any memory allocation.
  SymbolSet(const char* const* names, int size) :
    symbols(names, names + size), table(names, names + size) {}

  // Slow path: look the symbol up by its text; NULL if it is unknown.
  const char* operator[](const std::string& name) const {
    std::set<const char*, ltstr>::const_iterator x = symbols.find(name.c_str());
    return x == symbols.end() ? NULL : *x;
  }

  // Fast path: index directly by enumeration value.
  const char* operator[](int enum_value) const { return table[enum_value]; }
};

// You probably want to set up some kind of macro to generate enum/char*[] pairs.
enum sounds_e { PLAYER_TAKEDAMAGE, PLAYER_SCREAM, ENEMY_GROWL };
const char* sounds[] = { "Player.TakeDamage", "Player.Scream", "Enemy.Growl" };
SymbolSet ss(sounds, sizeof(sounds) / sizeof(sounds[0]));

// If I have a numeric ID from the enumeration, I can get the value quickly:
Sound::play(ss[PLAYER_SCREAM]);

// If I have a runtime-constructed string, I can still get the value slowly:
Sound::play(ss[std::string("Player.") + eventName]);


Now, either way, you can access a char* value that serves both as a pointer to the string data and as a unique identifier of the symbol (so you could use it for a key to look up some *other* associated data if you need to - maybe your Sound module caches sound file objects rather than looking up the file every time, and you want to map from the file name to the existing sound file object if already loaded, and load from disk otherwise...). If that makes you uneasy, well first off good for you :) and second, you should be able to just create a Symbol class which wraps a char* (again it should only ever point at a literal) and provides a .str() member function.
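A minimal sketch of such a Symbol wrapper might look like the following. The class layout and member names other than .str() are assumptions made for illustration. Equality compares addresses rather than text, so it is only meaningful when every Symbol is created from a pointer handed out by one shared table (such as the SymbolSet above), since two identical literals are not guaranteed to share an address.

```cpp
#include <functional>
#include <string>

// Hypothetical Symbol class: wraps a pointer to a string literal and uses
// the pointer itself as the identity of the symbol.
class Symbol {
public:
    // Only ever pass a pointer obtained from the shared symbol table.
    explicit Symbol(const char* literal) : text_(literal) {}

    // Copy of the symbol's text, for display or file lookups.
    std::string str() const { return text_; }

    // Identity comparison: same pointer means same symbol.
    friend bool operator==(Symbol a, Symbol b) { return a.text_ == b.text_; }
    friend bool operator!=(Symbol a, Symbol b) { return !(a == b); }

    // Ordering by address, so Symbols can be used as std::map keys.
    // std::less gives a well-defined order even for unrelated pointers.
    friend bool operator<(Symbol a, Symbol b) {
        return std::less<const char*>()(a.text_, b.text_);
    }

private:
    const char* text_;
};
```

With operator< defined, a sound cache could be keyed on Symbol directly, for example as a std::map from Symbol to the loaded sound object.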

This topic is closed to new replies.
