Sign in to follow this  
formalproof

How much faster are enums compared to strings?

Recommended Posts

How much faster are enums compared to strings? Comparing two enums requires only one comparison operation, where as comparing two strings of length n will require anything between 1 and n operations (each character pair must be compared in the worst case.) Does anyone have practical experience about this question, especially in game programming? Have you come across a situation where using strings instead of enums was clearly too slow for your purposes?

Share this post


Link to post
Share on other sites
If you try to use very short strings and lookup them in a tree structure, they won't be slow. So it depends on how you use them.

Share this post


Link to post
Share on other sites
Quote:
Original post by the_edd
Quote:
Original post by formalproof
How much faster are enums compared to strings?

Ask your profiler.


I asked for general experience of other game programmers.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
Quote:
Original post by the_edd
Quote:
Original post by formalproof
How much faster are enums compared to strings?

Ask your profiler.


I asked for general experience of other game programmers.
IMO, 'ask your profiler' is a good answer to the question. There are so many variables involved that it's unlikely, I think, that someone else's experience with strings vs. enums would have much (if any) bearing on your particular situation. (That's just my take on it though - I could be wrong.)

In my own personal experience though, string operations (e.g. comparison) have never shown up as a hot spot (I don't recall ever noticing any string-related functions when profiling my applications). Naturally I'm not doing string comparisons in tight inner loops, but I do use them as keys for identifying resources, entities, and so on.

Is there a particular usage scenario that you have in mind? Also, what programming language(s) are you using?

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
I asked for general experience of other game programmers.


You said it yourself: enum compare is O(1), string compare is O(n), so enum compares is always faster.

No experience needed!

Edit: How much faster exactly depends on n.

Share this post


Link to post
Share on other sites
Quote:
Original post by jyk
Quote:
Original post by formalproof
Quote:
Original post by the_edd
Quote:
Original post by formalproof
How much faster are enums compared to strings?

Ask your profiler.


I asked for general experience of other game programmers.
IMO, 'ask your profiler' is a good answer to the question. There are so many variables involved that it's unlikely, I think, that someone else's experience with strings vs. enums would have much (if any) bearing on your particular situation. (That's just my take on it though - I could be wrong.)

In my own personal experience though, string operations (e.g. comparison) have never shown up as a hot spot (I don't recall ever noticing any string-related functions when profiling my applications). Naturally I'm not doing string comparisons in tight inner loops, but I do use them as keys for identifying resources, entities, and so on.

Is there a particular usage scenario that you have in mind? Also, what programming language(s) are you using?


No, I don't have a particular scenario in mind. I'm just wondering if we can safely formulate a general rule that strings can be always preferred since the difference in speed is usually neglicible. Similar rules exist for questions such as "should assembly language be used or not?" ("no, unless you really have to"). Has anyone ever encountered a situation where strings are too slow compared to enums?

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
No, I don't have a particular scenario in mind. I'm just wondering if we can safely formulate a general rule that strings can be always preferred since the difference in speed is usually neglicible. Similar rules exist for questions such as "should assembly language be used or not?" ("no, unless you really have to"). Has anyone ever encountered a situation where strings are too slow compared to enums?


I always use enums where possible. It's much faster and has the added benefit that typos are caught by the compiler and sensible suggestions are made by code completion.

Share this post


Link to post
Share on other sites
Quote:
Original post by Rattenhirn
Quote:
Original post by formalproof
I asked for general experience of other game programmers.


You said it yourself: enum compare is O(1), string compare is O(n), so enum compares is always faster.
Not necessarily true, if all of the strings differ in the very first character (or in the first couple), then a string comparison will also be O(1). That is, "abc" == "cde" will return false after a single comparison.

That's why you need to ask your profiler.

I do agree that enums are good because they help with intellisense (for example), but I don't think you should choose them because of "performance".

Share this post


Link to post
Share on other sites
I agree with jyk that a profiler is the best to ask since you ask specifically "How much faster are enums compared to strings?" Your question, if aimed at game programming should likely be "What are the differences between strings and enums?"

However, you also partially answered your own question, enums with one compare are faster than strings in almost every situation; depending on string size, and string compare functionality.

For example take the following:


enum MyEnum
{
VALUE1,
VALUE2,
};

bool Compare(MyEnum a, MyEnum b)
{
//Check and return: if a == b
}

bool Compare(const char *a, const char *b)
{
//Check each character until a mismatch, or end of both strings is found.

}

bool Compare4(const char *a, const char *b)
{
//With a little casting magic, and the assumption that both a and b
//will _ALWAYS_ contain 4 characters than check:
return ((*((int*)a)) == (*((int*)b)));
}



Now as you already know string compares are slower. If you are comparing 1 character to 1 character strings, than it -could- be the same cost as enum compare. With unknown characters then you need to setup a loop, iterate over each character, and compare each. Which would be > 2x slower for 2 characters with this setup. 16x slower for 16 character strings... And so on. (This is under the assumption that the string iterates 1 character to a time, and that char compare is the same cost as enum compare.

Now, to put things into true perspective. On previous hardware, some current systems with minimal cycles, and/or areas that are accessed very frequently then this is something to seriously consider and optimize. I use enums in all situations where adding a new value would require some sort of code change anyways, although I use string compares whenever I want something to be data driven. Although it is possible to hash these strings and compare an int or some other value but that is beyond the scope of this post.

For instance, in a project where you have aliens running around, each alien name would string compare, as this would come from a script/file where the alien is setup to know its name, image, and other attributes. But each alien has several attributes: speed, health, damage etc... and each of these can be in an enum since to add 'armor' it would require a code change anyways to add this feature.

So strings are useful, especially in data-driven areas and user-friendliness. But enumerations are faster when comparing, as you already know - and even know the reasons why. As long as you keep that in mind then comparing strings shouldn't be an issue; In my opinion, use what works for you, and if it stops working then look back at it and learn.

However, when comparing 100k items over and over again, the string compare overhead is too costly. And this is something you're profiler will point out. Don't worry about optimizing you're code to unreasonable levels before you ever know where to look. That said, stay in habit of self-optimized code:

Like this:
for (MyIter itr = obj.Start(), itrEnd = obj.End(); itr != itrEnd; itr.Next())
Instead of:
for (My itr = obj.Start(); itr != obj.End(); itr.Next())

That change reduces the function call obj.End() once per iteration since it keeps it around during the loop. This is a minimal difference in the scope of things, but small habits can build up.

Share this post


Link to post
Share on other sites
What are you using the enums for?

It is usually not a question of speed. Either you know all possible values ahead of time, in which case an enum would work, or you don't and you need to use strings.

Another option would be to have a translation layer. You might want the flexibility of strings for storing data in text files, but then once you read in the data you convert the strings to enumeration values and then use the fast enumeration comparison rather than string comparison.

But there is no general answer. They do different things, it depends on the specifics of what you are trying to do.

Share this post


Link to post
Share on other sites
There are other reasons for preferring enums over strings. When using enums you know every possible value that the enumeration type can have while a string can have any value and requires extra code for checking incorrect values (NULL or some random string like "slfjsklqjfml" is probably not a correct value).

An example:
By looking at the function's signature, what can we pass to this method?
void doSomething(string param);


answer: we don't know, it could be anything.
Where can I find the valid values without looking at the function's implementation?
answer: Don't know, maybe it's documented somewhere?

By looking at the method's signature, what can we pass to this method?
void doSomething(MyEnum param);


answer: Any value defined by the enumeration.
Where can I find these values?
answer: Our IDE will probably tell us, or we can find this somewhere:
Enum MyEnum { VAL1, VAL2, VAL3 };


Conclusion:
When using a function that takes an enumeration as a parameter it is easier to know which values can be passed or we know where to look for them.

When writing a function that takes an enumeration as a parameter we know exactly what values can be passed so we don't have to write any code for checking incorrect values.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
No, I don't have a particular scenario in mind.

Yes you do. You are talking about identifiers or keys.
Quote:
Has anyone ever encountered a situation where strings are too slow compared to enums?

When manipulating text, enums will not be of much use.

When using them as keys, they will be faster.

Quote:
When using enums you know every possible value that the enumeration type

In C++ you don't know, since it can contain any value. They are not type safe.

Share this post


Link to post
Share on other sites
Quote:
Original post by Codeka
Quote:
Original post by Rattenhirn
Quote:
Original post by formalproof
I asked for general experience of other game programmers.


You said it yourself: enum compare is O(1), string compare is O(n), so enum compares is always faster.
Not necessarily true, if all of the strings differ in the very first character (or in the first couple), then a string
comparison will also be O(1).

Big-O notation indicates an upper bound, worst case, or limit. Applying it to a conjured scenario of the best case scenario is conceptually flawed -- and you've still done it wrong, since it's necessary not only for all strings to differ in the very first character, but for your program to be able to make that assumption. Otherwise, checking that "abc" == "abc" will evaluate to true requires checking every character -- still O(n).

In other words, the only way it's O(1) is if you've hardcoded the comparison function to false -- which is completely and absolutely useless -- or to compare only the first character -- in which case it's not comparing strings at all, you're comparing characters.


However, yes, do use a profiler -- O(n) only describes the worst case upper bound after all, and "Strings vs enums!" is too apples-to-oranges a comparison to have a general rule about. Alternatively, get specific, and we can pull out applicable, not-so-general rules, which tend to have answers far more nuanced than 'Use strings!' or 'Use enums!'.

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
Quote:
When using enums you know every possible value that the enumeration type

In C++ you don't know, since it can contain any value. They are not type safe.

Yes, you have a point right there. That wasn't really valid in the case of C++.

Share this post


Link to post
Share on other sites
Another aspect that you may or may not have overlooked is the size requirements. Every value of a particular enum will always be the same size, but strings have no such requirement by default.

Share this post


Link to post
Share on other sites
Quote:
Original post by Codeka
Quote:
Original post by Rattenhirn
You said it yourself: enum compare is O(1), string compare is O(n), so enum compares is always faster.
Not necessarily true, if all of the strings differ in the very first character (or in the first couple), then a string comparison will also be O(1). That is, "abc" == "cde" will return false after a single comparison.

Not really. Keep in mind that string length and/or terminators have to be taken into account when comparing strings. One of the two strings could be empty or length may not be equal. In that light, enum comparison (which will indeed compile to a single opcode) will always be faster than variable length string comparison.

Share this post


Link to post
Share on other sites
Quote:


In other words, the only way it's O(1) is if you've hardcoded the comparison function to false -- which is completely and absolutely useless -- or to compare only the first character -- in which case it's not comparing strings at all, you're comparing characters.


Sorry, but this is not correct. Many languages, for example C#, provide O(1) comparisons for strings.

IMO, the profiler is the only way to get a valid answer to this question. Even if you are using O(n) string comparisons, chances are it won't be a performance problem in actual use.

Edit: spelling.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
How much faster are enums compared to strings? Comparing two enums requires only one comparison operation, where as comparing two strings of length n will require anything between 1 and n operations (each character pair must be compared in the worst case.)

Does anyone have practical experience about this question, especially in game programming? Have you come across a situation where using strings instead of enums was clearly too slow for your purposes?


Around 100,000 hits per seconds strings don't cut it if that is the main thing the app is doing, though machines are much faster now so maybe 400,000/sec today - meaning the CPU cannot get the work done in time.

Enums, integers, will be one to three *orders of magnitude* faster, that is 10x to 1000x times faster depending on usage.
Strings have a more complex compare, more complex copy, more complex storage, etc... everything about them is "worse" if the same job can be accomplished with a number.
It is usually the copy and the storage that kills you not the compare.

Share this post


Link to post
Share on other sites
Quote:
Original post by Fiddler
Quote:


In other words, the only way it's O(1) is if you've hardcoded the comparison function to false -- which is completely and absolutely useless -- or to compare only the first character -- in which case it's not comparing strings at all, you're comparing characters.


Sorry, but this is not correct. Many languages, for example C#, provides O(1) comparisons for strings.


From reflector:
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
public bool Equals(string value)
{
if ((value == null) && (this != null))
{
return false;
}
return EqualsHelper(this, value);
}


[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
private static unsafe bool EqualsHelper(string strA, string strB)
{
int length = strA.Length;
if (length != strB.Length)
{
return false;
}
fixed (char* str = ((char*) strA))
{
char* chPtr = str;
fixed (char* str2 = ((char*) strB))
{
char* chPtr2 = str2;
char* chPtr3 = chPtr;
char* chPtr4 = chPtr2;
while (length >= 10)
{
if ((((*(((int*) chPtr3)) != *(((int*) chPtr4))) || (*(((int*) (chPtr3 + 2))) != *(((int*) (chPtr4 + 2))))) || ((*(((int*) (chPtr3 + 4))) != *(((int*) (chPtr4 + 4)))) || (*(((int*) (chPtr3 + 6))) != *(((int*) (chPtr4 + 6)))))) || (*(((int*) (chPtr3 + 8))) != *(((int*) (chPtr4 + 8)))))
{
break;
}
chPtr3 += 10;
chPtr4 += 10;
length -= 10;
}
while (length > 0)
{
if (*(((int*) chPtr3)) != *(((int*) chPtr4)))
{
break;
}
chPtr3 += 2;
chPtr4 += 2;
length -= 2;
}
return (length <= 0);
}
}
}


Do point me to the O(1) version in C#. Do note that interning it is also not an O(1) operation, and Object.ReferenceEquals will gladly return false for strings with identical values -- so we can hardly call the combination a proper O(1) string comparison.

Those with 'O(1)' string comparisons are doing the above with all strings, which has it's own issues. Granted, it's well worth mentioning, and I do apologize for thinking it clear I was referring to any custom rolled equality comparison Codeka would have made.

Share this post


Link to post
Share on other sites
Quote:
Original post by rip-off
What are you using the enums for?

It is usually not a question of speed. Either you know all possible values ahead of time, in which case an enum would work, or you don't and you need to use strings.

Another option would be to have a translation layer. You might want the flexibility of strings for storing data in text files, but then once you read in the data you convert the strings to enumeration values and then use the fast enumeration comparison rather than string comparison.

But there is no general answer. They do different things, it depends on the specifics of what you are trying to do.


What I mean is using strings in place of enums (or: in place of named integer constants). This means that I know all possible values beforehand.

I understand that it is hard to come up with a general answer. Still, having a general principle will be much faster than deciding separately in every different scenario. Therefore, I'm mainly interested in this question: has anyone ever come across a situation where using strings in place of enums was definitely too slow? If no one has, then I think I can safely adopt the principle of starting to use strings in place of enums always (and only switch to enums if it really becomes too slow), because strings are more flexible in many ways, like was said above: they can be printed directly for purposes of testing and information display, and you don't need the enum definition in order to interprete the values. Premature optimization is the source of all evil, they say.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
I understand that it is hard to come up with a general answer. Still, having a general principle will be much faster than deciding separately in every different scenario.

In general comparing enums is considerably faster than comparing strings. But this is not a globally applicable principle, since it depends on your usage scenario. The flexibility you get with strings cannot easily be replicated with enums. Either you need the flexibility or you don't.

Quote:
Original post by formalproof
Therefore, I'm mainly interested in this question: has anyone ever come across a situation where using strings in place of enums was definitely too slow?

Of course. Repeatedly.

Quote:
Original post by formalproof
Premature optimization is the source of all evil, they say.

Which is nonsense. Incorrectly done premature optimization is bad. Correctly done early optimization generates inherently efficient code. But what you are trying to do is premature bloatification. Choose the right algorithm and the right data structure for the job. One size doesn't fit all.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof

What I mean is using strings in place of enums (or: in place of named integer constants). This means that I know all possible values beforehand.


What is a string, and what is "in place".

Using const char * doesn't require O(n) comparison, but simple integer test, same as enum. I know this can be done in Java with either intern, or equality test, HashMap exploits this fact.

It can be even implemented in same way. If hard-coded, there is no problem, it will be same instance, if obtained from external source, a factory can be used to resolve the value. Again, in Java, it may make sense to use an extra Set in combination with HashMap can be used to work on same instances, even if all strings are loaded at runtime. It improves performance considerably.

Share this post


Link to post
Share on other sites
Quote:
Original post by formalproof
Premature optimization is the source of all evil, they say.

They're conceptually different though, and that advice is aimed at the idea that optimization takes time and may not be even necessary -- but it's not a license to stick your head in the sand and make poor decisions based on "I don't want to think about it" when both choices are just as easy to implement up front.

Yes, I'm quite sure plenty of us have run into situations where naively replacing all enums with strings would make no sense and absolutely tank performance.

There are also situations where enums make no sense (even when all possible values now are known) and the programmer time saved by using strings instead would, if spent optimizing something that actually mattered, more than make up for any performance lost.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this