Is using the switch statement better than using multiple if statements?


I cannot spot when or how a switch could have outperformed proper conditionals in any way on any platform - and by my logic I conclude it cannot.

The reason has already been covered in this very thread: switches are easy to convert to lookup tables in machine code. Conditional statements are less easy. So in the common case you will get better machine code generated for a switch than for an if/else ladder. Some compilers are better at this than others.
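
As a rough sketch of the kind of dense, constant-integer branching this is about (names invented purely for illustration): a compiler can typically lower the switch below to a jump table, while it first has to prove the if/else ladder is equivalent before it can do the same.

// Dense, constant integer cases: easy for the compiler to turn into a jump table.
int MoveCost(int tileType)
{
    switch (tileType)
    {
    case 0: return 1;  // grass
    case 1: return 2;  // sand
    case 2: return 5;  // water
    case 3: return 9;  // mountain
    default: return 0;
    }
}

// The same logic as an if/else ladder: the compiler must first prove the tests
// are mutually exclusive comparisons against contiguous constants before it can
// emit the same table.
int MoveCostIf(int tileType)
{
    if (tileType == 0)      return 1;
    else if (tileType == 1) return 2;
    else if (tileType == 2) return 5;
    else if (tileType == 3) return 9;
    else                    return 0;
}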

And the reason it is easier for compilers to optimize is that the switch gives the compiler more information to work with, so it can more easily detect common patterns of behavior that it is familiar with and knows optimization tricks for.

For the same reason, in C++, these two do the same logic:


for(size_t i = 0; i < array.size(); ++i) //Regular 'for' statement
{
   Element &element = array[i];
   Meow(element);
}

for(Element &element : array) //Range-for statement
{
   Meow(element);
}

...but besides performing the same logic, the second version is easier for the compiler to optimize, because the structure of the code itself communicates information to the compiler. The second version guarantees to the compiler that every element will only be visited once, that the elements will be traversed in a specific order, and that all the elements are known in advance - the first version guarantees none of those things.

In simple cases, the compiler will likely optimize them both the same, but in more complex cases, the compiler may not be able to figure out the optimization of the first example, and so may fall back on slightly lesser optimizations instead. This may result in super teeny-tiny gains or losses in performance, which 99.999% of the time don't matter.

Basically, the structure of our code becomes a contract with the compiler, making guarantees that help guide its optimizations. Different code structures make different guarantees.

else-if() chains make different guarantees than a switch() does, and the extra information a switch() communicates can help the compiler more easily recognize certain types of optimizations that might, in some cases, be harder for it to detect in else-if() chains.

But if the OP is asking "Which one should I use?", then the usual criteria apply: use whatever is cleaner/clearer to whoever has to read, expand, and maintain the code, and don't optimize prematurely. When it comes time to optimize, profile first to find out what is slow, then focus on the critical parts, remember that architectural macro-optimizations benefit far more than function-level micro-optimizations, and finally-finally-finally, micro-optimize only the areas that require it.

In my own code, I tend to find else-if() chains more visually appealing in some cases, and switches cleaner in other cases, and so have plenty of each.

And every one of the switch statements used in my code was chosen, 100% of the time, because it makes the code clearer, not for any optimization purposes.

If you are using switch() statements for micro-optimizations, there are other tricks to be aware of also; putting your more-frequently used branches closer to the top of the switch() supposedly helps, though I've never tried it. When you get to that level of micro-optimizing though, that's when something like profile-guided optimization provides the compiler with even better information about the real-world usage your code will be put through.
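
As a rough sketch of handing the compiler that kind of branch-frequency information directly: this uses GCC/Clang's __builtin_expect, which is compiler-specific and not standard C++; the function and values below are made up, and profiling should justify using it at all.

// GCC/Clang-specific hint about which way a branch usually goes.
// Other compilers need different mechanisms (or PGO).
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

int ProcessPacket(int size)
{
    if (UNLIKELY(size <= 0))   // rare error path, hinted as unlikely
        return -1;

    return size * 2;           // common path, laid out as the straight-line code
}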


First, you mislead the readers on what switch is. Since switch is so improperly named, switch is not a switch unless there is a break instruction in every sub-block of a condition.
What switch does is evaluate all the conditions in their order and execute their sub-blocks if they are true. Logically, switch is equal to this:

{ // the escape block
    if (A) {}
    if (B) {}
    if (C) {}
    {} // the default:
}

What? That's not even close to true. A switch without break will jump to the matching label, and execute all labels after that.

So this switch


switch(c)
{
case 'A':
    A();
case 'B':
    B();
case 'C':
    C();
default:
    D();
}

is not even close to equal to what you have posted. But if you really want some code that is executionally equivalent, then this is it:


if(c == 'A')
{
    A();
    B();
    C();
    D();
}
else if(c == 'B')
{
    B();
    C();
    D();
}
else if(c == 'C')
{
    C();
    D();
}
else
{
    D();
}

But really that is quite pointless; I think we can all agree that when talking about switch we are talking about switch with "break" statements. Missing breaks are, AFAIK, mostly bugs, or shorthands for reducing duplicate code when blocks are shared:


switch(direction)
{
case UP:
case DOWN:
     // y-movement
     break;
case LEFT:
case RIGHT:
     // x-movement
     break;
}

@Servant:

If you are using switch() statements for micro-optimizations, there are other tricks to be aware of also; putting your more-frequently used branches closer to the top of the switch() supposedly helps, though I've never tried it.

Does it? I've heard that this can help for regular if/else chains (on said "dumb" compilers), but for switch all I've heard is that you should put the labels in order, like always going from 0 on top to 10 on the bottom, ideally without holes (so 0..10 or 10..0 is optimal, "10, 5, 9, 1" not so much). The reason I've read is that this makes it much easier for the compiler to calculate where to jump to, though I believe any compiler where if/else can be compiled to the same assembly as a switch should be able to figure out the ordering on its own.

But really that is quite pointless; I think we can all agree that when talking about switch we are talking about switch with "break" statements. Missing breaks are, AFAIK, mostly bugs, or shorthands for reducing duplicate code when blocks are shared

Or to trip up students with Duff's Device. :lol:

@Servant:

If you are using switch() statements for micro-optimizations, there are other tricks to be aware of also; putting your more-frequently used branches closer to the top of the switch() supposedly helps, though I've never tried it.


Does it? I've heard that this can help for regular if/else chains (on said "dumb" compilers), but for switch all I've heard is that you should put the labels in order, like always going from 0 on top to 10 on the bottom, ideally without holes (so 0..10 or 10..0 is optimal, "10, 5, 9, 1" not so much). The reason I've read is that this makes it much easier for the compiler to calculate where to jump to, though I believe any compiler where if/else can be compiled to the same assembly as a switch should be able to figure out the ordering on its own.

I've heard both, but have tested neither. :)

Maybe the assumption is that the ones near the top (if the compiler doesn't reorder them) are more likely to have their instructions already loaded into the instruction cache? I have no clue; in either case, I'd rather optimize at the function and system level, and if I have to do micro-optimizations (which I never have needed to do), I'd use explicit compiler intrinsics and play with PGO to see if that provides any tangible benefits.

First, you mislead the readers on what switch is. Since switch is so improperly named, switch is not a switch unless there is a break instruction in every sub-block of a condition.
What switch does is evaluate all the conditions in their order and execute their sub-blocks if they are true. Logically, switch is equal to this:


Switch works differently in some languages (what you said is true of ActionScript), but we're talking about C++ here.

In C and C++, and probably a bunch of other languages, a switch statement that uses constant integer cases (or integer-like at the assembly level) which are fairly close together (i.e. case 1, case 2, case 3, etc...) is converted into something like this (MSVC, x86):


cmp (int case), (constant N)  // where N is the size of the lookup table
jae breakLabel // if case >= N, skip the switch statement
jmp [lookupTable + 4*case] // single x86 instruction which jumps directly to the case block's code.
// code for each case of the switch statement goes here.
// breakLabel - represents where all of the cases go when they break.
// code for the rest of the function
// end of the function's code, start of the function's jump tables
// 0-3 byte padding to optionally align the lookupTable
// lookupTable of N * 4 bytes representing a pointer array, with each element pointing to the address of each case's code to execute.
// (more jump tables, if the function has multiple switch statements)
// optional padding to 16 byte align the next function
// start of the next function
When two cases have the same code:


case 1:
case 2:
   // some code
   break;
The compiler puts the same pointer in the table for cases 1 and 2.

When the range of integers is not near zero, the compiler will add/subtract a suitable offset when creating its table. It does the optional 'add/sub' instruction before the out-of-range 'cmp'.

When there's a gap in cases, say you have "case 1, case 2, case 4", the compiler puts a 'break' address in the lookup table for every gap.

If the cases are too far apart (case 1, case 900, case 1570, case 1571, case 1572 etc), the compiler doesn't waste space making a lookup table that large. Instead it can split the switch up into a combination of a binary if/else tree and switch statements, depending on the distribution of your cases.

I have also encountered one additional helper array for switch statements - a byte array which maps case numbers (0-255) to jump table indices. It's used where there are a lot of case values but only a small handful of shared code blocks; the byte array reduces the overall size of the jump table. There might be a short variant of this, but I haven't seen a switch statement with more than 256 separate blocks of code yet.

Performance is a question of your run-time branching, instruction savings, and navigation, not at all a question of whether it was an if or a switch. But frankly, I cannot spot when or how a switch could have outperformed proper conditionals in any way on any platform - and by my logic I conclude it cannot.


The reason has already been covered in this very thread: switches are easy to convert to lookup tables in machine code.

Well, let me get in here. (compile-time look-up)

This is true only in the case that the "case" expressions are equality checks of alternatives of a switched value - of an integer value. And in the case of that being true, what prevents you from using an array of indexed function pointers/delegates/whatever, immediately acquired? Wouldn't that be much more conforming, or not? If not, why?
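
Roughly, I mean something like this hand-rolled jump table (all names are made up just for illustration):

#include <array>

// Hypothetical handlers - stand-ins for whatever each case would do.
void OnIdle() { /* ... */ }
void OnWalk() { /* ... */ }
void OnRun()  { /* ... */ }

enum State { Idle = 0, Walk, Run, StateCount };

// Hand-rolled "jump table": an array of function pointers indexed by the
// switched value - roughly what the compiler builds for a dense switch anyway.
constexpr std::array<void (*)(), StateCount> handlers = { &OnIdle, &OnWalk, &OnRun };

void Dispatch(int state)
{
    if (state >= 0 && state < StateCount)   // the same bounds check a switch's jump table needs
        handlers[state]();
}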

I am not bashing switch as is, I am just saying that an impotent C++ feature recommendation based on performance blasphemy is too ridiculous?

It is whether you check any subjective object in the program for some execution that is sufficient, then end (a return?). Or any other.

Bashing on people with a compile-time promised look-up table is a huge nonsense. But they will trust you and abuse this. I know.

I like the saying "If you're playing with fire, you're gonna get burned". Enough of micro/pseudo (de)optimizations; the only proposition that seemed reasonable to me here is SOTL - elsewhere we are just misleading beginners and those who trust us.

... Duff's device ...

Duff's Device was an interesting optimization about 40 years ago when different conditions existed.

On today's hardware it introduces essentially the worst case pipeline stall. All major compilers take steps to detect the thing and replace it with alternatives that fit better with today's hardware.
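
For anyone who hasn't seen it, the construct under discussion looks roughly like this - a copy loop manually unrolled eight ways, abusing switch fallthrough to handle the remainder. This is a sketch adapted to copy into ordinary memory rather than a device register, and it assumes count > 0; it is shown only to illustrate the discussion, since modern compilers unroll ordinary loops better on their own.

void CopyDuff(int* to, const int* from, int count)  // assumes count > 0
{
    int passes = (count + 7) / 8;     // how many times the unrolled body runs
    switch (count % 8)                // jump into the middle of the first pass
    {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--passes > 0);
    }
}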

All optimizations depend on details of the system. Last year's details are different from this year's details, and next year's details will be different again.

Many optimizations that were once fast now introduce problems. Duff's Device and the XOR swap are terrible on deeply pipelined processors. Fancy memory tricks are terrible on new processors: while memory is faster than it used to be, every new chip is relatively faster still, meaning memory access is more and more costly as time goes on. Calculating trig functions like sin() and cos() is faster than lookup tables thanks to those relative speed differences, though the reverse was true on most processors about 15 years ago. Decrementing loops to continue until they reach zero hasn't been an optimization on x86 for two decades. Low memory addresses are no longer faster than higher ones and haven't been for nearly three decades. Writing to shared objects in contiguous pools is absolutely terrible for cache coherency on multi-core processors, and hasn't been viable for speedups since the 1970s. Manually marking every little function for inline compilation is usually a terrible decision in modern C and C++; it is typically better to provide information to the compiler and allow it to decide what works best with the target architecture's caches and memory performance. (That last one is so bad most compilers now completely ignore the inline keyword unless you force them to.) Etc., etc., etc.

Yet all these little tips and tricks are still found in books and occasionally taught online as though they were gospel truth for modern software performance, never mentioning that the conditions that made them true have changed, and now they slow processing down.

... what prevents you from (options)? Wouldn't that be much more conforming, or not? If not, why? I am not bashing switch as is, I am just saying that an impotent C++ feature recommendation based on performance blasphemy is too ridiculous?

The performance difference in the general case is so small it doesn't really matter in the real world.

If a programmer is in a situation where the performance of this condition is critical --- and I honestly cannot fathom such a case --- then they would probably not be using C++ switch statements for the code.

The famous quote applies here: Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. -- Donald Knuth

This is part of those enormous amounts of time.

Much has been written about this quote but it withstands the test of time. Don't do stupid things that introduce inefficiencies, don't introduce pessimizations, but almost always this is something you shouldn't bother with. If you have additional knowledge that the compiler doesn't, such as knowing that a specific path is highly likely, then do something that uses that knowledge. In that rare case, do something. If you know you can avoid nearly all the work with a simple test, then do the simple test. But don't worry about it.

The best advice I've heard for optimization is:

Beginner: Don't optimize your code.

Advanced: Don't optimize your code yet.

Expert: Don't optimize your code until you have proven with instrumentation that it needs improvement, then verify and document the changes for those who come after.

The vast majority of the time it just doesn't matter. Your compiler is smart, and can find smart ways to optimize this. The compiler may use jump tables, or a branching if/else tree, or a series of conditionals, or even a binary search. Next year's compiler may have additional techniques available and it may do something different. Profile-guided optimizations may allow your compiler to recognize even more advanced things. ... But none of that typically matters to the programmer.

I'm not saying to make intentionally bad code - don't write bad code if you can help it. If you know something specific is a concern, then write the simple version with a comment of //TODO: This is probably slow, measure it with a profiler someday.

If a person has chosen to use C++ then they need to let the C++ compiler do its job. That job includes optimizing the code. Let your compiler do its job.

If (1) you are using C++ for your code and (2) you are concerned about the performance of a single switch statement, one of those two things is a flaw. Most likely you can find far bigger performance gains in other areas of the code by swapping out algorithms or fixing up data structures and access patterns. Alternatively, if the performance is absolutely critical in this section and the implementation of a switch statement is paramount, then the development should be done in assembly language, where you can control it, rather than in C++.

And since we're off track enough at this point, let's get back to the original with all these debates in place:

(1) I am currently learning C++ and I was just wondering if using the switch statement is a better and/or more efficient way than using multiple if statements?

(2) Are there advantages to using multiple if statements?

(3) If there are, what are they?

(4) Are switch statements good in game development?

(1) Switch statements are a tool in the language. There are many tools available for alternating behavior. See the above discussion. The efficiency of a switch statement is not typically a performance concern in the real world.

(2) If a switch would have worked but the programmer chose a series of if statements, that decision would be about specific control. If you have specific knowledge of a situation that argues against it, a switch statement may not be the best solution. When the problem naturally fits a switch statement, use a switch statement. Note that a compiler might rearrange your if statements according to the optimization rules as allowed by C++, and might even implement them exactly the same as it would have implemented a switch statement. Again, this is not typically a performance concern in the real world.

(3) There are many other options. Jump tables, function pointer tables, branching if/else trees, binary searches, conditional operations, virtual functions, and hand-coded assembly tuned for specific processors, to name a few. Various options fit different problems more naturally than others. The compiler is programmed with all kinds of details about the target processor, and the optimizer can pick and choose between many different options when it encounters a switch statement. Exactly how it is implemented internally doesn't matter; that's an abstraction in C++. However, this situation is not typically a performance concern in the real world.

(4) Yes, switch statements are used all the time in game development. As are branching if/else trees, jump tables, virtual functions, and the rest. You should generally use the tool that your problem suggests as a solution. In C++, if you've got a bunch of potential routes of code and the decision of which branch to take is based on an integer value, then a switch statement is a natural fit. If there are other similar situations, those may be a better fit. If instead of branching based on an integer value you want to select behavior based on the type of an object, a virtual function is probably a better fit. If instead you are branching based on the value in a string, a series of if/else branches may be better. If you have some other situation, a different solution may be a more natural fit for it.
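
As a rough sketch of that last contrast (hypothetical types, purely for illustration): branching on an integer value fits a switch, while selecting behavior by the type of an object fits a virtual function.

#include <memory>
#include <vector>

// Type-based selection: the virtual call replaces a switch on a "kind" field.
struct Entity {
    virtual ~Entity() = default;
    virtual void Update() = 0;
};

struct Player  : Entity { void Update() override { /* player logic */ } };
struct Monster : Entity { void Update() override { /* monster logic */ } };

void UpdateAll(const std::vector<std::unique_ptr<Entity>>& entities)
{
    for (const auto& e : entities)
        e->Update();   // the vtable does the dispatching the switch would do
}
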
I am not bashing switch as is, I am just saying that an impotent C++ feature recommendation based on performance blasphemy is too ridiculous?

You are pushing at open doors with this. Nobody is recommending switch as a go-to based on performance. The first few posts were about whether switch is easier to optimize than if/else chains, and every person still added that one should not worry too much about it based on performance, but rather on readability/clarity.

So I don't know if you misread my post or the ones of some others, but as is you are just repeating what others have said.

All major compilers take steps to detect the thing and replace it with alternatives that fit better with today's hardware.


An amusing possible consequence of this, which I haven't been bothered to test, is that if you code up a Duff's Device you may yet get better performance than if you code up some other hand-optimized, but misguided, alternative. But the better performance won't be coming from the Duff's Device....


I also apologise. I sometimes forget my rhetoric gets behind.

All major compilers take steps to detect the thing and replace it with alternatives that fit better with today's hardware.


An amusing possible consequence of this, which I haven't been bothered to test, is that if you code up a Duff's Device you may yet get better performance than if you code up some other hand-optimized, but misguided, alternative. But the better performance won't be coming from the Duff's Device....

I'd guess the answer is no. The rather infamous discussion from XFree86 (which used it extensively) found that the compiler's rewrite was generic and very large. Their own rewrites were far more compact.

The compiler makes a good general attempt at avoiding the worst-case scenario but the expansion is huge. Humans can do better by replacing the algorithm entirely with a better fit.

