Quote: Original post by Spoonbender
And sheesh, I'm surprised everyone (except possibly blaze02) missed this. [wink]
Screw the branching issues. It doesn't take a complex branch predictor to handle something like this loop. What matters is the data dependencies that prevent it from pipelining efficiently.
Most of my experience has been with 486 assembly, and I haven't read up on the Pentium and later pipelines. Yeah, for a simple loop like this, the CPU's branch predictor (not the compiler) will "guess" the correct next instruction something like 99% of the time. In fact, I wouldn't worry about anything this low-level at all, or you might as well be writing assembly code. You are talking about a 1-2 microsecond difference (AT BEST!) with this optimization. Games slow down for entirely different reasons. Spend your time learning which graphics render states take the longest to swap in and out, and group your rendering accordingly. If you plan on handling a ton of objects that you look up by key, look into a hash_map instead of a vector or list. Those are the kinds of speed-ups you will really be able to see and use.
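To make the hash_map point concrete, here is a rough sketch, assuming objects are looked up by an integer ID. `GameObject`, `findInVector`, and `findInMap` are made-up names for illustration, and I'm using `std::unordered_map`, the standard C++11 container that covers what the older non-standard `hash_map` extension did. The vector version scans every element in the worst case, while the map version does an average constant-time hash lookup.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical game object; the fields are illustrative only.
struct GameObject {
    int id;
    std::string name;
};

// Linear search: up to n comparisons per lookup.
// Returns nullptr when no object has the given id.
const GameObject* findInVector(const std::vector<GameObject>& objects, int id) {
    auto it = std::find_if(objects.begin(), objects.end(),
                           [id](const GameObject& o) { return o.id == id; });
    return it != objects.end() ? &*it : nullptr;
}

// Hash lookup keyed by id: constant time on average.
// Returns nullptr when the id is not present.
const GameObject* findInMap(const std::unordered_map<int, GameObject>& objects,
                            int id) {
    auto it = objects.find(id);
    return it != objects.end() ? &it->second : nullptr;
}
```

With only a handful of objects the vector may well be just as fast, since it is contiguous in memory. The map should only start to pay off once you have a lot of objects and do frequent lookups by key, so measure before switching.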