Odd Performance Question

23 comments, last by Spoonbender 18 years, 1 month ago
How is your rendering loop structured?

Doing:

render
update
swapbuffers


can be dramatically faster than the alternatives.

render
update
swapbuffers


Yes, I use this pattern. However, it's the update part that kills me. Using static data just to test the render, I get 0-2% CPU utilization.
I got a chance to fully test my code again. I found a peak CPU utilization of 11% with the unrolled loop and a peak of 30% without it (using a nested for loop). It makes a HUGE difference in performance. My code is a little bloated, but I got the performance I wanted (I'd like more improvement, of course).
Quote:Original post by Spoonbender
Quote:Original post by Anon Mike
Bah, people are always making this stuff more complex than it has to be. "Loop unrolling" is all you really need to think about. All that minuscule-level processor gibberish merely becomes more efficient because of the unrolling.


Er no. Loop unrolling is exactly what doesn't make much difference.
"Bah, people are always assuming cpu performance is simpler than it is." [wink]
You can unroll the loop as much as you like, and you won't see more than a few percent improvement in performance.


I think you misunderstood my point. I'm not denying that pipelining, branch prediction, vectorized operations, yadda yadda yadda make a significant difference. My comments were directed at the original code, which was a simple instance of loop unrolling, and doing this in turn allowed more opportunities for the compiler and the processor to do their stuff automagically. I.e., I was trying to gently push the newbies rather than tossing them off the deep end. If their resulting mental model of what's going on isn't quite perfect, then that IMHO is OK: they can ask more sophisticated questions when they're ready.

Quote:Oh, and this "minuscule-level processor gibberish" can result in 2.5x speed improvement (quoted in above article). I'll take it!


Meh, I'll take shorter, simpler, easier to understand & maintain code any day unless it's proven that a particular piece of code is in the critical path.
-Mike
Quote:Original post by Anon Mike
I think you misunderstood my point. I'm not denying that pipelining, branch prediction, vectorized operations, yadda yadda yadda make a significant difference. My comments were directed at the original code, which was a simple instance of loop unrolling, and doing this in turn allowed more opportunities for the compiler and the processor to do their stuff automagically.

Nope, I didn't miss your point. It's not loop unrolling though. Take a look at the code again. [smile]

This would be loop unrolling:
Quote:
for (int i = 0; i < 100; i += 2)
{
    sum += list[i];
    sum += list[i+1];
}

and it wouldn't have made a noticeable difference. What made the difference was storing the result into two different variables.
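To make the distinction concrete, here is a minimal sketch in C. The function names and the use of `long` for the totals are assumptions made for illustration, not the code from earlier in the thread. The first function is the plain loop. The second keeps two running totals, so the two additions in each iteration do not depend on each other's result and the processor is free to overlap them.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: a single accumulator. Every addition has to wait for the
   previous one, because each reads the value the last one wrote. */
long sum_simple(const int *list, size_t n)
{
    assert(list != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += list[i];
    return sum;
}

/* Two accumulators (hypothetical name): sum0 and sum1 form two
   independent dependency chains that are combined only at the end. */
long sum_two_accumulators(const int *list, size_t n)
{
    assert(list != NULL || n == 0);
    long sum0 = 0, sum1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        sum0 += list[i];      /* chain 0 */
        sum1 += list[i + 1];  /* chain 1, independent of chain 0 */
    }
    if (i < n)                /* odd n: pick up the last element */
        sum0 += list[i];
    return sum0 + sum1;
}
```

Both functions return the same total for integer data. Whether the second one is actually faster depends on the compiler, its optimization settings, and the CPU, so measure before keeping the longer version.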

Quote:Oh, and this "minuscule-level processor gibberish" can result in 2.5x speed improvement (quoted in above article). I'll take it!

Quote:
Meh, I'll take shorter, simpler, easier to understand & maintain code any day unless it's proven that a particular piece of code is in the critical path.

Well, technically it is in the critical path as long as it's executed at some point... But it might not be where most of the execution time is being wasted. And you're right, of course. It's that little "unless" that matters.

fathom88: What does your program do? Sounds like it spends a lot of time in loops like these. What's it for? [wink]

And what do you mean when you talk about CPU utilization? Are we talking about the CPU usage reported in Task Manager, or anything like it?
It should be at 100% unless you do something silly like calling Sleep().

