Jump to content
  • Advertisement

Archived

This topic is now archived and is closed to further replies.

FrancoisSoft

Fast Loop

This topic is 5811 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I''m in the process of completing the levels for a new 2D game that I''m working on and I''m getting real close to creating the game engine. In the previous game I created, which was 2D, I use C-style loops but I didn''t have too many objects to deal with. But now with my new which will have long levels and many of them I''m going to need to make fast loops to have the game speed be acceptable. Just to let you know I wrote both games C++. Here is an example of the loops I would use in my previous game (Quackoo vs Tie Fighter)... Example 1 for (int x=0;x);x++) // Do some stuff... while(!infile.eof()) // Do more stuff... See that''s what I mean. I think my loops are just too slow. If you have any suggestions please tell me. I really need help with this.

Share this post


Link to post
Share on other sites
Advertisement
well....

you can''t really speed up a loop as such (i.e. a for loop is a for loop is a for loop). you can speed up the logic performed in each iteration of a loop which is what i think you''re asking. some things you just flat out have to do for every object in the game -> movement, collision detection, etc. you can speed up collision detection and other types of things that only happen to "close together" object by creating a spatial partitioning scheme like BSP or OctTrees. that way the collision detection part of the loop isn''t an O(n^2) algorithm.

your best case for an update loop is probably something like O(nlogn), but i''m really just making that up . a draw loop can similarly be made much more efficient by using the same spatial partitioning scheme to decide upon Potentially Visable Object Sets. sniff around the Articles & Resources Game Programming section (i think that''s the one with all the spatial partitioning schemes).

did i answer your question?

-me

Share this post


Link to post
Share on other sites
quote:
Original post by FrancoisSoft
I''m in the process of completing the levels for a new 2D game that I''m working on and I''m getting real close to creating the game engine. In the previous game I created, which was 2D, I use C-style loops but I didn''t have too many objects to deal with. But now with my new which will have long levels and many of them I''m going to need to make fast loops to have the game speed be acceptable. Just to let you know I wrote both games C++. Here is an example of the loops I would use in my previous game (Quackoo vs Tie Fighter)...

Example 1


for (int x=0;x);x++)
// Do some stuff...

while(!infile.eof())
// Do more stuff...


See that''s what I mean. I think my loops are just too slow. If you have any suggestions please tell me. I really need help with this.



The best way to quicken your loops is to unroll them. In other words, instead of looping 100 times, loop 10 times and repeat the instruction 10 times in the loop body. That''s an old DOS trick that Lamothe gave in his original Tricks Of The Game Programming Gurus... I don''t think you''ll see that much of an improvement but it should help out a bit. It''ll save you a couple of cycles per iterations.





[Cyberdrek | the last true sorcerer | Spirit Mage - mutedfaith.com]

Share this post


Link to post
Share on other sites
Here is an other loop question:

How does the infinite loop for( ;; ) { ... }
translate intto assembly. I mean, if I do not
have a init, cond and expr in the loop, does
the compiles also leave out the tests andstuff?

LOOP: ...
jump LOOP

Share this post


Link to post
Share on other sites
Like Cyberdrek said.

Unrolling the loop will cut down the loop overhead. A simple loop
like this:


      
// Calculate offset and determine memory address end before loop

while(offset<offsetEnd) // one compare, one conditional branch

{
destination[offset]= source[offset++] + 1; // one increment

} // one unconditional branch



Unrolling the process 10 times would cut the loop overhead down to about a 10th. However, this will bloat the code size and may cause cache problems for the CPU so don't go nuts.


  
// Calculate memory end offset and determine memory address before loop

while(destination<destinationEnd) // one compare, one conditional branch

{
destination[0]= source[0] + 1;
destination[1]= source[1] + 1;
destination[2]= source[2] + 1;
destination[3]= source[3] + 1;
destination[4]= source[4] + 1;
destination[5]= source[5] + 1;
destination[6]= source[6] + 1;
destination[7]= source[7] + 1;
destination[8]= source[8] + 1;
destination[9]= source[9] + 1;
destination += 10; // one addition

source += 10; // one addition

} // one unconditional branch



If there where 100 MOVs to be performed the loop overhead in the unrolled loop whould be 10 compares, contional and unconditional braches instead of 100. There are 20 additions in the unrolled vs 100 increments in the original. (Cutting out 350 instructions, sorry don't know the estimated cycle time that could save)

Or something like that.

Anyway unrolling loops will only have the biggest impact if the loop over head has a close ratio to the process like 1:2 but if it were 1:1000 unrolling the loop won't do much. (besides that process would take ~4000-5000 cycles)

Also: I hope an empty loop is converted to a simple unconditional jump by the complier!! I guess that depends on the complier.


[edited by - CodeJunkie on November 20, 2002 12:38:03 AM]



[edited by - codejunkie on November 21, 2002 2:50:31 PM]

Share this post


Link to post
Share on other sites
I''m not really trying to confuse anyone, but "That''s an old DOS trick" is quite true. Ever since the Pentium and branch prediction unrolling loops has become in many cases *more* cpu intensive than a normal loop. This is because branch prediction allows the processor to avoid misses within a loop (effectively making the conditional branch take 1 cycle, even appear to take "0" cycles on some). At the same time an unrolled loop can strain your cache (especially on a P4), which could mean 2 cache misses for *every* iteration of the loop.

Share this post


Link to post
Share on other sites
Good Point Michalson.

What size code will cause a cache miss then?
Could you possibly get away will unrolling a loop just a few times if its small enough?
I mean the loop has some overhead doesn't it? (yeah it might not be much thou)

Thanks

[edited by - CodeJunkie on November 20, 2002 9:10:52 PM]

Any help with organizing code in C/C++ to take advantage of new CPUs pipelining to avoid cache misses would be great.

[edited by - CodeJunkie on November 20, 2002 9:30:58 PM]

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Template specialization for unrolling a loop entirely can be good.

Share this post


Link to post
Share on other sites

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!