DzoniGames

Game Optimization



Hey!

 

So for the past month and a half I have been working on optimizing my game, because the FPS was quite low even on my high-spec PC. I've used a few techniques like occlusion culling, but I've had an idea which I couldn't really google because I didn't know how to describe it in a few words.

Here is what I thought of doing:

Say you have an if statement that checks several things, for example whether value 1 equals 10 and value 2 equals 13. It checks two things, so it does more work; not much more, but imagine that across hundreds or even thousands of if statements that each check two or more things. My idea is to check only whether value 1 equals 10 and, inside that, use a nested if statement to check whether value 2 equals 13. The nested if statement isn't looked at until value 1 equals 10, which should increase performance ever so slightly. Again, a single check doesn't eat up much performance, but I have a lot of if statements that check more than 5 things. I'm not sure whether anyone does this. I've only been working with computers for 3 years, didn't know much about them until about a year ago, and wasn't that good at coding, so I don't know how much this would increase performance, or whether it would do anything at all. Please tell me your opinion on this.

 

Thank you for reading :)

 

PS: I'm sorry for the title being so undescriptive but I didn't want to put "Game Optimization Theory" or something like that because maybe people actually do this and I didn't know about it. 

 


It's not even for performance reasons; short-circuit evaluation is required to make your code work properly, e.g.

if(pCamera!=nullptr && pCamera->IsActive()==true)...

You don't want the second condition to always be evaluated, because when pCamera is null it would dereference a null pointer and lead to a memory access violation.
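To make the point above concrete, here is a minimal sketch. The `Camera` type, the counter and the function names are hypothetical and exist only for this illustration. It shows that `&&` already skips its right-hand side when the left-hand side is false, so the nested-if version performs exactly the same checks as the single condition.

```cpp
#include <cassert>

// Hypothetical camera type, only for illustration.
struct Camera {
    bool active = false;
    bool IsActive() const { return active; }
};

// Counts how often the second comparison runs, to make short-circuiting visible.
static int g_secondChecks = 0;

bool secondIs13(int value2) {
    ++g_secondChecks;
    return value2 == 13;
}

// Single condition: && evaluates the right side only if the left side is true.
bool combined(int value1, int value2) {
    return value1 == 10 && secondIs13(value2);
}

// Nested form: same evaluation order, same result.
bool nested(int value1, int value2) {
    if (value1 == 10) {
        if (secondIs13(value2)) {
            return true;
        }
    }
    return false;
}

// Null guard: IsActive() is never called when pCamera is nullptr.
bool cameraIsActive(const Camera* pCamera) {
    return pCamera != nullptr && pCamera->IsActive();
}

// Usage sketch: the second check is skipped whenever value1 != 10.
void demoShortCircuit() {
    g_secondChecks = 0;
    assert(!combined(1, 13) && !nested(1, 13));
    assert(g_secondChecks == 0);        // the right-hand side never ran
    assert(!cameraIsActive(nullptr));   // no crash: IsActive() was not called
}
```

Since both forms do the same work, rewriting `a && b` as nested ifs should not be expected to change performance.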


Most compilers tend to be fairly smart about things like this. Low-level optimizations like this one are generally easy for the compiler to recognize, and it can be more efficient at them because it has all the information it needs. What you should concentrate on is high-level optimizations, where you have information that the compiler doesn't.


I've seen many cases where a set of branches could be optimized to a single branch and a jump table by replacing if(){} else if(){} else{} with a switch statement, where the conditions were complex enough that the compiler could not work that out by itself. Although there's no guarantee of how the compiler will implement a switch statement, it will generally be faster than a bunch of branches. This is something to look into.
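As a rough illustration of the rewrite described above, here is the same mapping written first as an if/else-if chain and then as a switch. `EntityKind` and the cost values are made up for this sketch. For a chain this simple an optimizer may well generate identical code for both; the rewrite matters more when the chain is written in a way that hides the fact that every test looks at the same value.

```cpp
#include <cassert>

// Hypothetical entity kinds, used only for this illustration.
enum class EntityKind { Player, Enemy, Pickup, Projectile };

// Chain of comparisons: written as up to three sequential tests.
int updateCostIfChain(EntityKind kind) {
    if (kind == EntityKind::Player) {
        return 50;
    } else if (kind == EntityKind::Enemy) {
        return 30;
    } else if (kind == EntityKind::Pickup) {
        return 5;
    } else {
        return 10;
    }
}

// Same mapping as a switch: dense integral cases allow the compiler to pick
// a jump table or lookup table, although it is not required to do so.
int updateCostSwitch(EntityKind kind) {
    switch (kind) {
        case EntityKind::Player:     return 50;
        case EntityKind::Enemy:      return 30;
        case EntityKind::Pickup:     return 5;
        case EntityKind::Projectile: return 10;
    }
    return 0; // not reached for valid enum values
}

// Usage sketch: both versions agree for every kind.
void checkSameResults() {
    const EntityKind kinds[] = {EntityKind::Player, EntityKind::Enemy,
                                EntityKind::Pickup, EntityKind::Projectile};
    for (EntityKind kind : kinds) {
        assert(updateCostIfChain(kind) == updateCostSwitch(kind));
    }
}
```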

Another common case is branches being ordered poorly. I generally try to order branches in order of their likelihood, for the sake of optimizing branch prediction. For example, if you know that one condition will be true in 95% of cases, it's a good idea to put the branch for this condition first, so that the processor doesn't have to branch through less predictable branches in order to get to the predictable one.

Sometimes it's hard to predict which branch will be taken (maybe it depends on user input, or on the contents of some file that can be replaced or edited). In that case, order the branches by cost: if one branch is a "fast path" (i.e. it returns immediately), another is a "slow path" (i.e. it has to do some complex calculations, process a bunch of memory, etc.), and maybe a third pretty much shouldn't happen (maybe it just throws an exception or logs an error), you want to evaluate the fast path first.
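A small sketch of that ordering, under the assumption that a cached result is the cheap common case. `VisibilityState` and `visibleCount` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical states for some per-frame visibility data.
enum class VisibilityState { Cached, NeedsRebuild, Corrupt };

// Returns how many objects are visible.
int visibleCount(VisibilityState state, int cachedCount,
                 const std::vector<int>& visibleFlags) {
    assert(cachedCount >= 0);

    // Fast path first: returns immediately.
    if (state == VisibilityState::Cached) {
        return cachedCount;
    }

    // Slow path: has to walk the whole array.
    if (state == VisibilityState::NeedsRebuild) {
        int count = 0;
        for (int flag : visibleFlags) {
            if (flag != 0) {
                ++count;
            }
        }
        return count;
    }

    // Path that pretty much shouldn't happen: checked last.
    throw std::logic_error("corrupt visibility data");
}
```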


But chances are your game doesn't have enough branch mispredictions to reduce FPS significantly. If your FPS is 30, the game draws a frame every 33.3 ms. You want to draw at 60 FPS or more, i.e. every 16.7 ms or less, so you need to save about 16.7 ms per frame. The average branch misprediction costs about 20 ns (0.00002 ms), so you'd need to shave off about 830,000 branch mispredictions per game loop. Do you even have that many branches? The problem probably lies somewhere else.
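The arithmetic above can be checked with a few lines. The 20 ns figure is the estimate from the post, not a measured value, and the helper names are made up for this sketch.

```cpp
#include <cassert>

// Frame time in milliseconds for a given frame rate.
constexpr double frameTimeMs(double fps) {
    return 1000.0 / fps;
}

// How many events of a given cost (in nanoseconds) fit into a time span
// given in milliseconds. 1 ms = 1,000,000 ns.
constexpr double eventsInBudget(double budgetMs, double costNs) {
    return budgetMs * 1.0e6 / costNs;
}

// Usage sketch: going from 30 FPS to 60 FPS means saving about 16.7 ms.
void checkBudget() {
    const double toSaveMs = frameTimeMs(30.0) - frameTimeMs(60.0); // about 16.7 ms
    const double count = eventsInBudget(toSaveMs, 20.0);           // about 833,000
    assert(count > 830000.0 && count < 840000.0);
}
```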

Edited by nfries88


Echoing some of the posts above: that technique may net you a 1-2 FPS increase after digging through all of your code. It may be worth doing, but something else is likely causing the slowdown. It would be more effective to set timers before and after your functions to see what's eating up your time. Once you've got a list of your slowest functions, you can prioritize which ones need to be more efficient and optimize those where possible.
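One possible way to set up such timers is a small RAII helper that adds the elapsed time to a table keyed by label. This is only a sketch: `ScopedTimer`, `g_profile` and `slowestFirst` are hypothetical names, and the choice of `std::chrono::steady_clock` is an assumption of this example, not something the thread prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Accumulated time per label, in nanoseconds (hypothetical global table).
static std::map<std::string, long long> g_profile;

// RAII timer: measures from construction to destruction and adds the
// elapsed time to the table entry for its label.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {
        assert(!label_.empty());
    }
    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        g_profile[label_] +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Returns (label, total nanoseconds) rows sorted from slowest to fastest.
std::vector<std::pair<std::string, long long>> slowestFirst() {
    using Row = std::pair<std::string, long long>;
    std::vector<Row> rows(g_profile.begin(), g_profile.end());
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a.second > b.second; });
    return rows;
}
```

Putting `ScopedTimer timer("UpdatePhysics");` as the first line of a function times that whole call; after a few seconds of play, `slowestFirst()` gives the list to prioritize.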

 

Depending on your code, some functions may not need to run every loop. Some, like AI, can run every 5th or 10th loop. If that creates a bottleneck, you can put the objects that need to call it in a queue and then process X of them each loop.
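Both ideas from the paragraph above can be sketched in a few lines. `Agent`, `AiScheduler` and `shouldRunThisFrame` are hypothetical names for this illustration, and the round-robin queue is just one of several ways to spread the work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical agent; thinkCount records how often its AI ran.
struct Agent {
    int thinkCount = 0;
    void think() { ++thinkCount; }
};

// Round-robin scheduler: at most maxPerFrame agents think per game loop.
class AiScheduler {
public:
    explicit AiScheduler(std::size_t maxPerFrame) : maxPerFrame_(maxPerFrame) {
        assert(maxPerFrame > 0);
    }

    void add(Agent* agent) {
        assert(agent != nullptr);
        queue_.push_back(agent);
    }

    // Call once per game loop iteration.
    void update() {
        const std::size_t count = std::min(maxPerFrame_, queue_.size());
        for (std::size_t i = 0; i < count; ++i) {
            Agent* agent = queue_.front();
            queue_.pop_front();
            agent->think();
            queue_.push_back(agent); // back of the line until its next turn
        }
    }

private:
    std::size_t maxPerFrame_;
    std::deque<Agent*> queue_;
};

// Simpler variant: run a task only on every Nth loop.
bool shouldRunThisFrame(unsigned long frameIndex, unsigned long interval) {
    assert(interval > 0);
    return frameIndex % interval == 0;
}
```

With the queue, the cost per loop stays bounded by `maxPerFrame` no matter how many agents exist; the trade-off is that each agent reacts later as the queue grows.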


Yes, I recommend profiling your game loop and determining where it is spending the most time. I typically log such information in debug (read: development) builds so I can choose where to dedicate my optimizing hours.

A typical timer function with millisecond resolution will probably not be suitable for this: many functions worth optimizing will measure as 0 because they run in less than 1 ms. You need a higher-resolution timer. In C++11, there is std::chrono::high_resolution_clock::now().
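A minimal sketch of timing a sub-millisecond function with that clock: it runs the function many times and reports the average per call in nanoseconds. `averageNanoseconds` is a hypothetical helper, and the actual resolution depends on the standard library implementation.

```cpp
#include <cassert>
#include <chrono>

// Average duration of one call to fn, in nanoseconds, over `iterations` calls.
// Averaging over many calls helps when a single call is shorter than the
// clock's real resolution.
template <typename Fn>
double averageNanoseconds(Fn fn, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::high_resolution_clock;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const Clock::time_point end = Clock::now();

    const auto total =
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - start);
    return static_cast<double>(total.count()) / iterations;
}
```

For example, `averageNanoseconds([&] { updateParticles(); }, 1000)` would time a hypothetical `updateParticles()` function; keep in mind that an optimizer can remove work whose result is never used.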

The high_resolution_clock is not always very good. Some implementations are particularly bad, such as Visual C++ 2012 and 2013 operating at multi-millisecond resolution.

On the PC you can probably use the __rdtsc() intrinsic to read the CPU's time stamp counter, which is supported on most major compilers.



"The high_resolution_clock is not always very good. Some implementations are particularly bad, such as Visual C++ 2012 and 2013 operating at multi-millisecond resolution."

I use GCC so I really did not know this. What did they do, wrap GetTickCount() instead of QueryPerformanceCounter()?


"On the PC you can probably use the __rdtsc() intrinsic to read the CPU's time stamp counter, which is supported on most major compilers."

That intrinsic is compiler-specific rather than standard C++, and because it only returns the cycle count elapsed since power-on, it also fails as a constant measure of time: power settings can change the processor's clock frequency. There is also no guarantee that the TSC is the same on every processor in a system, and no guarantee that the thread reading the TSC stays on the same processor at all times. It can produce spurious errors and should not be used.
On Windows, if high_resolution_clock does not have at least 1 µs resolution, use QueryPerformanceCounter instead. Linux and the BSD variants have clock_gettime(CLOCK_MONOTONIC). OS X and iOS require Mach kernel time: mach_absolute_time().
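A rough sketch of wrapping the three platform timers named above behind one function. `monotonicNanoseconds` is a hypothetical name, the preprocessor checks are a simplifying assumption, and error handling is kept to a minimum.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_WIN32)
  #include <windows.h>
#elif defined(__APPLE__)
  #include <mach/mach_time.h>
#else
  #include <time.h>
#endif

// Monotonic time in nanoseconds since an arbitrary, platform-defined start.
// Only differences between two readings are meaningful.
std::uint64_t monotonicNanoseconds() {
#if defined(_WIN32)
    LARGE_INTEGER frequency;
    LARGE_INTEGER counter;
    QueryPerformanceFrequency(&frequency); // ticks per second
    QueryPerformanceCounter(&counter);     // current tick count
    const std::uint64_t ticks = static_cast<std::uint64_t>(counter.QuadPart);
    const std::uint64_t freq = static_cast<std::uint64_t>(frequency.QuadPart);
    // Convert whole seconds and the remainder separately to avoid overflow.
    return (ticks / freq) * 1000000000ULL + (ticks % freq) * 1000000000ULL / freq;
#elif defined(__APPLE__)
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);         // ratio of ticks to nanoseconds
    return mach_absolute_time() * timebase.numer / timebase.denom;
#else
    timespec ts;
    const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; // silence the unused warning when asserts are disabled
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL +
           static_cast<std::uint64_t>(ts.tv_nsec);
#endif
}
```

Taking one reading before and one after a function and subtracting them gives the elapsed time in nanoseconds.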
 
