Tips for keeping a graphics-heavy game at 60fps?

5 comments, last by cozzie 8 years, 1 month ago

Hey guys,

What tips do you have for maintaining 60fps when developing games that are graphically really heavy?

Here are the things I've tweaked:

- Lighting.

- Shadows.

- Camera Draw Distance.

- LOD.

There are 4 image effects which shape the look of the game and can't be sacrificed, but they are kept to a minimum:

  • SSAO x2
  • Color correction
  • Contrast Enhance

The game I'm developing holds at 20 frames per second when exported to laptops with these specs:

  • OS: Windows 7 or 8
  • Processor: Dual Core CPU @ 2.4 GHz
  • Memory: 3 GB RAM
  • Graphics: Nvidia GeForce GTX 280 / ATI Radeon HD 5830
  • Storage: 2 GB available space
  • Sound Card: DirectX 9.0c compatible

Here's a quick look at the graphics:

https://goo.gl/HGNi4f

https://goo.gl/8bKxaD

https://goo.gl/E72BWE

Unity-based answers are preferred; however, I'm certain this is general knowledge that applies to any engine.


This is a profiling question, so step 1 is always: take measurements.

For a target of 60Hz, you've got a budget of approximately 16ms (16.67ms in reality, but you probably want to keep that extra two-thirds of a millisecond spare for OS spikes).

First measure how much time each part of your scene is taking to render, and map that out against your budget to see where you're currently at.

Then adjust those numbers to where you'd like to be (in this version, they should sum to 16ms or less! :D)

I'm not too experienced with Unity, so I'm not sure how good its profiling tools are here... As an alternative, there are tools from NVidia/AMD/Intel/Microsoft that you may be able to use to gather this kind of data.
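To make the budgeting step concrete, here is a minimal sketch of mapping measured pass times against a frame budget. The pass names and timings are made-up placeholders, not measurements from any real game; you would substitute the numbers your profiler reports.

```python
def frame_budget_ms(target_hz):
    """Time available per frame, in milliseconds, at the given refresh rate."""
    return 1000.0 / target_hz


def budget_report(pass_times_ms, target_hz):
    """Return (total frame time, remaining headroom), both in milliseconds.

    A negative headroom means the frame is over budget.
    """
    total = sum(pass_times_ms.values())
    return total, frame_budget_ms(target_hz) - total


# Hypothetical GPU timings per pass, in milliseconds (assumption for illustration).
measured = {"shadows": 4.0, "gbuffer": 6.5, "lighting": 3.0, "ssao": 5.0, "post": 1.5}

total, headroom = budget_report(measured, target_hz=60)
print(f"total {total:.2f} ms, headroom {headroom:.2f} ms")

# List the most expensive passes first: these are the first candidates to look at.
for name, ms in sorted(measured.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s} {ms:5.2f} ms")
```

Once the total is known, the same table can be rewritten with target numbers per pass, which gives each pass its own budget to be held to.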

As an example, a frame-capture from my game currently looks like this:

[image: 8d5RWX8.png]

The total frame time is 10.585 ms in this capture (~94fps). That's already above 60Hz, so let's say I'm trying to get to 120Hz (a budget of ~8ms).

I can now look through each of the passes and objects and look for low-hanging fruit. The bigger something is on the graph, the more impact a micro-optimization to its shader will make. Even cutting out a single multiplication from a shader can have a big impact if it's being used to draw a million pixels.

The two rectangles right before FXAA are Hud at 0.061ms, and Tonemap at 0.198 ms -- so trying to micro-optimize the Tonemap shader probably isn't worth it.

Since tonemapping is taking ~0.2ms, a 10% improvement in that shader would only bring me from 94.47fps (10.585ms) up to 94.65fps (10.565ms) :D

However, inside the GBuffer pass, I can see that there's an object called "Platforms_Beam_13|geometries_94|Substance_Library1.pewter_002" which alone is taking up 2.293ms, which is anomalous compared to the other objects in the scene -- it alone is over half of the GBuffer pass' cost, and I know that it's a tiny little ring at the bottom of a spotlight in the distance! This immediately stands out to me as a problem, so it would go on the top of my list of problems to investigate. If I can eliminate that anomalous object, then I might be able to bump the frame-rate from ~94fps to ~116fps.
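The arithmetic behind these estimates is easy to script, which helps when deciding what is worth optimizing. This is a small sketch using the timings quoted above; the assumption that roughly 2 ms of the anomalous object's 2.293 ms could be recovered is mine, chosen because it matches the ~116fps estimate.

```python
def fps(frame_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


def fps_after_saving(frame_ms, saved_ms):
    """Frame rate after shaving `saved_ms` milliseconds off the frame."""
    return fps(frame_ms - saved_ms)


FRAME_MS = 10.585      # total frame time from the capture
TONEMAP_MS = 0.198     # cost of the tonemap pass
ANOMALY_SAVED_MS = 2.0 # assumed saving from fixing the anomalous object

baseline = fps(FRAME_MS)
tonemap_gain = fps_after_saving(FRAME_MS, TONEMAP_MS * 0.10)  # 10% faster tonemap
anomaly_gain = fps_after_saving(FRAME_MS, ANOMALY_SAVED_MS)

print(f"baseline        {baseline:.2f} fps")
print(f"tonemap -10%    {tonemap_gain:.2f} fps")
print(f"anomaly fixed   {anomaly_gain:.2f} fps")
```

The same small change in milliseconds is worth more frames per second the shorter the frame already is, which is one reason to compare costs in milliseconds rather than in fps.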

After addressing that problem, I might decide that the sky is taking up too much of my budget -- I'm spending ~1.5ms on lighting, but ~3ms on drawing the background. That intuitively seems out of balance. So the next item on my list would be to go over the algorithms being used by the sky renderer -- are there any algorithmic improvements that can change the big-O cost of the pass? After that, can I simply change the amount of work being done as a quality trade-off -- can I render it at a reduced resolution and then up-scale the results? Mixed resolution rendering for low-frequency-detail passes often delivers performance improvements approaching 4x to 16x. After that, I'd look at micro-optimizing the shader code. Can the code be mathematically rearranged to do the same thing with fewer instructions? Can multiple shaders be joined into one, or can one big shader be split into multiple simpler ones to make them run on the GPU more efficiently? Can I change the texture formats being used to reduce bandwidth? Can the data be packed better? For these micro-optimization tasks, a shader profiler from AMD/NVidia/Intel is a great tool to have, as it can tell you theoretical performance characteristics of a shader just by looking at the code, letting you experiment with these trade-offs quickly.
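The 4x and 16x figures for mixed-resolution rendering come straight from pixel counts: halving the resolution on each axis shades a quarter of the pixels, and quartering it shades a sixteenth. A quick sketch, assuming a 1920x1080 target (my choice of resolution) and ignoring the cost of the up-scale pass:

```python
def pixel_count(width, height, scale=1.0):
    """Number of pixels shaded when rendering at `scale` of full resolution per axis."""
    return int(width * scale) * int(height * scale)


WIDTH, HEIGHT = 1920, 1080  # assumed render target size

full = pixel_count(WIDTH, HEIGHT)
half = pixel_count(WIDTH, HEIGHT, 0.5)
quarter = pixel_count(WIDTH, HEIGHT, 0.25)

print(f"full res:    {full} pixels")
print(f"half res:    {half} pixels ({full / half:.0f}x fewer)")
print(f"quarter res: {quarter} pixels ({full / quarter:.0f}x fewer)")
```

This is an upper bound on the saving for a pass whose cost is dominated by per-pixel shading; the up-scale and any edge-aware filtering take some of it back.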

After that, maybe I've recovered 1.5ms from the sky pass, and 2ms from the gbuffer pass, which would bring me down to 8ms / 120Hz :)

I'd then have to repeat this work for many different scenes and view-points within the game, to ensure that the different passes remain within their budgets at all times. If a particular level causes the renderer to exceed its budgets, then you can either repeat this work of optimizing the renderer, or work alongside the content creators to help them optimize their level by removing/rearranging object/effect placements.

[edit] Everything I've mentioned above is focusing on GPU time per frame.

It's important though to really understand the difference between GPU frametime and CPU frametime. You need tools that can measure both of them independently from each other. Whichever one is higher is your "bottleneck". e.g. maybe your GPU frametime is 8ms, meaning it could be running at 120Hz, but your CPU frametime is 50ms, meaning that you're stuck at 20Hz :(
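As a minimal sketch of that rule, assuming the CPU and GPU work overlap so that the slower of the two sets the frame rate (the function name is just for illustration):

```python
def bottleneck(cpu_ms, gpu_ms):
    """Return which processor limits the frame rate, and the resulting fps.

    Assumes CPU and GPU run in parallel, so the frame takes as long as
    the slower of the two.
    """
    frame_ms = max(cpu_ms, gpu_ms)
    limiter = "CPU" if cpu_ms > gpu_ms else "GPU"
    return limiter, 1000.0 / frame_ms


# The example from the text: GPU could do 120Hz, but the CPU needs 50ms per frame.
limiter, achieved_fps = bottleneck(cpu_ms=50.0, gpu_ms=8.0)
print(f"{limiter}-bound at {achieved_fps:.0f} fps")
```

In this case no amount of GPU-side optimization changes the result until the CPU time drops below the GPU time.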

Whichever processor is the bottleneck is the one that you should optimize for first. Usually if the GPU is the bottleneck, then the CPU will spend a lot of time idling inside a function like SwapBuffers, Flip, or Present -- which is where it waits for the GPU to finish the previous frame's commands.

e.g. another shot from my engine -- the rendering thread on the CPU submits a whole load of draw-calls (the rainbow of stripes), and then gets stuck inside the Present function, indicating that the GPU is running too slow, causing the CPU to wait for it to catch up.

[image: LdkCNjX.png]

If the CPU is actually your bottleneck, then micro-optimizing shaders is useless. Instead, as frob mentions below, you probably need to look into reducing batch counts to reduce the amount of work that the CPU rendering thread is required to perform.

It is common in Unity games for beginners to have a very high number of draw calls, many of them poorly batched. Features like real-time lighting, real-time shadowing, and reflection probes look pretty and are easy to add, but have costs beginners don't know how to control. Those with small budgets tend not to create carefully-crafted texture atlases and reusable shaders. Couple the two -- multiple renderings of unbatched scenes requiring many draw calls -- and it creates big problems very quickly.

Unity's profiler is available in the free version these days, and the tool is quite easy to use to track down the big performance blockers. Fortunately there are also free or low-cost tools out there that can help. For high draw calls and poor batching, I've worked with a few that let you put all your static resources underneath a parent object, and at run time they merge them all together into a small number of static batches; not as good as what skilled artists, modelers, and technical artists could do, but far easier since you're really just dragging a few items in the hierarchy. Similarly, a few small tweaks to other features like lighting can switch to a mixed mode where most is baked and then merged with the non-static elements.

There are other small changes that are easy to make; it all depends on what the actual bottlenecks are. Use Unity's built-in profiler, look for the big things, then fix them, or search the Internet to see how other people fixed them.

This is a profiling question, so step 1 is always: take measurements.

+1


most people start with random tweaks when they want to improve performance. some "stupid" ideas turn out good, while others don't, but because there was never any profiling, you don't know why, nor whether there was a simpler/smarter way.

profile profile profile.
once you see what is going on, it starts to become approachable.

Unity has hardly ever been known for performance out of the box.


Honestly, that profiler as hodge mentioned can tell you a lot about what's going on. Unity has one built into its system too.

It can seriously be any number of things. A script that's eating too much time. Too many graphics probes. Bad batching... which honestly I sincerely doubt will be a major issue with those empty corridors.

Making a guess, I'd say take a look at your particles and your lighting. Your lighting isn't heavy, but it's complex. It looks like you put a light directly inside of a lamp, which is projecting shadows onto the environment in an awkward way.

lastly... it's also very possible you don't have the scene being properly culled... so it's rendering everything. EVERYTHING.

As others have said, for emphasis:

Step 1: Find your bottleneck... is it really the GPU? Unity is extremely draw-call intensive (many objects get rendered in multiple draw calls even if a simple shader is used) -> this creates CPU bottlenecks, as the render thread is stuck on a single core.

Step 2: reduce the amount of draw calls if they are the problem (easy to find with the Unity profiler. If the CPU takes most time, most of the CPU cost is graphics, and you see like 4000 draw calls in the stats, that is your problem #1).

Combine meshes as much as you can. You can do it offline in a 3D Tool like Blender, you can write your own Unity tool to do it, or there are third party assets in the store if you can spend something to do it. Reduce the amount of non-batchable objects (moving objects for example).
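To get a feel for why combining meshes matters, here is a deliberately simplified model, not how Unity actually counts batches: assume static objects sharing a material can be merged into one draw call, and every moving object costs its own call. It ignores lights, shadow passes, and vertex limits per batch. The object lists are made up.

```python
def estimate_draw_calls(objects):
    """Rough draw-call estimate under the simplified model described above.

    `objects` is a list of (material_name, is_static) tuples.
    """
    static_materials = {material for material, is_static in objects if is_static}
    dynamic_count = sum(1 for _, is_static in objects if not is_static)
    return len(static_materials) + dynamic_count


# Hypothetical scene: walls and pipes never move, crates do.
scene = (
    [("stone", True)] * 300    # wall segments
    + [("metal", True)] * 100  # pipes
    + [("crate", False)] * 20  # physics objects
)

unbatched = len(scene)                # one call per object
batched = estimate_draw_calls(scene)  # merged static geometry
print(f"unbatched: {unbatched} draw calls, batched: {batched} draw calls")
```

The model also shows why sharing materials (texture atlases, reusable shaders) and keeping objects static both help: each extra material or moving object adds to the count.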

Check your lights. If you are using multiple lights with shadows, they will contribute quite a lot to your draw calls.

Step 3: If the problem is on the GPU side, see if you are using multiple lights. If you are using a forward rendering path, you should use only a single main light, and go easy on additional lights as the forward path is really not made to handle multiple light sources.

Check your shadow settings. Realtime shadows are EXTREMELY expensive, especially if cast from multiple light sources. Disable shadows for objects that do not need them (too small, like grass or small rubble). Bake shadow maps and only activate realtime shadows for lights that do need them (the ones where it would look off if moving objects didn't cast a shadow)...

There are options to "fake" shadows for moving objects. If shadows play a big part in your game, you have a lot of moving objects, and you want to keep using Unity and its default renderer (other renderers available in the asset store CLAIM to use a different method to draw shadows that MIGHT be performing better), consider "cheating" and only using shadow maps and fake shadows (for example blob shadows, or decals) to give moving objects some shadows.

Step 4: aside from the graphics, consider other bottlenecks. Are you instantiating stuff? DON'T! EVER! Write object pools, instantiate at startup, and then recycle everything. Never, ever instantiate stuff during a running game. When I was rendering a massive number of instantiated meshes with physics rigidbodies and getting FPS dips, I found that the instantiating was the big problem... not the meshes or draw calls (the 100 or so objects didn't make much of a difference once I made sure they did not cast shadows), and not the physics (PhysX can handle 100 rigidbodies just fine on the single thread it was running on). Instantiating and destroying was the culprit. Once I started using an object pool, and only deactivated and moved objects before reactivating them, the FPS dipped by just 1-2 for all the objects flying around, instead of dropping to 20-ish from the normal 35 FPS (yes, the game was not well optimized at the time). I took additional steps to limit the number of rigidbodies at any time, but that is beside the topic.

I repeat: Never, ever instantiate or destroy anything during game runtime.
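For anyone who hasn't written one, here is an engine-agnostic sketch of an object pool. The `Projectile` class and its fields are hypothetical stand-ins; in Unity the equivalent would be creating the GameObjects once at startup and toggling them with SetActive instead of calling Instantiate and Destroy during play.

```python
class Projectile:
    """Hypothetical pooled object; stands in for whatever the game spawns."""

    def __init__(self):
        self.active = False
        self.position = (0.0, 0.0, 0.0)


class ObjectPool:
    """Creates all objects up front and recycles them instead of allocating."""

    def __init__(self, factory, size):
        self._free = [factory() for _ in range(size)]  # allocated once, at startup
        self._active = []

    def acquire(self):
        """Hand out an inactive object, or None if the pool is exhausted."""
        if not self._free:
            return None  # caller decides: skip the spawn or recycle the oldest
        obj = self._free.pop()
        obj.active = True
        self._active.append(obj)
        return obj

    def release(self, obj):
        """Deactivate an object and return it to the pool for reuse."""
        obj.active = False
        self._active.remove(obj)
        self._free.append(obj)


pool = ObjectPool(Projectile, size=2)
first = pool.acquire()
first.position = (1.0, 2.0, 3.0)  # move it into place instead of creating a new one
pool.release(first)
reused = pool.acquire()
print(reused is first)  # the same object comes back; nothing new was allocated
```

Returning None when the pool is empty is a design choice that also caps how many objects can exist at once, which is one way to limit the number of rigidbodies alive at any time.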

Step 5: Really think about if the weedy laptop you are using there should be minimum or recommended hardware. If it is minimum, maybe lower graphics settings.

Step 6: Just to reiterate Tangetails point: make sure there is culling going on. Without occlusion culling (Unity's built-in system is based on Umbra), Unity only culls against the camera's view frustum, so it will render everything in front of the camera even when it is hidden behind a wall. If you are using a large level, you might be rendering hundreds of rooms hidden by the wall in front of you. Worst case, you built the level from components (with every wall being its own object) and didn't batch them. That might be thousands of draw calls just for looking at a wall! Not to mention all the lights and shadows not being culled.

The easiest way to spot such issues is to turn around in a fairly empty room. If the FPS/draw call/other stats vary wildly depending on which direction you turn to, while the visible geometry stays more or less the same in complexity, you are not properly culling.
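The turn-around test can be turned into a crude check if you note down the draw-call count from the stats overlay at a few headings. This is only a sketch: the sample numbers are invented, and the 3x threshold is an arbitrary assumption you would tune for your own scenes.

```python
def looks_unculled(draw_calls_by_heading, max_ratio=3.0):
    """Flag a spot where draw calls swing wildly as the camera turns in place.

    `draw_calls_by_heading` maps a camera heading in degrees to the draw-call
    count observed there. `max_ratio` is an arbitrary threshold (assumption).
    """
    counts = list(draw_calls_by_heading.values())
    return max(counts) > min(counts) * max_ratio


# Hypothetical readings taken while standing in one small, fairly empty room.
suspicious_room = {0: 150, 90: 170, 180: 2400, 270: 160}
healthy_room = {0: 150, 90: 170, 180: 160, 270: 155}

print(looks_unculled(suspicious_room))  # large spike when facing one wall
print(looks_unculled(healthy_room))     # roughly constant in every direction
```

A spike in one direction suggests that hidden geometry behind that wall is still being submitted, which is the case occlusion culling is meant to handle.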

My few cents:
- profiling, as mentioned
- don't draw what you don't see
- don't try to develop specific components which others can do better
- reuse data where possible
- use the right components for the right tasks (CPU, GPU, types of memory etc)

Crealysm game & engine development: http://www.crealysm.com

Looking for a passionate, disciplined and structured producer? PM me

