cozzie

Profiling results, GPU or CPU bound?

8 posts in this topic

Hi,

After some refactoring and adding new features in my engine, I've been doing some profiling.

I did it on both a simple scene and a more complex scene.

 

You can see all results here:

http://www.sierracosworth.nl/gamedev/2014-07-29_profiling/

 

It's actually quite fun to check your code against one full frame and all 'underlying' d3d calls that are done.

 

But now you may be wondering: what's the question?

 

In the screenshot below you'll see an example of the 'peaks' (stutter) I'm getting in total frame time. I'm trying to find out whether they are caused by too much work for the GPU (the CPU is waiting), or whether the GPU is waiting because the CPU isn't delivering enough. My first thought is that the GPU is too busy to handle everything the CPU delivers.

 

What are your thoughts?

 

complexer_scene_gpu-cpu3-whats-this.jpg

 


The standard way to work out what the bottleneck is is to give the GPU less work to do, and see if it goes any faster. If it does then you're not completely CPU bound. You need to do this while still making the same set of draw calls. There are various ways to do this, including:

 

- Reduce the screen resolution or render target size.

- Set a tiny scissor rectangle.

- Replace pixel / vertex shaders with simple ones.
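
A minimal sketch of that comparison in C++, assuming you have already measured an average frame time for a normal run and for a run with the same draw calls but less GPU work. The function name, the enum, and the 10% threshold are illustrative assumptions, not part of D3D:

```cpp
#include <cassert>

// Hypothetical result type; the names are illustrative only.
enum class Bottleneck { AtLeastPartlyGpuBound, CpuBound };

// baselineMs: average frame time with the normal GPU workload.
// reducedMs:  average frame time with the same draw calls but less GPU work
//             (smaller render target, tiny scissor rectangle, or trivial shaders).
// minSpeedup: assumed noise threshold; 0.10 means "at least 10% faster".
inline Bottleneck classifyBottleneck(double baselineMs, double reducedMs,
                                     double minSpeedup = 0.10) {
    assert(baselineMs > 0.0 && reducedMs > 0.0);
    const double speedup = (baselineMs - reducedMs) / baselineMs;
    // If cutting GPU work made the frame clearly faster, the GPU was
    // at least part of the limit; otherwise the CPU side dominates.
    return speedup >= minSpeedup ? Bottleneck::AtLeastPartlyGpuBound
                                 : Bottleneck::CpuBound;
}
```

The threshold only exists to ignore measurement noise; pick it based on how much your frame times vary from run to run.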

 

Having said that, if you're looking at single frame spikes with D3D a common cause is using something (shader / texture / vertex buffer / etc.) for the first time. This is because D3D drivers are generally lazy and only fully initialize things on first use. While this saves on doing work for things you never use, in most cases it's really unhelpful. The workaround is to do a bunch of off screen draw calls on the loading screen that make use of every texture and shader. That means you don't have to wait for the driver to, for example, upload a bunch of textures to video RAM when the end of level boss appears.
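
The warm-up pass could be structured roughly like this. It is a sketch only: `Resource`, `prewarmResources`, and the `drawOffscreen` callback are hypothetical names, and the callback stands in for whatever off-screen draw call your engine would issue with that texture or shader bound.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource record; a real engine would hold D3D objects here.
struct Resource {
    explicit Resource(std::string n) : name(std::move(n)), usedOnce(false) {}
    std::string name;
    bool usedOnce;  // stands in for "the driver has seen this in a draw call"
};

// Issues one off-screen draw per not-yet-used resource through the
// caller-supplied callback and returns how many resources were warmed up.
inline int prewarmResources(std::vector<Resource>& resources,
                            const std::function<void(Resource&)>& drawOffscreen) {
    assert(drawOffscreen);  // an empty callback would throw when invoked
    int warmed = 0;
    for (Resource& r : resources) {
        if (r.usedOnce) continue;  // already used in a draw call
        drawOffscreen(r);          // e.g. a tiny quad drawn to an off-screen target
        r.usedOnce = true;
        ++warmed;
    }
    return warmed;
}
```

Calling this once on the loading screen, with every texture and shader the level needs, moves the first-use cost away from gameplay.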


Hi adam, thanks.

I've played around a bit and did a new test without normal mapping (79 out of 134 materials); the peaks were all gone.

After this I was also told that the number of backbuffers (in my case 1, so double buffering) could be the issue. So I upped that to 2 (triple buffering). Now I get a lot fewer peaks.

 

So I assume the GPU was waiting for new work until the V-sync time had passed; now, with an extra buffer, there's enough to render.

What do you think?


In general profiling with vsync enabled isn't very useful, because the delays waiting for the vsync tend to hide the real performance. I'd recommend doing all profiling with vsync off.

 

Enabling triple buffering can hide even more performance issues than vsync alone, but it will also generally give a better experience than double buffering with vsync when the game is running slower than the refresh rate.

 

Also note that you want to profile an optimized build, which wasn't started with the debugger attached (starting with the debugger attached puts the Windows heap into a relatively slow debug mode).

 

What's the performance like with vsync off?


"This is because D3D drivers are generally lazy and only fully initialize things on first use."

As a side note: other drivers do this as well. Current AMD OpenCL sure does.

Edited by Krohm

Thanks both, I've done some serious profiling after reading your remarks.

For the number of backbuffers I've stuck with 2 (triple buffering), which I will use in the end anyway.

I did 2 new runs and measured a lot of things and also created some graphs.

The runs were done with and without v-sync enabled.

 

You will find all details below, please let me know your thoughts.

 

Overall summary:

2014-07-31_profiling%20summary.jpg

 

Graph on the run with V-sync:

2014-07-31_run2%20-%201680x1050%20vsync%

 

And 2 graphs on the run without V-sync (split into 2 x 30,000 frames):

2014-07-31_run1%20-%201680x1050%20no-vsy

2014-07-31_run1%20-%201680x1050%20no-vsy


With vsync off it looks like your game is generally running fast enough. There are a few spikes where frames get close to 16ms, which could be explained by my first post. Other than that it's running at about 8ms / frame.

 

I don't trust those CPU performance numbers. The vsync off numbers should have roughly the same CPU usage as with vsync on. My guess is that you're measuring the CPU time spent idle waiting for the GPU in those numbers. That generally happens in the Present() call.

 

With vsync off you have a couple of spikes, which are close to 3 times the normal frame time (~50ms). One possible explanation is that somehow the CPU and GPU were forced to synchronize on those frames. D3D buffers up to 3 frames worth of GPU commands when it's GPU bound, and forcing a synchronization will flush that queue (so the CPU gets blocked for 3 GPU frames). Forcing synchronization generally happens when you lock something that is in use by the GPU (vertex buffer / texture / etc). See http://msdn.microsoft.com/en-gb/library/windows/desktop/bb205132%28v=vs.85%29.aspx#Accessing for a D3D10 based explanation of that.

 

In addition, allowing the GPU to get three frames ahead can give you extra lag that you may not want. You can use queries to prevent it from getting so far ahead: you want two of them, where you issue one and wait for the other on every frame, and then swap over. Note that you do want to allow it to get one frame ahead, or you throw away lots of performance, because that is what allows the GPU and CPU to work in parallel (and for SLI systems you should allow it to get two frames ahead). For a quick test, Nvidia drivers also have a setting to control it called "maximum pre-rendered frames".
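
The two-query scheme could look roughly like the sketch below. `FrameLimiter` and the `issue()` / `isDone()` wrapper methods are hypothetical; in D3D9 they would plausibly wrap an event query (`IDirect3DQuery9::Issue(D3DISSUE_END)` and polling `GetData` with `D3DGETDATA_FLUSH`), but the code here is written against a generic `Query` type so it does not depend on the D3D headers.

```cpp
#include <cassert>
#include <utility>

// Query is any type providing issue() and isDone(). Both are assumed wrapper
// methods around a GPU event query, not real D3D method names.
template <typename Query>
class FrameLimiter {
public:
    FrameLimiter(Query* a, Query* b) : current_(a), previous_(b) {
        assert(a && b && a != b);
    }

    // Call once per frame, after submitting the frame's draw calls.
    void endFrame() {
        current_->issue();  // marks the end of this frame's GPU commands
        if (previousIssued_) {
            // Block until the GPU has finished the previous frame, so it can
            // be at most one frame behind the CPU.
            while (!previous_->isDone()) {
                // busy-wait; a real engine might yield or sleep here
            }
        }
        std::swap(current_, previous_);  // swap over for the next frame
        previousIssued_ = true;
    }

private:
    Query* current_;
    Query* previous_;
    bool previousIssued_ = false;
};
```

Waiting on the previous frame's query rather than the current one is what keeps the one frame of overlap between CPU and GPU.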

Edited by Adam_42

Hi.

Thanks, maybe I can find a way to figure out what exactly is happening at those 'peak frames' with v-sync off.

I haven't yet found a way to do a full run in PIX including both CPU/GPU times and all data per frame; that's probably not possible because of the amount of data. Maybe I can set a trigger action that acts like 'save frame data' when the frame time is > x.
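
Independent of PIX, such a trigger can also be done in the engine itself. A minimal sketch, assuming the engine already measures each frame's time in milliseconds; `SpikeLogger` and `SpikeRecord` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One recorded slow frame.
struct SpikeRecord {
    std::size_t frameIndex;
    double frameMs;
};

// Remembers only the frames whose time exceeds a threshold, so the amount
// of stored data stays small even over tens of thousands of frames.
class SpikeLogger {
public:
    explicit SpikeLogger(double thresholdMs) : thresholdMs_(thresholdMs) {
        assert(thresholdMs > 0.0);
    }

    // Call once per frame. Returns true when the frame was recorded as a
    // spike; the caller could use that to dump extra per-frame data.
    bool onFrame(double frameMs) {
        const std::size_t index = frameCount_++;
        if (frameMs <= thresholdMs_) return false;
        spikes_.push_back({index, frameMs});
        return true;
    }

    const std::vector<SpikeRecord>& spikes() const { return spikes_; }

private:
    double thresholdMs_;
    std::size_t frameCount_ = 0;
    std::vector<SpikeRecord> spikes_;
};
```

The return value is the hook for the 'save frame data' action: when it is true, the engine can write out whatever state it wants to inspect for that frame.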

 

Regarding the CPU times with V-sync on, you're absolutely right. What I did to calculate them is total frame time - GPU time, so in that case CPU times include 'waiting time'.
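
To make the difference concrete, here is a small sketch of the two estimates. It assumes the time spent blocked inside Present() is measured separately; the struct, the function names, and the refinement itself are illustrative assumptions rather than something a profiler reports directly:

```cpp
#include <cassert>

// Per-frame measurements in milliseconds (hypothetical layout).
struct FrameSample {
    double totalMs;        // wall-clock time for the whole frame
    double gpuMs;          // GPU time for the frame
    double presentWaitMs;  // time the CPU spent blocked inside Present()
};

// The calculation described above: everything that is not GPU time.
// With vsync on, this also counts time the CPU spent idle.
inline double naiveCpuMs(const FrameSample& f) {
    return f.totalMs - f.gpuMs;
}

// Assumed refinement: count only the time the CPU was not blocked in Present().
inline double busyCpuMs(const FrameSample& f) {
    assert(f.presentWaitMs <= f.totalMs);
    return f.totalMs - f.presentWaitMs;
}
```

With made-up numbers such as a 16ms frame, 8ms of GPU time, and 12ms blocked in Present(), the first estimate reports 8ms of CPU time while the second reports 4ms of actual CPU work.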

I'm actually not locking vertex/index buffers during rendering (only when loading); maybe some D3D calls I use lock the buffer or texture under the hood, like SetStreamSource or SetTexture. I'll measure a full frame with all D3D calls and see if there are any lock calls.
