Anand Baumunk

Swapchain performance

10 posts in this topic

Hey,
After spending a lot of time examining performance issues with my DirectX 11 application, I came to a conclusion (I could have reached it earlier, but anyway). After I reduced the app to just the "outer" DirectX calls, I found what is slowing everything down so much:
[CODE]SwapChain->Present(0, 0);[/CODE]
When I call it without drawing anything, I already drop to 600-700 FPS. Then, when using
[CODE]ClearRenderTargetView(renderTargetView, bgColor); [/CODE]
and
[CODE]ClearDepthStencilView(depthStencilView, D3D11_CLEAR_DEPTH|D3D11_CLEAR_STENCIL, 1.0f, 0);[/CODE]
the FPS are already down to 300-350.

Now I would like to know: is this right? I got the code for the swap chain and the device creation from a tutorial, so I can't really say what is wrong with it. Those 3 lines have to be called to render properly anyway, and my GPU is pretty good (it can run BF3 and similar games without problems).
So I'm suspicious about this swap chain code. Before I post all of it, can you confirm that 300 FPS for these 3 lines is way too little?
(Without those 3 lines I get 250,000 FPS.)

Thank you Edited by gnomgrol

This kind of behaviour is quite normal; it's better to measure in milliseconds per frame rather than frames per second, at which point you'll see that - even though it looks as if your clears wipe out half of your performance - they're actually taking quite a small amount of additional time. (Note that if you're overdrawing the entire color buffer per frame anyway then you should be able to get away without clearing it, which will shave back a bit of perf.)
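
To make the milliseconds-versus-FPS point concrete, here is a minimal sketch in portable C++. The helper names `frame_ms` and `fps_after_added_cost` are made up for illustration and are not part of any API. It shows that the same fixed cost per frame looks like a huge FPS drop at high frame rates and a small one at low frame rates.

```cpp
// Hypothetical helpers for illustration only.

// Time spent on one frame, in milliseconds, at a given frame rate.
constexpr double frame_ms(double fps) { return 1000.0 / fps; }

// Frame rate that results from adding a fixed cost (in ms) to every frame.
constexpr double fps_after_added_cost(double fps, double added_ms) {
    return 1000.0 / (frame_ms(fps) + added_ms);
}
```

For example, `fps_after_added_cost(1000.0, 1.0)` gives 500 FPS, while `fps_after_added_cost(60.0, 1.0)` gives roughly 56.6 FPS: the extra work is identical, only the FPS figure makes it look different.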

However, for a GPU that can run BF3 your overall performance seems shockingly low and points at something else being a cause of trouble. You should be getting framerates in the thousands here (for reference, I just tested similar on a low/mid AMD mobile GPU and easily cleared 2000fps at 1024x768 windowed). Have a look over your timer code to make sure there's nothing odd there, and also make double-sure that you're not calling Sleep anywhere. And there may be something evil in your game loop; e.g. GetMessage instead of PeekMessage, or bad PeekMessage handling.
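
As a point of comparison for the timer check, here is a minimal sketch of a frame timer built on `std::chrono::steady_clock`. The class name `FrameTimer` and its methods are hypothetical, and this is only one possible way to measure frame time, not the code from this thread. It accumulates the time between consecutive calls and reports the average in milliseconds.

```cpp
#include <chrono>

// Hypothetical helper: accumulates frame times and reports the average in ms.
class FrameTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Call once per frame, e.g. right after Present() returns.
    void tick() {
        const Clock::time_point now = Clock::now();
        if (started_) {
            // Time elapsed since the previous tick, as fractional milliseconds.
            total_ms_ += std::chrono::duration<double, std::milli>(now - last_).count();
            ++frames_;
        }
        last_ = now;
        started_ = true;
    }

    // Average milliseconds per frame over all measured frames (0 if none yet).
    double average_ms() const {
        return frames_ == 0 ? 0.0 : total_ms_ / static_cast<double>(frames_);
    }

    // Number of complete frame intervals measured so far.
    unsigned long long frames() const { return frames_; }

private:
    Clock::time_point last_{};
    double total_ms_ = 0.0;
    unsigned long long frames_ = 0;
    bool started_ = false;
};
```

Averaging over many frames, rather than printing a value every frame, keeps the measurement itself from distorting the result.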

Also worth enabling the debug runtimes and checking whether they report anything troublesome, as well as capturing a frame's calls in PIX and looking at what's happening from that point of view (be aware that a PIX call capture may make the first D3D call each frame seem inordinately longer than it really is - that's just an artefact of the sampling/capture mechanism and not an indicator of anything bad). Finally, make absolutely certain that you're not creating/destroying any D3D resources each frame - even if you don't otherwise use them, this is a very expensive operation, and creation/destruction should be moved out of the frame loop to startup and shutdown.
Edited by mhagain

Thanks for your answer.
I could get rid of the clearing, but as you pointed out, that is not the biggest problem here.
I assume that my timers are fine overall; when I call only them, the FPS is at 300,000. I'm not creating any resources; all that is happening are those 3 lines and the timers.
I render in a 1920x1080 window, since I want the game to run at this resolution later anyway. (At 1024x768 I get 500 FPS.)
When I draw 256*256*9 vertices in a trianglelist I am already down to 10-20 FPS.
Maybe my description of the back buffer, the depth-stencil view, etc. is bad, causing me to lose so much performance when presenting the scene to it?
If you know a good tutorial on initialising DirectX 11, I would appreciate it. Edited by gnomgrol

Here you go:

[CODE]

//Describe our Buffer
DXGI_MODE_DESC bufferDesc;
ZeroMemory(&bufferDesc, sizeof(DXGI_MODE_DESC));
bufferDesc.Width = 1024;
bufferDesc.Height = 768;
bufferDesc.RefreshRate.Numerator = 60;
bufferDesc.RefreshRate.Denominator = 1;
bufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
bufferDesc.ScanlineOrdering = DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
bufferDesc.Scaling = DXGI_MODE_SCALING_UNSPECIFIED;

//Describe our SwapChain
DXGI_SWAP_CHAIN_DESC swapChainDesc;

ZeroMemory(&swapChainDesc, sizeof(DXGI_SWAP_CHAIN_DESC));
swapChainDesc.BufferDesc = bufferDesc;
swapChainDesc.SampleDesc.Count = 1; //antialiasing
swapChainDesc.SampleDesc.Quality = 0;
swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapChainDesc.BufferCount = 1;
swapChainDesc.OutputWindow = hWnd;
swapChainDesc.Windowed = TRUE;
swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;
//Create our SwapChain D3D11_CREATE_DEVICE_DEBUG
hr = D3D11CreateDeviceAndSwapChain(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, NULL, NULL, NULL,D3D11_SDK_VERSION, &swapChainDesc, &SwapChain, &d3d11Device, NULL, &d3d11DevCon);

//Create our BackBuffer
ID3D11Texture2D* BackBuffer;
hr = SwapChain->GetBuffer( 0, __uuidof( ID3D11Texture2D ), (void**)&BackBuffer );

//Create our Render Target
hr = d3d11Device->CreateRenderTargetView( BackBuffer, NULL, &renderTargetView );
BackBuffer->Release();

//And:

D3D11_TEXTURE2D_DESC depthStencilDesc;
depthStencilDesc.Width = 1024;
depthStencilDesc.Height = 768;
depthStencilDesc.MipLevels = 1;
depthStencilDesc.ArraySize = 1;
depthStencilDesc.Format = DXGI_FORMAT_D24_UNORM_S8_UINT;
depthStencilDesc.SampleDesc.Count = 1;
depthStencilDesc.SampleDesc.Quality = 0;
depthStencilDesc.Usage = D3D11_USAGE_DEFAULT;
depthStencilDesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;
depthStencilDesc.CPUAccessFlags = 0;
depthStencilDesc.MiscFlags = 0;
//Create the Depth/Stencil View
d3d11Device->CreateTexture2D(&depthStencilDesc, NULL, &depthStencilBuffer);
d3d11Device->CreateDepthStencilView(depthStencilBuffer, NULL, &depthStencilView);

//Set our Render Target
d3d11DevCon->OMSetRenderTargets( 1, &renderTargetView, depthStencilView );


//Create the Viewport
D3D11_VIEWPORT viewport;
ZeroMemory(&viewport, sizeof(D3D11_VIEWPORT));
viewport.TopLeftX = 0;
viewport.TopLeftY = 0;
viewport.Width = 1024;
viewport.Height = 768;
viewport.MinDepth = 0.0f;
viewport.MaxDepth = 1.0f;
//Set the Viewport
d3d11DevCon->RSSetViewports(1, &viewport);

[/CODE]

Is your bufferDesc coming from an enumerated mode or did you just put the values in yourself? Have a look at http://msdn.microsoft.com/en-us/library/windows/desktop/bb205075%28v=vs.85%29.aspx - with particular attention to the section with the "Full-Screen Performance Tip" header.

I got the code from a tutorial; all of it was like what you can see here (of course I changed the window size). I'll look into this, but I have to say I often find these MSDN pages hard to understand. Edited by gnomgrol

[quote name='mhagain' timestamp='1344980138' post='4969603']
it's better to measure in milliseconds per frame rather than frames per second
[/quote]Quote for emphasis.
Converting your figures:
Presenting without drawing = 600fps == 1.67ms
Presenting with clearing = 350fps == 2.86ms
Difference == 1.19ms

A 1920x1080 window occupies around 8MiB of RAM -- taking about 1ms to write 8MiB of data is pretty good, that's about 8GiB/s bandwidth.
This does not look like a performance problem.
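
The buffer-size and bandwidth estimate above can be reproduced with a few lines of portable C++. This is a rough sketch under stated assumptions: 4 bytes per pixel (as in the DXGI_FORMAT_R8G8B8A8_UNORM back buffer posted earlier) and a clear time of about 1 ms. The function names are hypothetical.

```cpp
#include <cstdint>

// Bytes occupied by one buffer of the given size.
// Assumption: 4 bytes per pixel, e.g. an R8G8B8A8 colour buffer.
constexpr std::uint64_t buffer_bytes(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t bytes_per_pixel = 4) {
    return width * height * bytes_per_pixel;
}

// Convert a byte count to MiB.
constexpr double to_mib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Implied write bandwidth in GiB/s when `bytes` are written in `ms` milliseconds.
constexpr double bandwidth_gib_per_s(std::uint64_t bytes, double ms) {
    return (static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0)) / (ms / 1000.0);
}
```

With these helpers, `to_mib(buffer_bytes(1920, 1080))` comes out just under 8 MiB, and `bandwidth_gib_per_s(buffer_bytes(1920, 1080), 1.0)` is a little under 8 GiB/s, which matches the rounded figures in the post.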

Don't measure performance with FPS.[quote name='gnomgrol' timestamp='1345014839' post='4969759']
When I draw 256*256*9 vertices in a trianglelist I am already down to 10-20 FPS.
[/quote]There's not enough information to draw any conclusions from that statement. For all I know, that's two-hundred-thousand triangles which all overlap each other and cover the entire screen ([i]which would generate 1.5 terabytes of pixel output[/i]). If this triangle-list is causing performance problems, it's a separate issue from the above cost of clearing. You'll have to perform a series of experiments to see what the slow-down is ([i]e.g. slow vertex processing, slow pixel shader, etc[/i])...
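
The worst-case figure can be checked with a short calculation. The sketch below assumes the 1920x1080 window mentioned earlier in the thread and 4 bytes written per pixel; the function names are hypothetical and only illustrate the arithmetic.

```cpp
#include <cstdint>

// Number of triangles in a triangle list (3 vertices per triangle).
constexpr std::uint64_t triangle_count(std::uint64_t vertices) {
    return vertices / 3;
}

// Worst case: every triangle covers every pixel of the render target.
// Assumption: 4 bytes are written per covered pixel.
constexpr std::uint64_t worst_case_output_bytes(std::uint64_t triangles,
                                                std::uint64_t width,
                                                std::uint64_t height,
                                                std::uint64_t bytes_per_pixel = 4) {
    return triangles * width * height * bytes_per_pixel;
}

// Convert a byte count to TiB.
constexpr double to_tib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0 * 1024.0);
}
```

Here `triangle_count(256 * 256 * 9)` is 196,608 triangles, and the worst-case output at 1920x1080 is a little under 1.5 TiB per frame. Real scenes will be far below this bound, which is why measuring the actual overdraw matters.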

Thanks for your answer. But as mhagain mentioned, I should have a much better performance on my GPU when just clearing and presenting the window.
What I can tell you is this: I draw the trianglelist 9 times instanced, but instancing it does cost me nearly the same performance as calling draw 9 times. Could that be a hint of a slow pixel shader? When the whole screen is covered with the triangles, I lose the performance, when I move the cam so I just look at blank background, the performance rises a lot! Edited by gnomgrol

[quote name='gnomgrol' timestamp='1345106737' post='4970094']I should have a much better performance on my GPU when just clearing and presenting the window.[/quote]How do you know - what GPU is it? What's its theoretical memory bandwidth? How long in theory should it take to transfer 16MiB of data?
[quote]I draw the trianglelist 9 times instanced, but instancing it does cost me nearly the same performance as calling draw 9 times[/quote]Reducing draw-calls via instancing is an optimisation to reduce CPU-side overhead. You are almost certainly GPU-bound, so this is no surprise.
[quote]When the whole screen is covered with the triangles, I lose the performance, when I move the cam so I just look at blank background, the performance rises a lot![/quote]This is a definite hint that the bottleneck in your program is pixel-processing. Your pixel shader could be too complex ([i]too many instructions, too many texture fetches[/i]), your model could have too much over-draw ([i]triangles appearing over the top of each other, causing many triangles to calculate pixel values that are overwritten[/i]), or you could be generating too much ROP throughput ([i]e.g. blending lots of pixels into a frame-buffer, or using an expensive framebuffer format[/i]).

As I mentioned above, my GPU can run BF3 and other performance-intensive games without any problems, so that really shouldn't be the problem.
I reduced my CPU-side work because I was certain that the problem lay there. My pixel shader is fairly simple, and there are not THAT many pixels being processed multiple times.
I can't say that I have a clue what the third thing you mentioned is, but I will go ahead and research it.


I noticed something weird. When I draw 9x 256*256 instanced, without height, shadows, and normals passed to the shader, the performance sinks to 10 FPS / 100ms;
when I draw 36x 256*256 with a normal draw call for each (and normals etc. sent to the shader), I still get 50 FPS / 20ms. Changing the pixel shader to return float4(1.0f...); gives no performance increase.

I came up with another question. When the window that Direct3D is drawing to does not have focus, what influence does this have on performance? I noticed that in many games the FPS goes down to 10 if I focus another window.
I have a console for my application, so I can output things for debugging etc. Could this have an influence on the performance? Edited by gnomgrol