Aqua Costa

GPU bottlenecks and Sync Points


Hi,

 

After reading a few presentations from past GDC about DX performance I'm a little confused:

 

1 - (From GDC 2012 slide 44) How is it possible to be vertex shading limited? Aren't ALU units shared between shader stages (in D3D11 hardware anyway)? So there shouldn't be any hardware resources waiting for the vertex shader to finish...

 

2 - Regarding CPU-GPU sync points: currently my engine uses the same buffer to draw almost every object (so it's Map()/Unmap() with DISCARD on the same cbuffer hundreds or thousands of times per frame, every frame). Is this crazy? Most samples do it this way, but they're samples...
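For reference, the pattern I mean looks roughly like this (a minimal sketch; the PerObjectConstants struct and helper are placeholders, and the cbuffer is created with D3D11_USAGE_DYNAMIC / D3D11_CPU_ACCESS_WRITE):

// Per-draw update of one shared dynamic cbuffer.
PerObjectConstants data = BuildConstantsFor(object); // hypothetical helper

D3D11_MAPPED_SUBRESOURCE mapped;
context->Map(cbuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
memcpy(mapped.pData, &data, sizeof(data));
context->Unmap(cbuffer, 0);

context->VSSetConstantBuffers(0, 1, &cbuffer);
context->DrawIndexed(object.indexCount, 0, 0);
// ...repeated for every object, hundreds or thousands of times per frame.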

Anyway, I'll add an option in debug builds to detect sync points, as suggested in the presentation.

 

3 - "Buffer Rename operation (MAP_DISCARD) after deallocation" (slide 9 from 1st link above) - What are these rename operations?

 

Thanks.


1- If you have meshes with a significant number of vertices (each of which needs to be transformed, plus whatever other per-vertex work has to be done) and simple fragment shaders, then your bottleneck is going to be vertex processing... and that's just a simplification. Being vertex bound isn't ruled out by the ALUs being shared: ALU resources are not infinite, so it's theoretically possible to swamp the hardware with so much vertex work that it becomes an issue.

2. Since you can't draw using a buffer that is still mapped, mapping/unmapping one monolithic buffer to update sections of it doesn't seem ideal (if I'm understanding your description correctly). Using multiple buffers based on update frequency may allow you to overlap drawing with updating...
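Something like this, as a rough sketch (perFrameCB/perObjectCB and the data layouts are illustrative, not prescriptive):

// Split cbuffers by update frequency.
D3D11_MAPPED_SUBRESOURCE mapped;

// Mapped once per frame: view/projection, time, etc.
context->Map(perFrameCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
memcpy(mapped.pData, &frameConstants, sizeof(frameConstants));
context->Unmap(perFrameCB, 0);

// Mapped once per object: world matrix, material params, etc.
for (const Object& obj : objects)
{
    context->Map(perObjectCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
    memcpy(mapped.pData, &obj.constants, sizeof(obj.constants));
    context->Unmap(perObjectCB, 0);
    context->DrawIndexed(obj.indexCount, obj.startIndex, 0);
}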

3. Renaming is just something the driver does internally. If a resource gets 'renamed', your handle now points at fresh memory, and the memory it previously referred to is retired once the GPU is done with it.


1. The output of the vertex shader is going to feed into fixed-function rasterization hardware, which will then spawn pixel shaders. If the rate at which you are shading vertices is slower than the rate at which the hardware can assemble/rasterize primitives, then you're being bound by your vertex shader performance.

2 and 3. Doing this just puts the responsibility entirely on the driver to avoid sync points by using buffer renaming behind the scenes. Unfortunately the specifics of this are not fully documented for programmers, since it relies on details of the D3D driver layer as well as the IHV's actual driver implementation, but in general the renaming amounts to the driver allocating a chunk of memory from some internal ring buffer every time you call Map with DISCARD. The driver then has to track when the GPU actually consumes the data, so that it can release it. In general, calling Map lots of times for smaller allocations (like constant buffers) probably isn't going to be a big deal, or at least it hasn't been in my experience. The places where I've seen people run into issues have been larger buffers used for vertex data and things like that. If I recall correctly, Starcraft 2 did some crazy allocation scheme on their end to avoid putting too much pressure on the driver.
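Conceptually, the driver side looks something like this (entirely made-up pseudocode for the idea; not actual driver code, and all the names are invented):

#include <cstdint>

// Sketch of an internal ring-buffer allocator behind Map(DISCARD).
struct DriverRingAllocator
{
    uint8_t* memory;    // one big chunk of driver-owned memory
    uint64_t capacity;
    uint64_t head = 0;  // total bytes handed out so far
    uint64_t tail = 0;  // total bytes the GPU has finished reading

    // Called on every Map(DISCARD): hand back fresh memory, leaving the
    // old allocation orphaned until the GPU is done with it.
    void* Allocate(uint64_t size)
    {
        // If the ring is full, block until the GPU retires old data --
        // this is the stall you pay if you outrun the driver.
        while (head + size - tail > capacity)
            tail = WaitForNextGpuFence();
        void* ptr = memory + (head % capacity);
        head += size;
        // (A real allocator also handles alignment and allocations that
        // straddle the wrap-around point; omitted here.)
        return ptr;
    }

    // Stub: a real driver blocks on fences the GPU signals as it
    // consumes submitted work. Here we just pretend it caught up.
    uint64_t WaitForNextGpuFence() { return head; }
};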


Discard/rename/orphan are all different names for the same thing.

You've got a resource handle/ID/pointer in your application, that represents some bit of GPU memory.
Normally if you try to map a resource that's in use by the GPU, you'd have to stall until it's finished.
When you map-discard that resource while the GPU is still using that bit of GPU memory, then the driver internally orphans the old memory (marking it as discarded garbage to be cleaned up when the GPU is finished with it), allocates you some new memory, and internally makes your resource handle/ID/pointer now point to that new memory (renames it).

AFAIK, this is a very common strategy for management of cbuffers in D3D11 -- the drivers are smart enough to allocate all that memory and clean up your garbage.
From what I've read elsewhere on this board though, GL drivers don't work too well with the same strategy -- apparently it's better to allocate a large buffer yourself and manually manage sub-allocations within it, or to make as many resources as you'll need over two/three frames and cycle through them (manual double/triple-buffering).
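e.g. something like this in GL (a rough sketch; three buffers is just a typical frames-in-flight count, and cbSize/data/frameCounter are placeholders):

// Create N identical uniform buffers up front and cycle through them,
// instead of orphaning one buffer over and over.
const int kNumBuffers = 3;
GLuint ubos[kNumBuffers];
glGenBuffers(kNumBuffers, ubos);
for (int i = 0; i < kNumBuffers; ++i)
{
    glBindBuffer(GL_UNIFORM_BUFFER, ubos[i]);
    glBufferData(GL_UNIFORM_BUFFER, cbSize, nullptr, GL_DYNAMIC_DRAW);
}

// Each frame, write to the buffer the GPU is least likely to still be reading:
GLuint ubo = ubos[frameCounter % kNumBuffers];
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferSubData(GL_UNIFORM_BUFFER, 0, cbSize, &data);
glBindBufferBase(GL_UNIFORM_BUFFER, 0, ubo);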

 

FWIW, writing a discard/rename/orphan system is actually very complex, especially if you want it to just work in the general case... Many of the graphics APIs used by game consoles do not implement this themselves, so it's up to the game engine author to implement map-discard/renaming/orphaning... Some of the engines that I've seen require you to give more hints when you create a resource -- such as "I will map this at most twice within every two frames", or "I will map this n times per frame", etc...
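In pseudocode, that kind of hint might look like this (the API below is invented purely for illustration; it's not any real console SDK):

// Hypothetical creation-time hints so the engine can preallocate
// rename slots instead of allocating them dynamically.
BufferDesc desc;
desc.sizeInBytes     = sizeof(PerObjectConstants);
desc.mapsPerFrame    = 2;  // "I will map this at most twice per frame"
desc.framesOfLatency = 2;  // how long the GPU may hold on to old data
// The engine can then reserve mapsPerFrame * framesOfLatency backing
// allocations up front.
Buffer* cb = device->CreateBuffer(desc);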

AFAIK, Mantle and D3D12 are also dropping support for map-discard, so that it will be the responsibility of the application, rather than the driver.

 

I couldn't see anything about being VS-bound on slide 44... [edit] ah, they mention it's not common to be VS limited -- this can also mean that the number of vertices passing through the VS doesn't have much impact on your framerate, simply because vertex shading is such a small fraction of the cost compared to pixel shading [/edit]

But as MJP says, there's still parallel work -- fetching VS attributes, doing VS ALU work, and fixed-function rasterization/clipping/etc. might all be occurring in parallel -- if the ALU time is on the critical path, you'd say you were vertex shader bound (as opposed to fetch bound, raster bound, or ROP/OM/export bound).
Also, the PS3 still dedicates hardware specifically to VS or PS (the 360's shader cores are unified, but the PS3's aren't), and both are still major target platforms for new games ;-)

Edited by Hodgman


Quoting Hodgman: "AFAIK, this is a very common strategy for management of cbuffers in D3D11 -- the drivers are smart enough to allocate all that memory and clean up your garbage."

Doesn't stop the IHVs hating you for doing it too much/too often ;)

NV - a limited number of rename slots they can allocate from, but buffer size isn't a problem.
AMD - a buffer has an 8 MB rename buffer attached to it, so as soon as your updates to a buffer go over that amount, welcome to Slow City.

Both companies will, of course, write code paths for specific games to get around these problems if the game is big enough (AMD had to double the rename buffer for a game at the last place I worked, as they were doing too many discards on a buffer during a frame).

Constantly cycling on one buffer is basically bad voodoo; it'll work but you run the risk of the IHV Ninjas murdering you in your sleep ;)


That explains why changing from a single buffer for everything to one per object doesn't seem to change anything, nor does switching between updating with Map and UpdateSubresource; I never noticed a difference. Nor does changing the number of back buffers: set to 1 or 8, still no change. Is this bad or good? I should test with video cards that aren't nVidia to see what happens.

 

At first I thought using one buffer for everything would be a huge memory save ;D, and since switching to one per object didn't bring any improvement, I stuck with it, until I read those exact slides on update strategies. I did some quick tests switching between strategies and saw not a single difference, so I kept one buffer per object and the Map strategy... It does seem obvious that one buffer for everything would cause massive sync stalls, but they don't really happen.

 

It's like nothing you do app-side matters; the driver ignores you u_u.

Edited by Icebone1000
