Mr_Fox

ClearUnorderedAccessViewUint is slower than a handcrafted shader, and does it work for all formats?


Hey Guys,

 

I have looked through the MSDN page about this function and replaced my reset compute shader with it. It did its job and simplified my code by a few lines. However, clearing a 192^3 R8 volume takes 0.08ms with this function, versus only 0.05ms with my compute shader.
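To put the two timings on a common scale, they can be converted into an effective write rate. This is only a minimal sketch, assuming one byte per texel for the R8 format and 1 GB = 10^9 bytes; the helper names `volumeBytes` and `effectiveGBps` are hypothetical and not part of any API:

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of a cubic volume with `edge` texels per side.
// bytesPerTexel = 1 is an assumption for an R8 format.
constexpr std::uint64_t volumeBytes(std::uint64_t edge, std::uint64_t bytesPerTexel) {
    return edge * edge * edge * bytesPerTexel;
}

// Effective write rate in GB/s (1 GB = 1e9 bytes) for a clear
// that took `milliseconds` to touch `bytes` bytes.
inline double effectiveGBps(std::uint64_t bytes, double milliseconds) {
    assert(milliseconds > 0.0);
    return static_cast<double>(bytes) / (milliseconds * 1e-3) / 1e9;
}
```

With these assumptions, `volumeBytes(192, 1)` is 7,077,888 bytes, which works out to roughly 88 GB/s at 0.08ms and roughly 142 GB/s at 0.05ms.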

 

So here are my questions: what's the advantage of using this function over your own compute shader? Also, it seems this function can handle every possible format with at most four 32-bit int clear values, so what will happen if I use four 65535s as the clear value to clear an R8 INT volume?

 

Thanks 


The main benefit is (as you mentioned) simplification of the code.

This function wraps all the possible options (does the UAV reference a texture or a buffer?), all possible formats, and some possible dark corners (for example, feature level 10.1).

 

As for a clear value of 65535, I'm assuming the values will be clamped to 255, as clamping is the default GPU behaviour.
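Note that 65535 cannot confirm that assumption on its own: for an unsigned 8-bit channel, clamping 0xFFFF and keeping only its low 8 bits both give 255. A minimal sketch of the two candidate behaviours follows; the helper names are hypothetical, and neither is claimed to be what the API or driver actually does:

```cpp
#include <cstdint>

// Candidate A: saturate the 32-bit clear value to the 8-bit range.
constexpr std::uint8_t clampTo8(std::uint32_t v) {
    return static_cast<std::uint8_t>(v > 255u ? 255u : v);
}

// Candidate B: keep only the low 8 bits of the 32-bit clear value.
constexpr std::uint8_t truncateTo8(std::uint32_t v) {
    return static_cast<std::uint8_t>(v & 0xFFu);
}

// 65535 (0xFFFF) gives the same result either way...
static_assert(clampTo8(65535u) == truncateTo8(65535u), "both yield 255");
// ...whereas 256 (0x100) separates the two candidates.
static_assert(clampTo8(256u) != truncateTo8(256u), "255 vs 0");
```

Clearing with a value such as 256 and reading the volume back would therefore tell the two behaviours apart, which four 65535s cannot.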

Edited by Yourself


Thanks 

 

So I guess a handcrafted compute shader for copying resources around will also be faster than the CopyResource APIs. But it doesn't make sense to me why API calls would be slower than a handcrafted compute shader, since API calls should be crazily optimized.


There are a few spots where APIs have built-in features that don't need to be built in, as they're possible using other API features. One example is mip-map generation -- many APIs offer a single function call for this, but you can also do it yourself by rendering to each mip-map level. Usually this stuff should belong in a utility library instead of the core API.

 

As for this particular case, it's pretty much just a utility function. I guess it's in there in case one of the GPU vendors is able to clear memory in a way other than writing to it with CS... In my experience so far though, the generic way to clear a block of memory is to write to it using a CS, so this is probably what the driver is doing under the hood.

 

As for your timing difference -- 50µs vs 80µs -- it's pretty much the same. I don't really trust GPU timing measurements that are less than around a dozen microseconds.

If that difference remains when scaling up -- e.g. when clearing a larger block of memory, one method takes 5ms whereas the other takes 8ms -- then I would certainly believe there is a performance difference of 8/5=1.6x... but it's also possible that the difference is a fixed overhead of 20µs, so in the large-scale test the result would be 5ms vs 5.02ms (a 1.004x difference).
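The two scenarios can be written out as a small calculation. This is just a sketch of the arithmetic above, with hypothetical helper names; it models the timings and does not measure anything:

```cpp
#include <cassert>

// Slowdown factor if the slower path differs in throughput,
// so the gap grows with the amount of memory cleared.
inline double ratioProportional(double fastMs, double slowMs) {
    assert(fastMs > 0.0);
    return slowMs / fastMs;
}

// Slowdown factor if the slower path only pays a constant extra
// cost, so the gap stays the same size as the workload grows.
inline double ratioWithFixedOverhead(double fastMs, double overheadMs) {
    assert(fastMs > 0.0);
    return (fastMs + overheadMs) / fastMs;
}
```

`ratioProportional(5.0, 8.0)` gives 1.6, while `ratioWithFixedOverhead(5.0, 0.02)` gives 1.004. Only a test at a larger size can tell which model fits, because at the original small size a fixed overhead can look just like a throughput difference.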

 

Benchmarking this stuff is also hard because the actual commands that the GPU has to execute include the dispatch/compute-shader execution, but then also include cache flushing, cache invalidation, and pipeline stalling. The cost of these extra operations can depend heavily on what kind of dispatch/draw command follows your shader.

 

Back to mipmap generation -- in theory that should be part of the API so that each vendor can implement it in the fastest way possible for their hardware... but in my experience it's also just there as a helper/utility function, and it's possible to implement your own versions of it that are faster than the driver's.

Edited by Hodgman


It's possible that the driver is using a DMA unit to fill the buffer with the clear value. This is probably slower than using a compute shader, but it has the advantage of leaving the shader cores free to do other work in parallel.
