oks2024

Is DrawInstancedIndirect slower than DrawInstanced ?


Hi all!

 

 

I have a lot of instanced geometry that I draw in a single draw call, using DrawInstanced.

Now I would like to cull some of the instances on the GPU. Before implementing the culling, my first step was to move from DrawInstanced to DrawInstancedIndirect, using exactly the same parameters. But with the indirect function, the timing of the draw went from 0.6 ms to 1.1 ms.

 

At first I was using a compute shader to fill in the parameters of the indirect buffer. As my number of objects is currently always the same, I then tried updating the buffer only once instead of every frame, to make sure the overhead wasn't coming from the compute shader. Finally, I supplied the parameters as initial data when creating the buffer.

None of these attempts changed the timing in a significant way.
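For reference, here is a minimal sketch of the argument block that DrawInstancedIndirect reads: four 32-bit unsigned values in the same order as the DrawInstanced parameters. The struct and helper names (`DrawInstancedArgs`, `packDrawArgs`, `argsByteOffset`) are hypothetical and are not taken from the code discussed in this thread. The sketch is plain C++ so it compiles without the Windows SDK, and the actual D3D11 calls are shown only as comments.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portable mirror of the argument block consumed by
// ID3D11DeviceContext::DrawInstancedIndirect: four 32-bit unsigned values,
// in the same order as the DrawInstanced parameters.
struct DrawInstancedArgs {
    std::uint32_t vertexCountPerInstance;
    std::uint32_t instanceCount;  // the value a GPU culling pass would overwrite
    std::uint32_t startVertexLocation;
    std::uint32_t startInstanceLocation;
};

static_assert(sizeof(DrawInstancedArgs) == 16,
              "the argument block must be exactly four 32-bit words");
static_assert(offsetof(DrawInstancedArgs, instanceCount) == 4,
              "instanceCount must be the second word");

// Packs the arguments into the raw words that could be supplied as initial
// data for the argument buffer, or written by a compute shader through a UAV.
inline std::array<std::uint32_t, 4> packDrawArgs(const DrawInstancedArgs& args) {
    return {{args.vertexCountPerInstance, args.instanceCount,
             args.startVertexLocation, args.startInstanceLocation}};
}

// Byte offset of the n-th argument block when several are stored back to back.
// DrawInstancedIndirect requires this offset to be a multiple of 4 bytes.
inline std::uint32_t argsByteOffset(std::uint32_t drawIndex) {
    const std::uint32_t offset =
        drawIndex * static_cast<std::uint32_t>(sizeof(DrawInstancedArgs));
    assert(offset % 4 == 0);
    return offset;
}

// With a real device context (Windows only, not compiled here), the two
// equivalent draws would look like this. The buffer is assumed to have been
// created with D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS:
//   context->DrawInstanced(a.vertexCountPerInstance, a.instanceCount,
//                          a.startVertexLocation, a.startInstanceLocation);
//   context->DrawInstancedIndirect(argsBuffer, argsByteOffset(0));
```

When the packed values match the arguments of the direct call, both draws submit the same work, so any timing difference would have to come from how the driver or hardware fetches the arguments rather than from the amount of geometry.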

 

I don't know if it matters, but I'm using an AMD R9 290 with the latest drivers. I still need to test on an NVIDIA card to check whether the results are the same.

 

 

So I'm starting to think that, with the same arguments, DrawInstancedIndirect() is slower than DrawInstanced(). Is there any documentation on this? Has anyone already run into this?

Should I keep looking for something wrong in my code, or consider this "normal" behavior?

 

 

Thanks!

 


The performance would completely depend on the specifics of how it's implemented in the driver and hardware. There's nothing that I know about common Nvidia or AMD hardware that would make DrawInstancedIndirect significantly slower than DrawInstanced, and I haven't seen any documentation or information to indicate that it would be the case.

 

How exactly are you measuring the timing of your draw call?


I'm using the timer class (Timer.h and Timer.cpp) that can be found in the AMD samples.

It uses D3D11_QUERY and seems accurate so far, but maybe I should double check that.
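As a point of comparison, GPU timings taken with D3D11 queries are commonly derived from two D3D11_QUERY_TIMESTAMP values bracketed by a D3D11_QUERY_TIMESTAMP_DISJOINT query. The sketch below shows only the conversion step, and it is an assumption that the AMD timer class works this way. `TimestampDisjointData` is a hypothetical portable stand-in for D3D11_QUERY_DATA_TIMESTAMP_DISJOINT, and `elapsedMilliseconds` is an illustrative helper, not part of any SDK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable stand-in for D3D11_QUERY_DATA_TIMESTAMP_DISJOINT.
struct TimestampDisjointData {
    std::uint64_t frequency;  // timestamp ticks per second
    bool disjoint;            // true if the timestamps in this interval are unreliable
};

// Converts two timestamp query results into milliseconds.
// Returns false, leaving *outMs untouched, when the sample must be discarded:
// the interval was disjoint (for example after a clock change), the reported
// frequency is zero, or the timestamps are out of order.
inline bool elapsedMilliseconds(std::uint64_t beginTicks,
                                std::uint64_t endTicks,
                                const TimestampDisjointData& dj,
                                double* outMs) {
    assert(outMs != nullptr);
    if (dj.disjoint || dj.frequency == 0 || endTicks < beginTicks) {
        return false;
    }
    *outMs = 1000.0 * static_cast<double>(endTicks - beginTicks) /
             static_cast<double>(dj.frequency);
    return true;
}
```

Discarding disjoint samples and averaging the remaining ones over many frames helps rule out measurement noise when comparing two timings as close as 0.6 ms and 1.1 ms.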


I just tried on my laptop, on a 630m. It's obviously slower (around 7.5 ms), but the timing is exactly the same whether I use the Indirect function or not.

I will see if I can try it on other graphics cards, but this seems to indicate that the issue does not come from my code, which is not really good news.

 

Maybe I should try older drivers to see if it's a new driver issue.
