
Is this an optimization or an anti-optimization?


1 reply to this topic

#1 JohnnyCode   Members   -  Reputation: 96


Posted 22 December 2013 - 12:36 PM

I have a technical dilemma that needs deeper knowledge of how GPUs function to decide whether it will do good or do worse!

I have two framebuffer objects, each with its own distinct renderbuffer object and its own distinct texture attached.

I perform this operation in a render cycle:

 

- bind framebuffer A
- drawElements
- bind framebuffer B
- drawElements

 

If framebuffers A and B had the very same renderbuffer object attached, would it cause stalling, since the subsequent drawElements would want to write into the same depth/stencil attachment that belongs to the shared renderbuffer? I have heard that GPUs are slow at reading and writing memory, yet insanely fast at performing operations. So when a GPU SIMD needs data, it waits, leaving the core's computational power to SIMDs that are not waiting for memory, and gets the core back when it receives the data.

I have tried this scenario, but the FPS does not seem to differ between two distinct depth/stencil buffers and a shared depth/stencil buffer. But if I do this in the case of shared depth/stencil:

- bind framebuffer A
- drawElements
- bind framebuffer B
- disable depth write
- set z function to equal
- drawElements
- enable depth write
- set z function to lessequal

 

FPS drops by a factor of 2.
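To make the difference between the two sequences concrete, here is a minimal sketch that records each one as a list of calls and counts the state changes issued per cycle. The `Recorder` class and its method names are hypothetical stand-ins, not a real binding; the comments name the OpenGL entry points they are meant to correspond to. It only counts calls and says nothing about what any particular driver or GPU does with them.

```python
# Hypothetical call recorder: it logs calls instead of talking to a GPU.
class Recorder:
    def __init__(self):
        self.calls = []

    def bind_framebuffer(self, name):   # stands in for glBindFramebuffer
        self.calls.append(("state", "bind_framebuffer", name))

    def depth_mask(self, enabled):      # stands in for glDepthMask
        self.calls.append(("state", "depth_mask", enabled))

    def depth_func(self, func):         # stands in for glDepthFunc
        self.calls.append(("state", "depth_func", func))

    def draw_elements(self):            # stands in for glDrawElements
        self.calls.append(("draw", "draw_elements", None))

    def state_changes(self):
        # Count everything that is not a draw call.
        return sum(1 for kind, _, _ in self.calls if kind == "state")


def plain_cycle(gl):
    # First sequence from the post: two binds, two draws.
    gl.bind_framebuffer("A")
    gl.draw_elements()
    gl.bind_framebuffer("B")
    gl.draw_elements()


def equal_test_cycle(gl):
    # Second sequence: the pass into B tests against the shared depth buffer.
    gl.bind_framebuffer("A")
    gl.draw_elements()
    gl.bind_framebuffer("B")
    gl.depth_mask(False)
    gl.depth_func("EQUAL")
    gl.draw_elements()
    gl.depth_mask(True)
    gl.depth_func("LEQUAL")


plain, equal = Recorder(), Recorder()
plain_cycle(plain)
equal_test_cycle(equal)
print(plain.state_changes(), equal.state_changes())
```

Run as written, it reports two state changes per cycle for the first sequence and six for the second.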

 

So should I use distinct renderbuffer objects attached to those framebuffers, even if it means rendering depth twice? I am profiling on a GPU with slow DDR3-1333 RAM, but with 120 stream processors at 0.5 GHz.

 




#2 samoth   Crossbones+   -  Reputation: 4068


Posted 22 December 2013 - 02:20 PM

You kind of answered your question yourself: the FPS drops to one half, so it is an anti-optimization.

 

Generally, more state changes are worse than fewer, but changing the direction or function of the z test is almost always a very serious anti-optimization. Not only is it a state change that forces the driver to do a lot of re-evaluation and re-optimization, it also defeats hierarchical z.
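A toy model can illustrate the hierarchical z point for the case of a direction change. The sketch below assumes a coarse depth buffer that keeps only the farthest depth per tile, which is a simplification and not a description of any specific GPU; `Tile` and `can_reject` are hypothetical names. Under that assumption, a less-than test can discard a whole primitive for a tile without reading per-pixel depth, while a test in the opposite direction can never be decided at tile level. Real hardware differs in what it stores and in when it disables the coarse test.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    max_depth: float  # farthest depth stored in the tile (the only bound kept)


def can_reject(tile, func, prim_min, prim_max):
    """Return True if the primitive can be culled for this tile without
    reading per-pixel depth. prim_min and prim_max bound its depth range."""
    if func == "LESS":
        # Every incoming fragment is at or behind everything already stored.
        return prim_min >= tile.max_depth
    if func == "LEQUAL":
        # Every incoming fragment is strictly behind everything stored.
        return prim_min > tile.max_depth
    if func == "GREATER":
        # Rejecting would need prim_max <= nearest stored depth,
        # and this model does not keep the nearest depth.
        return False
    raise ValueError(f"unsupported depth function: {func}")


tile = Tile(max_depth=0.4)
print(can_reject(tile, "LESS", 0.6, 0.7))     # primitive entirely behind the tile
print(can_reject(tile, "GREATER", 0.1, 0.2))  # opposite direction, same tile
```

In this model the first call can skip the tile and the second cannot, so flipping the test direction mid-frame turns every tile into per-pixel work.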





