MarkS

Calculating matrices and then sending them or sending them and letting the GPU do the calculating?


I was looking at some GL 3.3 code and noticed that the person who wrote it sent a single MVP matrix to the shader instead of separate model-view and projection matrices. I found this kind of odd. The GPU is better suited to doing this kind of calculation (matrix multiplication, in this case) quickly. The only explanation I can think of is that sending 32 floats instead of 16 costs more time than doing the matrix multiplication on the CPU and sending the result. Is this the case?

The question is not whether it is faster to calculate the final MVP matrix on the CPU or the GPU, but whether it is faster to calculate it once per model on the CPU or once per vertex on the GPU. In theory the shader compiler could precompute the product once just before the program is executed, but there is no guarantee that it does.
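To make the per-model versus per-vertex difference concrete, here is a minimal C++ sketch that simulates both strategies on the CPU. It is not GL code: `Mat4`, `multiply`, `transformPrecombined`, and `transformPerVertex` are hypothetical names for illustration, the "projection" is just a uniform scale, and the per-vertex version assumes the shader evaluates `projection * view * model * position` left to right with no hoisting by the compiler.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>;  // column-major, as OpenGL expects by default
using Vec4 = std::array<float, 4>;

// Counts matrix-matrix products so the two strategies can be compared.
static std::size_t g_matMatProducts = 0;

Mat4 multiply(const Mat4& a, const Mat4& b) {
    ++g_matMatProducts;
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    return r;
}

Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int k = 0; k < 4; ++k)
            r[row] += m[k * 4 + row] * v[k];
    return r;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    m[12] = x; m[13] = y; m[14] = z;
    return m;
}

Mat4 uniformScale(float s) {
    Mat4 m{};
    m[0] = m[5] = m[10] = s;
    m[15] = 1.0f;
    return m;
}

// Strategy 1: combine once per model, then one matrix-vector product per vertex.
std::vector<Vec4> transformPrecombined(const Mat4& proj, const Mat4& view,
                                       const Mat4& model, const std::vector<Vec4>& verts) {
    const Mat4 mvp = multiply(multiply(proj, view), model);
    std::vector<Vec4> out;
    out.reserve(verts.size());
    for (const Vec4& v : verts) out.push_back(transform(mvp, v));
    return out;
}

// Strategy 2: rebuild the combined matrix for every vertex (assumed shader behaviour).
std::vector<Vec4> transformPerVertex(const Mat4& proj, const Mat4& view,
                                     const Mat4& model, const std::vector<Vec4>& verts) {
    std::vector<Vec4> out;
    out.reserve(verts.size());
    for (const Vec4& v : verts)
        out.push_back(transform(multiply(multiply(proj, view), model), v));
    return out;
}

// Usage: both strategies give the same positions, at very different cost.
void checkStrategiesAgree() {
    const Mat4 proj = uniformScale(2.0f);  // stand-in, not a real projection
    const Mat4 view = translation(0.0f, 0.0f, -5.0f);
    const Mat4 model = translation(1.0f, 2.0f, 3.0f);
    const std::vector<Vec4> verts(100, Vec4{1.0f, 1.0f, 1.0f, 1.0f});

    g_matMatProducts = 0;
    const auto once = transformPrecombined(proj, view, model, verts);
    const std::size_t productsOnce = g_matMatProducts;

    g_matMatProducts = 0;
    const auto perVertex = transformPerVertex(proj, view, model, verts);
    const std::size_t productsPerVertex = g_matMatProducts;

    assert(once == perVertex);
    assert(productsOnce == 2 && productsPerVertex == 200);
}
```

For 100 vertices the precombined path performs 2 matrix-matrix products in total, while the per-vertex path performs 200. A real driver may do better than this worst case, which is why the sketch should be read as an upper bound rather than a measurement.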

 

 

Once per VERTEX? OK then. Question answered...

I was looking at some GL 3.3 code and noticed that the person who wrote it sent a single MVP matrix to the shader instead of separate model-view and projection matrices. I found this kind of odd.

It is actually quite standard/common.
First, sending data to shaders as uniforms has overhead. Not only should the amount of data sent be kept to a minimum, but updates should be reserved for uniforms that have actually changed.

Setting aside the dirty flag for each matrix, since they will likely be dirty just as often either way, updating 3 uniforms instead of 1 is already likely to be slower than doing the matrix math on the CPU.
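The "only update what changed" idea can be sketched as a small cache with a dirty flag. This is a hypothetical helper, not part of OpenGL: `CachedMat4Uniform` and its `Upload` callback are names made up for illustration, and the callback stands in for whatever call the renderer really uses (in GL, typically `glUniformMatrix4fv`).

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <utility>

using Mat4 = std::array<float, 16>;

// Hypothetical wrapper around a single mat4 uniform.
class CachedMat4Uniform {
public:
    // Stand-in for the real upload, e.g. a lambda that calls
    // glUniformMatrix4fv(location, 1, GL_FALSE, m.data()).
    using Upload = std::function<void(const Mat4&)>;

    explicit CachedMat4Uniform(Upload upload) : upload_(std::move(upload)) {
        assert(upload_ && "an upload callback is required");
    }

    // Record a new value; mark it dirty only if it actually differs.
    void set(const Mat4& value) {
        if (!hasValue_ || value != value_) {
            value_ = value;
            hasValue_ = true;
            dirty_ = true;
        }
    }

    // Call before drawing. Uploads at most once per change and
    // returns true if an upload happened.
    bool flush() {
        if (!dirty_) return false;
        upload_(value_);
        dirty_ = false;
        return true;
    }

private:
    Upload upload_;
    Mat4 value_{};
    bool hasValue_ = false;
    bool dirty_ = false;
};
```

With one combined MVP uniform there is one such cache entry to compare and flush per model; with separate matrices there are two or three, each with its own upload when it changes.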

 

So the CPU side is already either winning or fairly close. Then, with 3 uniforms instead of 1, the GPU falls further behind for every single vertex, since each vertex requires 3 matrix multiplies instead of 1.
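As a rough back-of-the-envelope check of the "3 multiplies instead of 1" point, the following C++ sketch counts scalar multiplications only. It assumes three separate matrices (model, view, projection) applied to each vertex as three matrix-vector products, and it ignores upload cost, additions, and any hardware parallelism, so the numbers are illustrative rather than a benchmark. The function names are made up for this example.

```cpp
#include <cstddef>

constexpr std::size_t kMatVecMuls = 16;  // 4x4 matrix times a vec4
constexpr std::size_t kMatMatMuls = 64;  // 4x4 matrix times a 4x4 matrix

// Three uniforms: projection * (view * (model * position)) for every vertex.
constexpr std::size_t separateCost(std::size_t vertexCount) {
    return 3 * kMatVecMuls * vertexCount;
}

// One uniform: two matrix-matrix products once per model on the CPU,
// then a single matrix-vector product per vertex.
constexpr std::size_t combinedCost(std::size_t vertexCount) {
    return 2 * kMatMatMuls + kMatVecMuls * vertexCount;
}

// The two approaches break even at 4 vertices; beyond that the
// precombined matrix needs fewer multiplications.
static_assert(separateCost(4) == combinedCost(4), "break-even at 4 vertices");
static_assert(combinedCost(1000) < separateCost(1000), "combined wins for real meshes");
```

Under these assumptions a 1000-vertex model costs 48000 multiplications with separate matrices and 16128 with a precombined one.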

It is true that you sometimes need to upload the world and view matrices separately anyway. In that case the upload cost is the same either way, but the GPU would still fall further behind if it also had to combine any of those matrices itself for each vertex.

 

 

In general, there are almost no situations (perhaps none) in which it is a winning move to combine matrices on the GPU for each vertex rather than once on the CPU side. If a single matrix multiply and upload on the CPU side can replace 2 matrix uploads and also leave you with a more efficient vertex shader, it is always the way to go.

 

 

L. Spiro


Thank you both. I was thinking this was being done per model, not per vertex.

 

Brain fart. Pardon the stink!

Edited by MarkS
