Posted 13 October 2012 - 12:48 PM
Posted 13 October 2012 - 03:07 PM
Posted 13 October 2012 - 05:57 PM
It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.
Posted 13 October 2012 - 11:28 PM
Edited by beans222, 13 October 2012 - 11:32 PM.
New C/C++ Build Tool 'Stir' (doesn't just generate Makefiles, it does the build): https://github.com/space222/stir
Posted 14 October 2012 - 07:08 AM
Posted 14 October 2012 - 08:34 AM
1. 4000 x (glBindVertexArray(0) + glUseProgram(0)) = unneeded
remove them from the end of the function
3. why are you creating 6 doubles at the start of the function?
they're kinda slow compared to regular floats, and unless you can justify their use, replace them with floats for the entire program
also, remove them from the function, and use the position values directly
either in the form of vectors, or matrices
here's the thing: the CPU and GPU both like to use their caches
I've been told that the cache is 800 times faster than reading from RAM; in that respect, temp variables (the doubles) are fast to use once they've been created
you are, however, creating 2 matrices for 4000 objects, and that's SLOW
and... on top of that, you are rendering ___4000___ objects, without justification!
the GPU likes to render lots of stuff in _one_ call, so see if there's any way you can remodel a little and make rendering more composite
i.e. gather some objects near each other and make them into 1 draw call, so that you perhaps go from 4000 to 1000, or even better, 400 calls
there's lots of options here!
somewhat regretfully, this is the state of the modern-day rasterizer
it's powerful in parallel, but each core is a little slow, and it's very, very bandwidth limited (talk less, render more!)
by talking I mean sending commands to the GPU or uploading data (if you upload lots of data in one go, that's usually very fast)
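A minimal sketch of the "gather nearby objects into 1 draw call" idea, assuming the objects are static and share the same shader and vertex layout. `Vec3`, `Mesh`, `Batch` and `mergeMeshes` are hypothetical names made up for illustration, not anything from the original code. It only does the CPU-side merge; the result would then be uploaded once and drawn with a single indexed draw call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };

struct Mesh {
    std::vector<Vec3> vertices;          // object-space positions
    std::vector<std::uint32_t> indices;  // triangle list into `vertices`
    Vec3 position;                       // world-space translation of the object
};

struct Batch {
    std::vector<Vec3> vertices;          // world-space positions of all merged meshes
    std::vector<std::uint32_t> indices;  // indices rebased into the merged vertex array
};

// Merge several static meshes into one vertex/index pair so the whole
// group can be uploaded once and rendered with one draw call.
Batch mergeMeshes(const std::vector<Mesh>& meshes) {
    Batch batch;
    for (const Mesh& m : meshes) {
        // Indices of this mesh must be shifted by the number of vertices
        // already in the batch.
        const auto base = static_cast<std::uint32_t>(batch.vertices.size());
        for (const Vec3& v : m.vertices) {
            // Bake the translation into the vertex, so no per-object
            // matrix upload is needed at draw time.
            batch.vertices.push_back({v.x + m.position.x,
                                      v.y + m.position.y,
                                      v.z + m.position.z});
        }
        for (std::uint32_t i : m.indices) {
            assert(i < m.vertices.size());  // index must refer to this mesh
            batch.indices.push_back(base + i);
        }
    }
    return batch;
}
```

This trades flexibility for fewer calls: merged objects can no longer move independently without rebuilding the batch, so it fits scenery better than dynamic objects.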
Posted 14 October 2012 - 08:40 AM
Everything said is correct. Binding the shader program and the VAO should be done outside of the loop. Binding the VBO is meaningless, since the vertex buffer bindings are part of the VAO (as beans222 said).
BUT a performance boost is not assured after all these changes, since drivers optimize most of these operations. I can firmly claim that NV drivers will not update uniforms if the values have not changed since the previous call. So dirty flags and optimized updates are nice but do not necessarily lead to higher performance. On the other hand, the number of driver calls can certainly be a bottleneck.
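For anyone unfamiliar with the dirty-flag approach, here is a rough sketch of application-side filtering of redundant uniform updates. `UniformCache` and its `upload` callback are hypothetical; in a real renderer the callback would wrap something like `glUniform1f(location, value)`. As said above, the driver may already do this filtering itself, so treat it as a way to cut down on the number of driver calls, not as a guaranteed speedup.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical helper: remembers the last value sent to each uniform
// location and skips the upload when the value has not changed.
class UniformCache {
public:
    explicit UniformCache(std::function<void(int, float)> upload)
        : upload_(std::move(upload)) {
        assert(upload_ && "upload callback must be callable");
    }

    // Returns true if the value was actually forwarded to `upload`.
    bool set(int location, float value) {
        auto it = last_.find(location);
        if (it != last_.end() && it->second == value) {
            return false;  // unchanged since the previous call: skip it
        }
        last_[location] = value;
        upload_(location, value);
        return true;
    }

private:
    std::function<void(int, float)> upload_;  // stands in for the real GL call
    std::unordered_map<int, float> last_;     // last value per uniform location
};
```

Note that a cache like this is only valid per shader program, since uniform values are stored in the program object.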
Posted 14 October 2012 - 09:53 AM
Edited by Kaptein, 14 October 2012 - 09:57 AM.