
Member Since 22 Jul 2010
Offline Last Active Oct 25 2016 01:29 PM

Posts I've Made

In Topic: Data throughput

13 September 2016 - 01:37 PM

Wow Matias, very thorough reply!

Yes, 60fps was indeed the target, and the fact that you think any of those things could be a problem makes me think more could be achieved.
Some limitations could be out of my control, but you gave me a great idea: test the theoretical limits by taking out all the variables I can.

So I will do 2 tests: one with all the data staying on the GPU as a baseline, and a second one uploading all the data each frame but doing no calculations on the CPU, i.e. not changing any of the data, just re-uploading it.

That should give me some good indicators.
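
To compare the two tests, a few small helpers could turn the measured frame times into numbers that are easy to reason about. This is only a sketch: the function names are made up for illustration, and the frame times and byte counts would come from whatever timer or profiler is actually used.

```cpp
#include <cstddef>

// Hypothetical helpers; none of these names come from the thread.

// Time available per frame, in milliseconds, at a target frame rate.
constexpr double frameBudgetMs(double targetFps) {
    return 1000.0 / targetFps;
}

// Upload rate implied by one measured frame, in bytes per second.
constexpr double uploadBytesPerSecond(std::size_t bytesUploaded, double frameMs) {
    return static_cast<double>(bytesUploaded) / (frameMs / 1000.0);
}

// Extra time the per-frame re-upload costs compared with the
// GPU-resident baseline test.
constexpr double uploadCostMs(double baselineFrameMs, double reuploadFrameMs) {
    return reuploadFrameMs - baselineFrameMs;
}
```

If the re-upload test still fits inside `frameBudgetMs(60.0)`, the upload itself is probably not what is limiting the frame rate.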

I will also use a GPU profiler to see what commands got issued, which should give me a better indication.

Thanks for the great reply!

In Topic: Data throughput

13 September 2016 - 09:10 AM

Thanks Silence!

Yes, there are many factors at play here, but I was just hoping for a "sounds about right" or "sounds a bit low" kind of reply.


That said.

api: DirectX

gpu: Nvidia Quadro K2200

cpu: Intel Xeon E5-2630 @2.5GHz

ram: 32GB


extra info:

assume not much else is going on as I am pretty much just trying to find limits of different aspects of the renderer


really only looking for rough, ballpark, anecdotal kind of answers though; just curious, and nothing major is resting on any answer either.

In Topic: render huge amount of objects

15 August 2016 - 07:54 AM

@hodgman thinking about this further, are you suggesting that you only add nodes into that array in reaction to something changing? In fact scratch that: you would also have to add any child nodes in that case, and it wouldn't work if a transform was changed multiple times.


There will always need to be a complete hierarchy pass then, I think; I can't see how to avoid it. In which case it still makes sense to just update all transforms.


Some odd cases to think about.

  • Leaf node modified followed by parent followed by its parent... all the way up to the root in that order. 
  • Root node modified (all children will need updating)
  • Leaf node modified followed by its parent's parent alternating all the way to the root.

There are 2 things at play with transforms the way I see it: the local update of a matrix when it is changed, then the re-combining of all the child matrices. The second part is where I am struggling to see the optimal solution.


Rebuild from scratch? Update and recombine using a 3rd snapshot matrix that represents the hierarchy above? Some other genius idea of justice?



If I get time I might make a 2d test bed to test this: a simple visual 2d tree that can be updated via mouse drags. I can then try various approaches and, rather than benchmark, compare how much work is done or saved.
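
One approach such a test bed could count work for is sketched below. It is an assumption for illustration, not something settled in the thread: nodes live in a flat array with every parent stored before its children, each node has a dirty flag, and a single linear pass recombines a node only if it is dirty or its parent was recombined in the same pass. All names are hypothetical, and a 2D offset stands in for a full matrix so that "recombine" is just an addition.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D test-bed types; an offset stands in for a matrix.
struct Node {
    int parent = -1;            // index of the parent, -1 for a root
    float localX = 0, localY = 0;
    float worldX = 0, worldY = 0;
    bool dirty = true;          // set whenever the local offset changes
};

struct Hierarchy {
    std::vector<Node> nodes;    // invariant: parent index < own index

    int add(int parent, float x, float y) {
        Node n;
        n.parent = parent;
        n.localX = x;
        n.localY = y;
        nodes.push_back(n);
        return static_cast<int>(nodes.size()) - 1;
    }

    void setLocal(int i, float x, float y) {
        Node& n = nodes[static_cast<std::size_t>(i)];
        n.localX = x;
        n.localY = y;
        n.dirty = true;         // changing it twice costs nothing extra
    }

    // One pass over the whole array; returns how many nodes were
    // actually recombined, which is the "work done" figure to compare.
    int update() {
        std::vector<bool> changed(nodes.size(), false);
        int recombined = 0;
        for (std::size_t i = 0; i < nodes.size(); ++i) {
            Node& n = nodes[i];
            const bool parentChanged =
                n.parent >= 0 && changed[static_cast<std::size_t>(n.parent)];
            if (!n.dirty && !parentChanged) continue;   // nothing to do
            float px = 0, py = 0;
            if (n.parent >= 0) {
                const Node& p = nodes[static_cast<std::size_t>(n.parent)];
                px = p.worldX;
                py = p.worldY;
            }
            n.worldX = px + n.localX;
            n.worldY = py + n.localY;
            n.dirty = false;
            changed[i] = true;
            ++recombined;
        }
        return recombined;
    }
};
```

Because parents come first in the array, the order in which nodes were modified does not matter, and a node changed several times in a frame is still recombined once. The pass still visits every node, so it does not avoid the complete hierarchy pass; it only skips the recombine work for untouched subtrees.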

In Topic: render huge amount of objects

15 August 2016 - 06:27 AM

aaaaa I clicked on something and lost my essay of a message, I should really install a form saver plugin!!!


@hodgman, interesting and tidy approach, but does it end up being more efficient than a normal tree traversal? I guess it depends how much changes from frame to frame; if nothing does, then a full tree traversal for transform updating only is pointless. But sorting arrays sounds slow also.


I was aiming for a solution that only touches the minimal set of nodes to respond to a change but also scales well from zero changes to changes in every object in the scene. Me wants cake and eating it!


@poigwym, flags like that should work well I think.

In Topic: render huge amount of objects

15 August 2016 - 04:35 AM

I do it slightly differently than some.

Each 3d object has a transform. This has getters/setters for scale/position/rotation.

There is a 32bit dirty flag that is updated through the setters. So at any given time you can know if the scale, rotation or position has been changed, and in detail too, i.e. which component.

When a matrix is required, the transform is requested to build it; if the dirty flag is non-zero the matrix needs rebuilding. Depending on which flags are set it will do it differently. Scales and translations are very fast, just directly set the values. But if rotations are included then a more complex recompose is done (sin/cos etc.). You can do this a number of ways, and you can look online for various approaches.

A common one is to build each rotation matrix required and combine it.
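
A minimal sketch of that dirty-flag idea follows, under some loud assumptions: the class and bit names are made up, there is one bit per property rather than per component, rotation is about the Z axis only to keep it short, and only a position-only change takes the cheap path (a scale-only shortcut is left out). The matrix is row-major with the translation in elements 12 to 14.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical bit layout; one bit per property, not per component.
enum DirtyBits : std::uint32_t {
    kDirtyPosition = 1u << 0,
    kDirtyScale    = 1u << 1,
    kDirtyRotation = 1u << 2,
};

class Transform {
public:
    void setPosition(float x, float y, float z) {
        px_ = x; py_ = y; pz_ = z;
        dirty_ |= kDirtyPosition;
    }
    void setScale(float x, float y, float z) {
        sx_ = x; sy_ = y; sz_ = z;
        dirty_ |= kDirtyScale;
    }
    void setRotationZ(float radians) {
        rz_ = radians;
        dirty_ |= kDirtyRotation;
    }

    std::uint32_t dirty() const { return dirty_; }
    int fullRebuilds() const { return fullRebuilds_; }
    int fastUpdates() const { return fastUpdates_; }

    // Rebuilds lazily: nothing happens unless a setter was called.
    const float* matrix() {
        if (dirty_ == 0) return m_;
        if (dirty_ == kDirtyPosition) {
            // Cheap path: only the translation row changes.
            m_[12] = px_; m_[13] = py_; m_[14] = pz_;
            ++fastUpdates_;
        } else {
            // Full recompose: scale * rotationZ, then translation.
            const float c = std::cos(rz_);
            const float s = std::sin(rz_);
            m_[0]  = sx_ * c;  m_[1]  = sx_ * s; m_[2]  = 0;   m_[3]  = 0;
            m_[4]  = -sy_ * s; m_[5]  = sy_ * c; m_[6]  = 0;   m_[7]  = 0;
            m_[8]  = 0;        m_[9]  = 0;       m_[10] = sz_; m_[11] = 0;
            m_[12] = px_;      m_[13] = py_;     m_[14] = pz_; m_[15] = 1;
            ++fullRebuilds_;
        }
        dirty_ = 0;
        return m_;
    }

private:
    float px_ = 0, py_ = 0, pz_ = 0;
    float sx_ = 1, sy_ = 1, sz_ = 1;
    float rz_ = 0;
    // Everything starts dirty so the first request builds the matrix.
    std::uint32_t dirty_ = kDirtyPosition | kDirtyScale | kDirtyRotation;
    float m_[16] = {};
    int fullRebuilds_ = 0;
    int fastUpdates_ = 0;
};
```

The two counters are only there to make it visible which path was taken; a real transform would not need them.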

Then if the transform has a parent it needs to be concatenated with the parent's transform too. Managing these relationship updates can be tricky and I am still not sold on the best way to do it.
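
The concatenation step on its own can be sketched like this, assuming row-major matrices and the row-vector convention, so that world = local * parent's world. The type and function names are hypothetical, and this naive version walks up the parent chain on every call without any caching, which is exactly the part that needs a smarter update scheme.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;   // row-major, row-vector convention

Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[12] = x; m[13] = y; m[14] = z;
    return m;
}

// Returns a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            for (int k = 0; k < 4; ++k)
                out[r * 4 + c] += a[r * 4 + k] * b[k * 4 + c];
    return out;
}

// Hypothetical node type: a local matrix plus an optional parent.
struct SceneNode {
    Mat4 local = identity();
    const SceneNode* parent = nullptr;
};

// world = local * parent.local * grandparent.local * ...
Mat4 worldMatrix(const SceneNode& node) {
    Mat4 world = node.local;
    for (const SceneNode* p = node.parent; p != nullptr; p = p->parent)
        world = multiply(world, p->local);
    return world;
}
```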


You don't have to do it this way of course; you can just operate directly on a matrix, appending transformations to it as you wish (would probably be faster).



The view/projection matrix is calculated once per frame and shared across each draw call. Only the world matrix is updated in the buffer between calls, so that is just copying 16 floats into the buffer and nothing else; it should be pretty quick.
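
As a rough illustration of that per-draw cost, here is a sketch in which plain memory stands in for the mapped constant buffers, so it runs without a graphics device. In Direct3D 11, for example, such a pointer would typically come from `ID3D11DeviceContext::Map` with `D3D11_MAP_WRITE_DISCARD`; the struct and function names below are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

struct Float4x4 { float m[16]; };
static_assert(sizeof(Float4x4) == 16 * sizeof(float),
              "expected a tightly packed 4x4 matrix");

struct UploadStats {
    int viewProjUploads = 0;
    int worldUploads = 0;
    std::size_t bytes = 0;
};

// mappedViewProj and mappedWorld are stand-ins for mapped buffer memory.
UploadStats drawFrame(const Float4x4& viewProj,
                      const std::vector<Float4x4>& worlds,
                      void* mappedViewProj,
                      void* mappedWorld) {
    UploadStats stats;

    // Once per frame: the shared view/projection matrix.
    std::memcpy(mappedViewProj, &viewProj, sizeof(Float4x4));
    ++stats.viewProjUploads;
    stats.bytes += sizeof(Float4x4);

    // Once per object: only the 16 floats of the world matrix.
    for (const Float4x4& world : worlds) {
        std::memcpy(mappedWorld, &world, sizeof(Float4x4));
        ++stats.worldUploads;
        stats.bytes += sizeof(Float4x4);
        // A real renderer would issue the draw call here.
    }
    return stats;
}
```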


Hope that helps.