Instancing render strategy: altering renderbuffer or throwing all at card?

Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

4 replies to this topic

#1 QNAN   Members   -  Reputation: 223


Posted 21 November 2012 - 01:37 PM

I am currently working on instancing and I have a problem with sorting the rendering.

If I choose to render only the visible instances, I have to lock the instancing render buffer and upload the instancing data for the visible instances. The problem is that this has to be done again for each shadow map, then for reflections, and so on. That is a hell of a lot of locking the render buffer, uploading data and then rendering, every single frame.

I thought about it, and wondered if it is even worth it. Can I not just throw everything at the card? What kind of tradeoffs would each strategy entail?
I reckon it also depends on the number of instances that are not visible, but how many are we talking about before it is no longer feasible to skip the optimization?

My rendering works inside an object-oriented system, so if I have to sort, there are also a number of virtual calls before the hardware-specific call (DirectX 9 in my case) is reached, because the platform-specific parts have been abstracted away. These calls (3 virtual hierarchies for each setting of the render buffer, so not too many) may also be a factor in evaluating the feasibility, though I'm unsure whether the cost is small enough to be ignored. It is paid every single frame, after all.

Edited by QNAN, 21 November 2012 - 02:25 PM.


#2 hupsilardee   Members   -  Reputation: 491


Posted 21 November 2012 - 06:54 PM

You could use multiple instancing buffers: one for each light for the shadow map generation, and one for the scene rendering. Then you can lock them ALL, iterate through the scene and add all instances to their respective buffers as required, then unlock them all and start using them to generate shadow maps and render the scene. By doing the locking/unlocking concurrently (well, interleaved), you reduce the amount of time the CPU has to wait for the buffers to be free.

Compare the following two strategies, given that you have 2 shadow-mapped lights (which makes 3 instance buffers in total):

lock | iterate scene/copy instance data | unlock | render shadow map 1 | lock | iterate scene | unlock | render shadow map 2 | lock | iterate scene | unlock | render scene

lock x3 | iterate scene/copy instance data | unlock x3 | render shadow maps 1-n | render scene

(I assume that by the instance buffer you mean the vertex buffer that is bound to slot 1 and contains a matrix per instance, or float3 (position) + float (uniform scale) + float4 (quaternion rotation) if you're really fancy.)
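
The second strategy above (lock everything, walk the scene once, unlock everything) could be sketched roughly like this. This is only an illustration under assumptions: `MockInstanceBuffer` is a hypothetical stand-in for a dynamic vertex buffer (in Direct3D 9 its `lock`/`unlock` would presumably wrap `IDirect3DVertexBuffer9::Lock` and `Unlock`), and `InstanceData`, `SceneObject`, `visibleTo` and `fillAllViews` are made-up names, not anything from the posts.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-instance payload: position + uniform scale + quaternion.
struct InstanceData {
    float position[3];
    float scale;
    float rotation[4];
};

// Stand-in for a dynamic instance vertex buffer (stream 1). A real
// implementation would lock and unlock the GPU resource here.
class MockInstanceBuffer {
public:
    explicit MockInstanceBuffer(std::size_t capacity) : storage_(capacity) {}
    InstanceData* lock() { count_ = 0; ++lockCount_; return storage_.data(); }
    void unlock(std::size_t written) { count_ = written; }
    std::size_t count() const { return count_; }
    std::size_t capacity() const { return storage_.size(); }
    int lockCount() const { return lockCount_; }
private:
    std::vector<InstanceData> storage_;
    std::size_t count_ = 0;
    int lockCount_ = 0;
};

// One scene object plus a bitmask of the views that can see it
// (bit 0 = main camera, bits 1..n = shadow-mapped lights).
// Assumes at most 32 views.
struct SceneObject {
    InstanceData data;
    std::uint32_t visibleTo;
};

// Lock every buffer, iterate the scene a single time, unlock every buffer.
void fillAllViews(const std::vector<SceneObject>& scene,
                  std::vector<MockInstanceBuffer>& buffers) {
    std::vector<InstanceData*> dst;
    std::vector<std::size_t> written(buffers.size(), 0);
    for (auto& b : buffers) dst.push_back(b.lock());

    for (const SceneObject& obj : scene) {
        for (std::size_t v = 0; v < buffers.size(); ++v) {
            const bool visible = ((obj.visibleTo >> v) & 1u) != 0;
            if (visible && written[v] < buffers[v].capacity())
                dst[v][written[v]++] = obj.data;
        }
    }

    for (std::size_t v = 0; v < buffers.size(); ++v)
        buffers[v].unlock(written[v]);
}
```

With 2 shadow-mapped lights you would pass 3 buffers; each one is locked exactly once per frame and the scene is traversed once instead of three times.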

#3 QNAN   Members   -  Reputation: 223


Posted 22 November 2012 - 03:11 AM

That is a very nice idea, hupsilardee.
The biggest problem I see is that in an open-ended game, you will not know ahead of time how many shadows/reflections/cameras/etc. are in the vicinity of the rendered area. I guess you could create a number of instance buffers (yes, I mean a vertex buffer with instance data) equal to a maximum that you specify...

Another problem is that I would then have to tell my object how many cameras it is observed by, whereas at the moment it is completely oblivious of this (and logically should be, IMO).

I like the idea though, and I will see if I can incorporate it in some way - I will have to do some thinking to get that into my system :)

Do you have any estimate of what the cost would be if I just brute-force it, throwing all instances at the card, vs. sorting? How many wasted instances (or triangles) are we talking about before it is no longer feasible to skip the optimization?

#4 hupsilardee   Members   -  Reputation: 491


Posted 22 November 2012 - 09:26 AM

I really really don't recommend throwing everything at the card. I might point out that in any game scenario there's a limit to however many shadow maps you're going to be using in one frame anyway, because the expense of generating more than 4 shadow maps per frame, then over 4 texture comparisons when shading the scene, becomes prohibitive.

I would really create as many instance buffers as you need. Let's say you want 30 different meshes to be instance-able (5 tree variations, 10 rock variations, 5 grass variations, 10 floor clutter variations, random props, e.g. crates, barrels, etc.).
Each instance requires one float3x3 for rotation*scale, a float3 for position, a float3 for diffuse color, and one more arbitrary float parameter which could be used for different things. That makes 16 floats (9 + 3 + 3 + 1), which fits nicely. The total amount of data is 64 bytes per instance.
You want a maximum of 1000 instances per mesh in the level, and you limit the number of active shadow-mappable lights to 4. This means each mesh needs 5 instance buffers (one per light plus one for the main view). So that's
30 x 5 x 1000 x 64 bytes = about 9 MB of VRAM. Not a huge amount, and it will be eclipsed by the shadow maps themselves (4 x 1024 x 1024 x 32 bpp = 16 MB).
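
For anyone who wants to plug in their own numbers, the budget above can be checked with a few lines of C++. The constant and function names are made up for illustration; the figures are the ones from this post, and I'm assuming 4-byte floats and 16 floats per instance.

```cpp
#include <cstdint>

static_assert(sizeof(float) == 4, "the estimate assumes 4-byte floats");

// Figures from the post; names are illustrative only.
constexpr std::uint64_t kMeshTypes        = 30;
constexpr std::uint64_t kBuffersPerMesh   = 5;     // 4 lights + main view
constexpr std::uint64_t kMaxInstances     = 1000;  // per buffer
constexpr std::uint64_t kBytesPerInstance = 16 * sizeof(float);

// Total bytes used by all instance buffers.
constexpr std::uint64_t instanceBufferBytes() {
    return kMeshTypes * kBuffersPerMesh * kMaxInstances * kBytesPerInstance;
}

// Total bytes used by `count` square shadow maps of `size` x `size` texels.
constexpr std::uint64_t shadowMapBytes(std::uint64_t count,
                                       std::uint64_t size,
                                       std::uint64_t bitsPerTexel) {
    return count * size * size * (bitsPerTexel / 8);
}

// Convert bytes to mebibytes for display.
constexpr double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

`instanceBufferBytes()` comes to 9,600,000 bytes (a little over 9 MiB), while `shadowMapBytes(4, 1024, 32)` is 16,777,216 bytes, i.e. exactly 16 MiB.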

Objects don't need to know how many cameras they are visible to. The scene can do frustum culling for all objects and lights at the same time, and write the instance buffers as necessary.
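
A rough sketch of what "the scene does the culling" could look like: the objects only expose a bounding sphere, and the scene tests each one against every view's frustum in a single pass. All names here (`Frustum`, `BoundingSphere`, `cullForAllViews`, `makeBoxFrustum`) are hypothetical, and the sphere-vs-plane test is just one common choice, not necessarily what any particular engine uses.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// A point p is on the inside of the plane when dot(n, p) + d >= 0.
struct Plane { Vec3 n; float d; };

struct Frustum { std::array<Plane, 6> planes; };

struct BoundingSphere { Vec3 center; float radius; };

// Conservative test: reject only if the sphere is fully behind some plane.
inline bool sphereInFrustum(const Frustum& f, const BoundingSphere& s) {
    for (const Plane& p : f.planes) {
        const float dist = p.n.x * s.center.x + p.n.y * s.center.y +
                           p.n.z * s.center.z + p.d;
        if (dist < -s.radius) return false;
    }
    return true;
}

// One pass over the scene; result[v] holds the indices of the objects
// visible to view v (view 0 = camera, the rest = shadow-casting lights).
std::vector<std::vector<std::size_t>> cullForAllViews(
        const std::vector<BoundingSphere>& objects,
        const std::vector<Frustum>& views) {
    std::vector<std::vector<std::size_t>> result(views.size());
    for (std::size_t i = 0; i < objects.size(); ++i)
        for (std::size_t v = 0; v < views.size(); ++v)
            if (sphereInFrustum(views[v], objects[i]))
                result[v].push_back(i);
    return result;
}

// Helper for experiments: an axis-aligned box expressed as six planes
// (an orthographic "frustum"), handy for a directional light or a test.
Frustum makeBoxFrustum(const Vec3& lo, const Vec3& hi) {
    Frustum f;
    f.planes[0] = Plane{Vec3{ 1.0f, 0.0f, 0.0f}, -lo.x};
    f.planes[1] = Plane{Vec3{-1.0f, 0.0f, 0.0f},  hi.x};
    f.planes[2] = Plane{Vec3{0.0f,  1.0f, 0.0f}, -lo.y};
    f.planes[3] = Plane{Vec3{0.0f, -1.0f, 0.0f},  hi.y};
    f.planes[4] = Plane{Vec3{0.0f, 0.0f,  1.0f}, -lo.z};
    f.planes[5] = Plane{Vec3{0.0f, 0.0f, -1.0f},  hi.z};
    return f;
}
```

The per-view index lists are what you would then use to write each view's instance buffer, so the objects themselves never learn which cameras or lights can see them.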

Let's take a good example from the game du jour - Slender. If you haven't played it, it's a horror game where the player walks around in first person view in a forest at night with a torch. Now obviously the torch is represented as a spot-light, which has an associated shadow map. (I think it's the only light source in the game as well as a very small amount of ambient light). Now let's say there are 100 trees in the game level. About 20 are visible to the player at any time, and 5 fall within the torch's beam. You tell me how much of a waste it is to throw everything at the card.
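
To put rough numbers on that example: the sketch below assumes two render passes (the main view plus the torch's shadow map) and that brute force submits all 100 trees to both. The constant names are made up; the tree counts are the ones above.

```cpp
// Figures from the Slender example; names are illustrative only.
constexpr int kTrees           = 100;
constexpr int kVisibleToCamera = 20;
constexpr int kInTorchBeam     = 5;
constexpr int kPasses          = 2;  // assumed: main view + one shadow map

// Instances submitted per frame when nothing is culled.
constexpr int bruteForceInstances() { return kTrees * kPasses; }

// Instances submitted per frame when each pass only gets what it can see.
constexpr int culledInstances() { return kVisibleToCamera + kInTorchBeam; }

// Fraction of the brute-force submissions that contribute nothing.
constexpr double wastedFraction() {
    return 1.0 - static_cast<double>(culledInstances()) / bruteForceInstances();
}
```

Under those assumptions brute force submits 200 tree instances per frame against 25 with culling, so 87.5% of the submitted instances are wasted.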

This is the approach I'm using anyway.

#5 QNAN   Members   -  Reputation: 223


Posted 22 November 2012 - 12:30 PM

Ok, thanks for the input.
I have some new thoughts to work with.
