Sorting/structuring renderables and cache locality

13 comments, last by Lemmi 9 years, 7 months ago
That assumes one draw call per object. Plus, by the time you get to the 'sort draw calls' stage you'll already have done a lot of dead object removal, so you should never see a 'dead' object in the draw call lists you're sorting.

At the highest 'game' level you'd be tracking the game entity with which any attached renderables (1 or more draw calls) are associated; when the entity dies, the renderer never sees them.

Vis-culling per "camera", again above renderer submission, takes care of visible objects for a given scene.

Only once you get beyond vis-culling do you start breaking renderables down into their draw-call components and start sorting them and routing them to the correct passes for a scene.
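A rough sketch of that ordering (dead entities dropped at game level, then per-camera culling, then flattening into draw calls) might look like the following. All type and function names here are hypothetical, and the bounding-sphere test is just one simple way to cull, not something the thread prescribes.

```cpp
#include <vector>

// Hypothetical types for illustration; none of these names come from the thread.
struct DrawCall { int textureId; int meshId; int materialId; };

struct Renderable {
    bool alive;                   // tracked by the game layer, not the renderer
    float x, y, z, radius;        // bounding sphere in world space
    std::vector<DrawCall> draws;  // one or more draw calls per renderable
};

// Plane with inward-facing normal: points with n.p + d >= 0 are inside.
struct Plane { float nx, ny, nz, d; };

// A sphere is kept unless it lies completely behind one of the frustum planes.
bool sphereVisible(const Renderable& r, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum) {
        float signedDistance = p.nx * r.x + p.ny * r.y + p.nz * r.z + p.d;
        if (signedDistance < -r.radius) return false;
    }
    return true;
}

// Dead objects and culled objects never reach the draw-call list;
// only the survivors are broken down into draw calls for sorting.
std::vector<DrawCall> gatherDrawCalls(const std::vector<Renderable>& scene,
                                      const std::vector<Plane>& frustum) {
    std::vector<DrawCall> out;
    for (const Renderable& r : scene) {
        if (!r.alive) continue;
        if (!sphereVisible(r, frustum)) continue;
        out.insert(out.end(), r.draws.begin(), r.draws.end());
    }
    return out;
}
```

The point of the design is that the sort step only ever sees live, visible draw calls, so it needs no 'dead' checks of its own.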

note that sort order ought to be based on binding times.

slowest to fastest these appear to be (someone correct me if i'm wrong):

1. texture

2. mesh

3. material

4. constants

5. transforms

6. other flags

so for no alpha blend, sort on tex, then mesh, then material, then constants (perhaps?), then near to far, then draw your instances with your various transforms, setting other flags on the fly as needed (using a state manager, of course).

for alpha blend it's the same, but sort far to near.

i personally don't sort on constants, as i'm not using shader code in my current project. one of the shader coders here can tell you if it's worth it or not. it may depend on the shader model used; as i recall, constant bind times were slow in some early shader models. but i defer to those here with more shader experience who may be able to elaborate on that point...
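As a minimal sketch of that ordering (texture, then mesh, then material, then depth), here is one way to express it as a comparator. The DrawItem struct and its field names are assumptions made up for illustration, and constants are left out of the key since the post is unsure about them.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical draw-call record; field names are assumptions for illustration.
struct DrawItem {
    int textureId;
    int meshId;
    int materialId;
    float depth;  // distance from the camera, larger = farther away
};

// No alpha blend: texture, then mesh, then material, then near to far.
void sortOpaque(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        return std::tie(a.textureId, a.meshId, a.materialId, a.depth) <
               std::tie(b.textureId, b.meshId, b.materialId, b.depth);
    });
}

// Alpha blend: same state order, but far to near among items with equal state.
// Swapping a.depth and b.depth reverses only the depth comparison.
void sortBlended(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        return std::tie(a.textureId, a.meshId, a.materialId, b.depth) <
               std::tie(b.textureId, b.meshId, b.materialId, a.depth);
    });
}
```

Whether this particular priority order pays off depends on the API and hardware, so it is worth profiling before committing to it.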

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Okay! I had no internet access for about a week (feels like a year). This is all super good advice and I will definitely take all of it into consideration when moving forward. However, it just occurred to me that I'm unsure how I want to produce and store scene depth during the pre-render transform pass.

I'm quite sure I can't just use the normal transforms as they are, because they're probably in world-space or own-space.

Should I just, as an extra step, transform everything by the camera's view matrix to produce the depth from the camera's point of view, then store that and use it for sorting?

My intent is to store my depth buffer linearly, the way MJP describes in his excellent tutorials.
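For sorting purposes, a full matrix multiply per renderable is not strictly needed: the depth from the camera's point of view is the offset from the camera projected onto the camera's forward direction, which is a single dot product. A minimal sketch, assuming a hypothetical Vec3 type and a unit-length forward vector (none of these names come from the thread). The sign convention of view-space z depends on the API and handedness, so read the result as "distance in front of the camera".

```cpp
// Hypothetical minimal vector type for illustration.
struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Depth along the viewing direction: project the camera-to-object offset
// onto the camera's forward vector (assumed to be unit length).
// Positive results are in front of the camera, negative ones behind it.
float viewDepth(const Vec3& objectPos, const Vec3& cameraPos, const Vec3& cameraForward) {
    Vec3 toObject{objectPos.x - cameraPos.x,
                  objectPos.y - cameraPos.y,
                  objectPos.z - cameraPos.z};
    return dot(toObject, cameraForward);
}
```

Unlike a straight-line distance, this value does not change when an object moves sideways relative to the camera, which matches what a linear depth buffer stores.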

See, last time I did something like this, I did no sorting at all. I did all transforms in the vertex shader, multiplying every vertex by WVP or whatever was needed, and just let the depth buffer sort out visibility rather than ordering anything myself.

To roughly sketch out what I'm imagining here:

Game's pre-render update pass:

for (each renderable)
{
    // EITHER THIS:
    // transform the renderable to camera view space and get the depth from the camera's POV
    // (the z of the combined transform's translation; ".translation" is a made-up accessor)
    float renderableDepth = (renderable.transform * camera.viewMatrix).translation.z;

    // OR THIS:
    // simply calculate a rough distance between camera and renderable. By this point we've
    // already culled away all the objects outside the camera frustum, so it should be pretty OK?
    float renderableDepth = vector3Distance(renderable.position, camera.position); // returns a length as a single float

    // no matter the method, we'd finally do this:
    // encode the depth somewhere within the flags variable, here the low 16 bits
    // (depthBits = renderableDepth quantized to an integer in 0..65535)
    renderable.flags = (renderable.flags & ~0xffff) | depthBits; // Don't pay too much attention to how I pack it. I can never remember bitshifting syntax without looking it up.
}

Then later on:

void Sort(all the renderables)
{
    for (each renderable)
    {
        // Sort based on... flag? Just straight up sort on which value is lowest,
        // where the lowest value also indicates the first textures/materials/meshes?
        // Perhaps first compare on textures, then meshes etc., as was suggested above?
    }

    for (each renderable)
    {
        // and then, after it's been sorted once by the first 32 bits of the flag
        // (where we'd possibly store all those things), we sort it again by depth?
    }
}
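One common way to avoid sorting twice is to pack everything into a single integer key, with the most important criterion in the highest bits and the depth in the lowest, and then sort once on that integer. The sketch below assumes a made-up layout of 16 bits each for texture, mesh, material and depth; the bit widths, names and the near/far range are illustrative assumptions, not something taken from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Quantize a depth in [nearZ, farZ] to 16 bits; values outside are clamped.
std::uint16_t quantizeDepth(float depth, float nearZ, float farZ) {
    assert(farZ > nearZ);
    float t = (depth - nearZ) / (farZ - nearZ);
    t = std::min(1.0f, std::max(0.0f, t));
    return static_cast<std::uint16_t>(t * 65535.0f + 0.5f);
}

// Assumed layout, most significant first: texture | mesh | material | depth.
// Higher bits dominate the comparison, so depth only breaks ties in state.
std::uint64_t makeSortKey(std::uint16_t texture, std::uint16_t mesh,
                          std::uint16_t material, std::uint16_t depth) {
    return (static_cast<std::uint64_t>(texture) << 48) |
           (static_cast<std::uint64_t>(mesh) << 32) |
           (static_cast<std::uint64_t>(material) << 16) |
           static_cast<std::uint64_t>(depth);
}

// Hypothetical record pairing a key with an index into the draw-call list.
struct KeyedDraw {
    std::uint64_t key;
    int drawIndex;
};

// A single pass replaces the "sort by state, then sort again by depth" idea.
void sortByKey(std::vector<KeyedDraw>& draws) {
    std::sort(draws.begin(), draws.end(),
              [](const KeyedDraw& a, const KeyedDraw& b) { return a.key < b.key; });
}
```

For a far-to-near pass the same key works if the depth field stores 65535 minus the quantized depth, so that farther objects get smaller values.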

I'm sorry for being so dense. ;)

I am also of course aware that I'll have to profile this to see if I even gain anything at all by sorting, but I sort of just want to try making a system of this sort either way, as I think it could be very useful to understand the techniques.

For distance you'll want the second version. However, instead of working out the actual distance, stick with the squared distance: it is cheaper to calculate because it doesn't need the square root, and it produces the same ordering.
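To illustrate why the squared distance is enough: distances are never negative, and squaring preserves the order of non-negative numbers, so comparing squared distances gives the same result as comparing real ones. A minimal sketch with a hypothetical Vec3 type and helper names:

```cpp
// Hypothetical minimal vector type for illustration.
struct Vec3 { float x, y, z; };

// Squared Euclidean distance: same ordering as the true distance,
// but without the square root.
float distanceSquared(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// True if p is closer to the camera than q.
bool closerToCamera(const Vec3& p, const Vec3& q, const Vec3& camera) {
    return distanceSquared(p, camera) < distanceSquared(q, camera);
}
```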


You're right and I agree! I'll also go ahead and assume that my thoughts surrounding the sorting approach are at least somewhat on the right track. I'll carefully re-read all the posts before acting.
