Render Queue Design

Started by
12 comments, last by agleed 9 years ago
Hi.

I'm trying to understand in detail how to program a render queue, in
particular the way L. Spiro has explained in several posts.

This is what I think she has said:

Each camera, and each light that casts shadows, has a render queue set.

A render queue set is composed of two render queues, one for opaque
submeshes and one for alpha submeshes:

class Render_queue_set {
        Render_queue opaque_queue;
        Render_queue alpha_queue;
};
A render queue is a sequence (array, vector, etc.) of render items.

class Render_item {
        Model *model; // Not sure
        Mesh *mesh; // Not sure
        Submesh *submesh; // Not sure
        Shader_id shader_id;
        int num_textures;
        Texture_id textures[MAX_TEXTURE_UNITS];
        float depth; // Distance from model (or mesh?) to the camera
};
Render procedure for each viewpoint:
- The Scene Manager collects models that are in view (frustum culling,
occlusion culling, etc.).
- Tell each collected model to push items to the render queue set.
- Sort the render queues.
- Sort only indices.
- Take advantage of per-frame temporal coherence.
- For each render item in the render queues (first opaque and second
alpha) tell the mesh (or model?) to render the item.
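The procedure above could be sketched roughly like this (a minimal sketch with hypothetical names, sorting indices rather than the items themselves, as described):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct RenderItem {
    uint64_t sort_key;       // encodes shader, textures, depth, etc.
    uint32_t submesh_index;
};

struct RenderQueue {
    std::vector<RenderItem> items;
    std::vector<size_t> order;  // sorted indices; items stay in place

    void Sort() {
        order.resize(items.size());
        for (size_t i = 0; i < order.size(); ++i) order[i] = i;
        std::sort(order.begin(), order.end(),
                  [this](size_t a, size_t b) {
                      return items[a].sort_key < items[b].sort_key;
                  });
    }
};

struct RenderQueueSet {
    RenderQueue opaque_queue;
    RenderQueue alpha_queue;
};
```

Sorting the index array instead of the item array keeps the swaps cheap and also makes it easy to reuse last frame's order as a starting point for temporal coherence.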

I'm not sure if a render item of the L. Spiro design needs pointers to model,
mesh and/or submesh. In my current engine, I would need the three pointers because:
- The model has the matrices I need to set uniforms.
- The mesh has the vertex buffer (I share the same vertex buffer across all
submeshes), and currently a submesh doesn't have a pointer to its mesh.
- Each submesh has the index buffer and the material.

http://www.bytenjoy.com | [twitter]Bytenjoy[/twitter]

I think you're headed in the right direction.

To make this really beneficial you can build a "key" (bitkey) for each renderable, so a bitwise sort can order things any way you like, for example by depth (far to near for non-opaque objects). In the end I think the unit of sorting in this case is a renderable/subobject (a set of vertices/polys that share a transformation and material), because those are the properties you need to sort on. At any "higher level" with more transforms, materials, etc., the sorting would be less useful.
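A sketch of such a bitkey (the field layout here is an assumption for illustration, not a prescribed format; depths are assumed normalized to [0, 1]):

```cpp
#include <cstdint>

// Pack the sorting criteria into one 64-bit key so a single integer
// compare orders the queue. Layout chosen for this sketch:
// opaque keys sort by shader, then texture, then depth (near to far);
// alpha keys set the top bit and sort by inverted depth (far to near).
inline uint64_t MakeOpaqueKey(uint16_t shader_id, uint16_t texture_id,
                              float depth) {
    // Quantize depth to 24 bits so state changes dominate the sort.
    uint32_t d = static_cast<uint32_t>(depth * 0xFFFFFF) & 0xFFFFFF;
    return (uint64_t(shader_id) << 40) | (uint64_t(texture_id) << 24) | d;
}

inline uint64_t MakeAlphaKey(float depth) {
    // Invert the quantized depth so an ascending sort draws the
    // farthest items first.
    uint32_t d = 0xFFFFFF - (static_cast<uint32_t>(depth * 0xFFFFFF) & 0xFFFFFF);
    return (uint64_t(1) << 63) | (uint64_t(d) << 32);
}
```

Because the top bit is set only for alpha keys, every translucent item naturally sorts after every opaque one if both ever land in the same array.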

Here's a nice topic on the bitkeys and sorting:
http://www.gamedev.net/topic/659607-sorting-a-bucket/

Reading your question, I'd say you have the basics to start off; see how it goes along the way.

Crealysm game & engine development: http://www.crealysm.com

Looking for a passionate, disciplined and structured producer? PM me

The model has the matrices I need to set uniforms.

A model has only a single transform. That transform can’t be used to place wheels in the correct locations or perform skinning.
Each mesh is a modular unit and needs to have its own transform. A model is really nothing but a container for meshes and a single primary world position. From there, each mesh has a local transform and a world transform based on each parent actor. The model hierarchy takes advantage of the parent/child relationship (the scene graph) already present in the actor class.
There is nothing on a model that needs to be set in shaders.

The mesh has the vertex buffer (I share the same vertex buffer across all
submeshes), and currently a submesh doesn't have a pointer to its mesh.

While there isn’t necessarily a correct answer here, I allow sub-meshes to have pointers to their mesh owners. This makes many things easier and allows you to reduce the size of your render-queue key. It helps in other areas too, such as when you want to pass just a sub-mesh to a function without needing to also pass the mesh that owns it.


The rest is similar to what I do. You differ in areas that are not necessarily right or wrong, but just for the sake of explanation, the meshes draw themselves in my implementation.
A mesh submits itself to the render-queue once per sub-mesh, as well as an index to the sub-mesh. Since there will likely never be over 65,535 sub-meshes on a mesh this can save you 16 (on 32-bit systems) or 48 (on 64-bit machines) bits over using a pointer.
For a mesh to draw itself it implements a pure virtual function which accepts the index to the sub-mesh to draw and any flags the mesh wanted to be passed back to it for the render (among other things at your discretion).

This would allow you to continue omitting the sub-mesh’s pointer to its owning mesh if you still prefer to do that.
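A sketch of that pattern (all names here are mine for illustration, not L. Spiro's actual API):

```cpp
#include <cstdint>
#include <vector>

class Mesh;  // forward declaration

struct RenderItem {
    Mesh*    mesh;           // the mesh that submitted this item
    uint16_t submesh_index;  // 16-bit index instead of a Submesh pointer
    uint32_t flags;          // handed back to the mesh at draw time
};

class Mesh {
public:
    virtual ~Mesh() = default;

    // Submit one render item per sub-mesh.
    void Submit(std::vector<RenderItem>& queue) {
        for (uint16_t i = 0; i < SubmeshCount(); ++i)
            queue.push_back({this, i, SubmeshFlags(i)});
    }

    // Pure virtual: each concrete mesh type knows how to draw itself,
    // given only the sub-mesh index and the flags it asked for.
    virtual void Render(uint16_t submesh_index, uint32_t flags) = 0;

protected:
    virtual uint16_t SubmeshCount() const = 0;
    virtual uint32_t SubmeshFlags(uint16_t) const { return 0; }
};

// Hypothetical concrete mesh, just to show the shape of an override.
class StaticMesh : public Mesh {
public:
    int draws = 0;
    void Render(uint16_t, uint32_t) override { ++draws; }
protected:
    uint16_t SubmeshCount() const override { return 3; }
};
```

Since the sub-mesh is named by an index relative to its mesh, the sub-mesh never needs a back-pointer to its owner, which is the point made above.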


I am currently right at the point in my new engine where I am just beginning to re-implement render-queues and plan to write an article on it as I go along.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Thanks for your responses.

A model has only a single transform. That transform can’t be used to place wheels in the correct locations or perform skinning.
Each mesh is a modular unit and needs to have its own transform. A model is really nothing but a container for meshes and a single primary world position. From there, each mesh has a local transform and a world transform based on each parent actor. The model hierarchy takes advantage of the parent/child relationship (the scene graph) already present in the actor class.
There is nothing on a model that needs to be set in shaders.


Do you calculate the object-to-view-space matrix in the vertex shader, on the
CPU, or somewhere else?

A mesh submits itself to the render-queue once per sub-mesh, as well as an index to the sub-mesh. Since there will likely never be over 65,535 sub-meshes on a mesh this can save you 16 (on 32-bit systems) or 48 (on 64-bit machines) bits over using a pointer.
For a mesh to draw itself it implements a pure virtual function which accepts the index to the sub-mesh to draw and any flags the mesh wanted to be passed back to it for the render (among other things at your discretion).


In your implementation, after the render queues are sorted, do you process the render queues once per light source?


Do you calculate the object-to-view-space matrix in the vertex shader

Except in rare special cases, never concatenate matrices in a vertex shader. In all typical cases, all matrices should be sent to the shaders fully formed.

My model-view matrix is created on the CPU.
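A minimal sketch of that CPU-side concatenation (column-major 4×4, as in OpenGL; the upload call is shown only as a comment, since the actual binding code depends on your API):

```cpp
struct Mat4 { float m[16]; };  // column-major, OpenGL convention

// Concatenate once per object on the CPU, rather than multiplying
// matrices per-vertex in the shader.
Mat4 Multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r.m[col * 4 + row] += a.m[k * 4 + row] * b.m[col * 4 + k];
    return r;
}

// Per draw call: modelView = view * world, sent fully formed, e.g.
// glUniformMatrix4fv(loc, 1, GL_FALSE, modelView.m);
```

The cost is one 4×4 multiply per object on the CPU instead of one per vertex on the GPU, which is why fully formed matrices are the typical choice.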


In your implementation, after the render queues are sorted, do you process the render queues once per light source?

There is no reason to make a single pass for just a single light. Always render as many lights as possible in a single pass.
This of course applies to forward rendering.


L. Spiro


A model has only a single transform. That transform can’t be used to place wheels in the correct locations or perform skinning.
Each mesh is a modular unit and needs to have its own transform. A model is really nothing but a container for meshes and a single primary world position. From there, each mesh has a local transform and a world transform based on each parent actor. The model hierarchy takes advantage of the parent/child relationship (the scene graph) already present in the actor class.
There is nothing on a model that needs to be set in shaders.

That's one point of view. In my engine, a mesh is an array of nodes, each of which may or may not contain geometry, and each geometry contains subsets. Each geometry holds an array of bones, where each bone is an index to a node of the mesh; any node can be a bone, with or without geometry.

In your implementation, after the render queues are sorted, do you process the render queues once per light source?

You should consider looking at clustered shading, which gives you efficient forward rendering (and can also be used with a deferred opaque pass).

You render each mesh and, whenever a geometry subset is transparent, add it to an array; then you run a quicksort + insertion sort and you have your back-to-front array ready to render.

The first pass (opaque) can be done using clustered deferred shading, and the second pass (transparents), after the back-to-front sort, using clustered forward shading.

With Direct3D 12 and GLNext, since those are low-level APIs, good performance on order-independent transparency should become possible, and the back-to-front pass then becomes unnecessary.
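A sketch of that transparent-subset gathering and sort (hypothetical names; `std::sort` stands in for the quicksort + insertion-sort combination mentioned, since it is also an introsort-style hybrid):

```cpp
#include <algorithm>
#include <vector>

struct TransparentSubset {
    int   subset_id;
    float view_depth;  // distance from the camera in view space
};

// Gathered while walking the meshes; sorted far-to-near before the
// transparent pass so blending composites in the right order.
void SortBackToFront(std::vector<TransparentSubset>& subsets) {
    std::sort(subsets.begin(), subsets.end(),
              [](const TransparentSubset& a, const TransparentSubset& b) {
                  return a.view_depth > b.view_depth;  // farthest first
              });
}
```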

class Render_queue_set {

Render_queue opaque_queue;
Render_queue alpha_queue;
};

In case you don't know, sorting is not needed for additive blending; you can keep three arrays (opaque, alpha-blended, additive) so the additive one never has to be sorted.

But since transparents are usually a small percentage of the rendering, in most cases you can just use one array with a constant size (a pool) to avoid allocation.

---

One last thing about transparency: back-to-front sorting is not accurate in all cases, which is why order-independent transparency is an active area of research.

If a geometry is big and overlaps other geometry, back-to-front ordering will give wrong results on parts of the object.

That's one point of view. In my engine, a mesh is an array of nodes, each of which may or may not contain geometry, and each geometry contains subsets. Each geometry holds an array of bones, where each bone is an index to a node of the mesh; any node can be a bone, with or without geometry.

I was keeping my description short for the sake of clarity.
A model is a container for actors. A mesh is an actor, a bone is an actor, a point light is an actor, even a “group” is an actor. You know how in Maya you can select a bunch of objects and hit Ctrl-G to group them together, and then you can scale that group or move it and all the objects in the group scale or move too? You need that information at run-time to run a fully dynamic animation system properly.

All of these things are actors, so they can be parented. A model acts as the “root node”. A model is also an actor, so if you have a sword model you can make a joint in a character as its parent, effectively putting the sword in the character’s hand.
Etc.

Although the hierarchy of parents and children is enough to find anything inside the model, the model obviously also keeps a linear dictionary of objects as well, so you can run over only the meshes or only the lights or only the groups, etc.


Of course there was a lot missing in my explanation. The whole system is quite complex, but we don’t need to worry about that when we are focusing on render-queues.


But since transparents are usually a small percentage of the rendering, in most cases you can just use one array with a constant size (a pool) to avoid allocation.

The render-queues never deallocate their memory (until a specific point in the game, such as when changing states or scenes), so this isn’t a concern.


L. Spiro


OK, my engine works using actor <-> actor-component; apparently you only use a hierarchy of actors, from what you said.

You seem to be missing the base theory on which L. Spiro built her posts/improvements.

The article "Order your graphics draw calls around!" from 2008 should shed light on your questions.

Good addition; that article was also the basis for my "keys" approach.


