Light-weight render queues?

KaiserJohan

I've got a scene and a renderer. In order to draw models, my scene iterates over all its high-level models, materials, etc. and compiles them into a structure (my render queue), which is sent off to the renderer, which draws it. The render queue contains everything needed to draw the scene.

    // IDs refer into the renderer's resource pools.
    struct RenderableMesh
    {
        MeshID mMeshID;
        Mat4 mWVPMatrix;
        Mat4 mWorldMatrix;
    };

    typedef std::vector<RenderableMesh> RenderableMeshes;

    struct RenderableMaterial
    {
        TextureID mDiffuseTextureID;
        TextureID mNormalTextureID;
        float mSpecularFactor;
    };

    struct RenderableModel
    {
        RenderableMesh mMesh;
        RenderableMaterial mMaterial;
        float mTextureTilingFactor;
    };

    typedef std::vector<RenderableModel> RenderableModels;

    struct RenderableCamera
    {
        Mat4 mCameraViewMatrix;
        Mat4 mCameraProjectionMatrix;
        Vec3 mCameraPosition;
        float mFOV;

        RenderableModels mModels;
    };

    struct RenderableDirLight
    {
        Vec4 mLightColor;
        Vec3 mLightDirection;

        RenderableMeshes mMeshes;
    };

    typedef std::vector<RenderableDirLight> RenderableDirLights;

    struct RenderablePointLight
    {
        // one direction per cubemap face, for shadow mapping
        enum POINT_LIGHT_DIR
        {
            POINT_LIGHT_DIR_POS_X = 0,
            POINT_LIGHT_DIR_NEG_X,
            POINT_LIGHT_DIR_POS_Y,
            POINT_LIGHT_DIR_NEG_Y,
            POINT_LIGHT_DIR_NEG_Z,
            POINT_LIGHT_DIR_POS_Z,
            POINT_LIGHT_DIR_COUNT
        };

        Mat4 mWVPMatrix;
        Vec4 mLightColor;
        Vec3 mLightPosition;

        float mLightIntensity;
        float mMaxDistance;

        // per-face mesh lists and transforms for cube shadow map rendering
        std::array<RenderableMeshes, POINT_LIGHT_DIR::POINT_LIGHT_DIR_COUNT> mMeshes;
        std::array<Mat4, POINT_LIGHT_DIR::POINT_LIGHT_DIR_COUNT> mFaceWVPMatrices;
    };

    typedef std::vector<RenderablePointLight> RenderablePointLights;

    struct RenderQueue
    {
        void Clear();

        Vec4 mAmbientLight;

        RenderableCamera mCamera;
        RenderableDirLights mDirectionalLights;
        RenderablePointLights mPointLights;
    };

I can identify the following problems:

  • The render queue is quite big. The worst offenders are probably the transforms.
  • Vectors of vectors mean lots of different locations in memory (poor cache locality).
  • Duplication of data; the color of a light also exists in the high-level scene objects.

I can think of a few alternatives, like keeping all the transforms, regardless of node hierarchy, in contiguous storage with stable indices and passing around just the indices instead.
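To sketch what I mean (a minimal illustration, assuming a free-list for index reuse; all the names here are hypothetical):

    // Sketch: contiguous transform storage with stable indices.
    // TransformStorage/TransformIndex are hypothetical names.
    #include <cstdint>
    #include <vector>

    typedef uint32_t TransformIndex;

    class TransformStorage
    {
    public:
        TransformIndex Add(const Mat4& transform)
        {
            if (!mFreeSlots.empty())
            {
                const TransformIndex index = mFreeSlots.back();
                mFreeSlots.pop_back();
                mTransforms[index] = transform;
                return index;
            }
            mTransforms.push_back(transform);
            return static_cast<TransformIndex>(mTransforms.size() - 1);
        }

        // Freed slots are recycled, so outstanding indices stay stable.
        void Remove(TransformIndex index) { mFreeSlots.push_back(index); }

        const Mat4& Get(TransformIndex index) const { return mTransforms[index]; }

    private:
        std::vector<Mat4> mTransforms;          // one contiguous allocation
        std::vector<TransformIndex> mFreeSlots; // recycled slots
    };

    // The render queue entries then shrink to indices:
    struct RenderableMeshSlim
    {
        MeshID mMeshID;
        TransformIndex mWVPTransform;
        TransformIndex mWorldTransform;
    };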

 

I'm mostly just looking for ideas to improve it or different solutions. I'd like to keep the two systems as separate as possible, with the render queue being the bridge between them.

AgentC

> Is it a sound idea to do view frustum culling for all 6 faces of a point light? For example, my RenderablePointLight has a collection of meshes for each face.

 

Is this about a shadow-casting point light which renders a shadow map for each face?

 

If your culling code has to walk the entire scene, or a hierarchic acceleration structure (such as a quadtree or octree), it will likely be faster to do one spherical culling query first to get all the objects associated with any face of the point light, then test those against the individual face frustums. Profiling will reveal whether that's the case.

 

If it's not a shadow-casting light, you shouldn't need to bother with faces; just do a single spherical culling query to find the lit objects.
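A rough sketch of that two-phase approach; Scene, SceneObject, QuerySphere, and BuildFaceFrustum are hypothetical placeholders for whatever your scene and math code actually provide:

    #include <vector>

    // Two-phase culling sketch for a shadow-casting point light.
    void CullPointLight(const Scene& scene, RenderablePointLight& light)
    {
        // Phase 1: one cheap spherical query against the acceleration
        // structure gathers every object that can affect any cube face.
        std::vector<const SceneObject*> candidates;
        scene.QuerySphere(light.mLightPosition, light.mMaxDistance, candidates);

        // Phase 2: test only those candidates against each face frustum.
        for (uint32_t face = 0; face < RenderablePointLight::POINT_LIGHT_DIR_COUNT; ++face)
        {
            const Frustum faceFrustum = BuildFaceFrustum(light, face); // hypothetical helper
            for (const SceneObject* object : candidates)
            {
                if (faceFrustum.Intersects(object->GetAABB()))
                    light.mMeshes[face].push_back(object->GetRenderableMesh());
            }
        }
    }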

cozzie
Just some pointers / what I do:

- a scene has mesh instances and meshes
- a mesh instance points to a mesh
- a mesh instance has a transform matrix and a flag "dynamic"
- a mesh instance consists of renderables
- a renderable has a transform matrix, material id and also a flag "dynamic"

Transformations:
- loop through ALL mesh instances and update transforms if the dynamic flag is set (true)
- loop through the renderables within instances and update them only if dynamic
- if the parent is dynamic and its renderables aren't, I don't update the renderables' transforms, because they are relative to the parent (mesh instance)
- a renderable has a vector of ints with the IDs of the point lights that affect it (updated if objects and/or lights are dynamic)

Queue:
- I defined a renderable struct, keeping track for each renderable of: mesh ID, instance ID, renderable ID, material ID, distance to camera
- this is stored as a bit key (uint64) for easy sorting however you want (see the sketch after this list)
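A minimal sketch of such a bit key; the field widths and their ordering are made up for illustration and should be adapted to your actual ID ranges:

    #include <cstdint>

    // Pack sort-relevant fields into one uint64_t "bit key".
    // The highest-priority field goes in the most significant bits,
    // so a plain integer sort groups draws by material first.
    uint64_t MakeSortKey(uint16_t materialId, uint16_t meshId,
                         uint16_t instanceId, uint16_t distToCamera)
    {
        return (static_cast<uint64_t>(materialId)   << 48) |
               (static_cast<uint64_t>(meshId)       << 32) |
               (static_cast<uint64_t>(instanceId)   << 16) |
                static_cast<uint64_t>(distToCamera);
    }

    // std::sort on a vector of these keys then yields the render order.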

Rendering:
- cull all parents, and if they intersect, also cull their children (renderables)
- cull lights and update everything
- loop through all renderables and, if visible, add their ID to a simple int vector
- then simply render all renderables pointed to by the int (index) vector

This gives great flexibility for sorting. For example, I have two int index vectors, one of them for blended objects, for which I sort the bit keys based on distance to camera (back to front).

Hope this helps.

KaiserJohan

 

> Is this about a shadow-casting point light which renders a shadow map for each face? [...] If it's not a shadow-casting light, you shouldn't need to bother with faces; just do a single spherical culling query to find the lit objects.

Yeah, it's for shadow mapping. How do you do a spherical culling query?

AgentC

> Yeah, it's for shadow mapping. How do you do a spherical culling query?

 

If your renderables are represented by AABBs for culling, you'd do a sphere-AABB intersection (just replace the frustum test with a sphere test).
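For reference, the usual closest-point formulation of a sphere-AABB test (a generic sketch, assuming Vec3 supports operator[]):

    // Clamp the sphere center to the box to find the closest point,
    // then compare squared distance against squared radius.
    bool SphereIntersectsAABB(const Vec3& center, float radius,
                              const Vec3& aabbMin, const Vec3& aabbMax)
    {
        float distSq = 0.0f;
        for (int i = 0; i < 3; ++i)
        {
            if (center[i] < aabbMin[i])
            {
                const float d = aabbMin[i] - center[i];
                distSq += d * d;
            }
            else if (center[i] > aabbMax[i])
            {
                const float d = center[i] - aabbMax[i];
                distSq += d * d;
            }
        }
        return distSq <= radius * radius;
    }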

KaiserJohan

 

> > I've got a scene and a renderer. [...]
> > struct RenderableMesh
> > struct RenderableMaterial
> > struct RenderableModel
> > struct RenderableCamera
> > ...
>
> All those structures are very specific, removing flexibility from what can possibly be drawn using the back-end.
> E.g. the per-mesh constant data, or the number and names of the per-material textures, are all hard-coded.
> If later you're implementing a technique where a mesh needs more values, or its own look-up-table texture, or the same for a material, etc., you're stuck... Ideally you want the data that makes up a material (the number of textures, their names, etc.) to be completely data-driven instead of being hard-coded.

> I prefer to keep my back-end completely generic and unaware of what the high-level data is. No hardcoding of materials, lights, shader inputs, etc...
>
>     // Device has resource pools, so we can use small ids instead of pointers
>     ResourceListId: u16
>     TextureId: u16
>     CBufferId: u16
>     SamplerId: u16
>
>     // n.b. variable-sized structure
>     ResourceList: count, TextureId[count]
>
>     DrawState: u64 - containing BlendId, DepthStencilId, RasterId, ShaderId, DrawType (indexed, instanced, etc.), NumTexLists, NumSamplers, NumCbuffers
>
>     // n.b. variable-sized structure
>     InputAssemblerState: IaLayoutId, IndexBufferId, NumBuffers, BufferId[NumBuffers]
>
>     DrawCall: Primitive, count, vbOffset, ibOffset, StencilRef
>
>     // n.b. variable-sized structure
>     DrawItem: DrawState, InputAssemblerState*, TexListId[NumTexLists], SamplerId[NumSamplers], CbufferId[NumCbuffers], DrawCall
>
>     DrawList: vector<DrawItem*>
>
> A generic model/mesh class can then have a standard way of generating its internal DrawItems. A terrain class can generate its DrawItems in a completely different way. Debug UIs can freely generate DrawItems on demand for immediate-mode style usage, etc...
>
> I build sort keys by combining some data from the DrawState (u64) with a hash of the DrawItem's resource arrays.
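(For concreteness, one way such a key could be assembled; the bit layout and the choice of FNV-1a below are illustrative, not necessarily what the quoted post does:)

    #include <cstddef>
    #include <cstdint>

    // Hash the DrawItem's resource ids; FNV-1a is shown as one simple choice.
    uint64_t HashResourceIds(const uint16_t* ids, size_t count)
    {
        uint64_t hash = 14695981039346656037ull; // FNV offset basis
        for (size_t i = 0; i < count; ++i)
        {
            hash ^= ids[i];
            hash *= 1099511628211ull; // FNV prime
        }
        return hash;
    }

    // Combine state bits with the resource hash so that items sharing
    // pipeline state and bindings sort next to each other.
    uint64_t MakeDrawItemSortKey(uint64_t drawState, const uint16_t* resourceIds, size_t count)
    {
        const uint64_t stateBits = drawState & 0xFFFFFFFF00000000ull; // e.g. shader/blend/depth ids
        const uint64_t hashBits  = HashResourceIds(resourceIds, count) >> 32;
        return stateBits | hashBits;
    }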

 

 

How do you differentiate in rendering code, for example, between normal and diffuse textures?

KaiserJohan

 

> > How do you differentiate in rendering code, for example, between normal and diffuse textures?
>
> You typically don't. All that matters to the program is to which texture unit each is bound.

 

 

That's what I don't understand. Constant buffers, texture slots, samplers, draw types, depth-stencil buffers, etc. don't sound like "high-level data". A texture unit or slot, for example, sounds like something privy to the renderer rather than to a high-level scene object. What am I missing?

cozzie

A model has multiple renderables; a renderable is linked to a material.

Then, based on that material, you just set a diffuse map or, if applicable for that material, also another map like a normal map, etc.

haegarr


> That's what I don't understand. Constant buffers, texture slots, samplers, draw types, depth-stencil buffers, etc. don't sound like "high-level data". A texture unit or slot, for example, sounds like something privy to the renderer rather than to a high-level scene object. What am I missing?

Constant buffers, texture slots, depth-stencil buffers, ... are operating resources (resources not in the sense of assets). If you have "high-level data" like material parameters or viewing parameters or whatever is constant for a draw call, it can be stored within a constant buffer to provide it to the GPU. From a high-level view, it's the data within the buffer that is interesting, not the buffer, which is only the mechanism that transports it. From a performance point of view, it's the transport mechanism that is interesting, not the data within. The same goes for textures.

 

With programmable shaders the meaning of vertex attributes, constant parameters, or texture texels is not pre-defined. It is just how the data is processed within a shader script that gives the data its meaning. To give a clue of how it is processed, the data is marked with a semantic.

 

Now, does the renderer code need to know what a vertex normal, a bump map, or a viewport is? In exceptional cases perhaps, but in general it does not. It just needs to know which block of data needs to be provided as which resource (by its binding point / slot / whatever). The renderer code does not deal with high-level data; it deals with the operating resources. That is what swiftcoder says.

 

State parameters for the fixed-function parts of the GPU pipeline are different, since they need to be set explicitly by the renderer code.

Hodgman

> How do you differentiate in rendering code, for example, between normal and diffuse textures?
>
> You typically don't. All that matters to the program is to which texture unit each is bound.
>
> That's what I don't understand. Constant buffers, texture slots, samplers, draw types, depth-stencil buffers, etc. don't sound like "high-level data". A texture unit or slot, for example, sounds like something privy to the renderer rather than to a high-level scene object. What am I missing?

First up, there are two approaches to render queues. It's perfectly fine to have a high-level one, where you have a mesh/material/instance ID/pointer... I just prefer a low-level one that more closely mimics the device, as this gives me more flexibility.

Assuming a low-level one, the back-end doesn't care what the data is being used for; it just plugs resources into shaders' slots, sets the pipeline state, and issues draw calls.
So for easy high-level programming, you need another layer above the queue, defining concepts like "model" and "material".

At the layer above the queue itself, there are two strategies for assigning resources to slots:

1) Convention based. You simply define a whole bunch of conventions and stick to them everywhere.
e.g.
- The WorldViewProjection matrix is always in a CBuffer in register b0.
- The diffuse texture is always in register t0.
- A linear interpolating sampler is always in register s0.
On the shader side, you can do this by having common include files for your shaders, so that every shader #includes the same resource definitions.
On the material/model side, you then just hard-code these same conventions -- if the material has a diffuse texture, put it into texture slot #0 (see the sketch below).
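A small sketch of what such conventions can look like on the C++ side, with the matching HLSL shown in comments (names and the exact slot assignments are illustrative):

    // Hard-coded slot conventions, mirrored by every shader via a shared include.
    // Names and values here are illustrative.
    enum ConventionalSlots
    {
        CBUFFER_SLOT_PER_OBJECT = 0, // b0: WorldViewProjection etc.
        TEXTURE_SLOT_DIFFUSE    = 0, // t0
        SAMPLER_SLOT_LINEAR     = 0  // s0
    };

    // Matching HLSL, in a common include so every shader agrees:
    //   cbuffer PerObject : register(b0) { float4x4 gWorldViewProj; };
    //   Texture2D    gDiffuseTexture : register(t0);
    //   SamplerState gLinearSampler  : register(s0);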

2) Reflection based. Use an API to inspect your shaders and see what the name and slot of each resource is.
Write your shaders however you want (probably still using #includes for common cbuffer structures, though!), and then on the material side, when binding a texture, get the name from the user (e.g. "diffuse") and use the reflection API to ask the shader which slot this texture should be bound to.
Likewise, but slightly more complex, for CBuffers. When a user tries to set a constant value (e.g. "color"), search the reflection data to find which cbuffer contains a "color" member. If the material doesn't already have a clone of that cbuffer, make one now (by copying the default values for that cbuffer). Then use the reflection API to find the offset/size/type of the color variable, and copy the user-supplied value into the material's instance of that cbuffer type. Finally, make sure the material binds its new cbuffer to the appropriate slot.
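As a concrete illustration of the texture case, here is roughly what the lookup can look like with D3D11's shader-reflection API (a sketch only; real code would cache the reflected layout instead of reflecting on every bind, and would check the HRESULTs):

    #include <d3dcompiler.h> // D3DReflect; link against d3dcompiler.lib
    #include <d3d11shader.h>

    // Ask the shader bytecode which t-register a named texture occupies,
    // then bind the SRV to that slot. The name (e.g. "diffuse") comes from
    // the user/material, not from any hard-coded convention.
    void BindNamedTexture(ID3D11DeviceContext* context,
                          const void* bytecode, size_t bytecodeSize,
                          const char* name, ID3D11ShaderResourceView* srv)
    {
        ID3D11ShaderReflection* reflector = nullptr;
        D3DReflect(bytecode, bytecodeSize, IID_ID3D11ShaderReflection,
                   reinterpret_cast<void**>(&reflector));

        D3D11_SHADER_INPUT_BIND_DESC bindDesc = {};
        if (SUCCEEDED(reflector->GetResourceBindingDescByName(name, &bindDesc)))
            context->PSSetShaderResources(bindDesc.BindPoint, 1, &srv);

        reflector->Release();
    }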

I use #1 for engine-generated data, such as camera matrices, world matrices, etc... and I use #2 for artist-generated data, such as materials.

