Light-weight render queues?

Started by
12 comments, last by Hodgman 9 years, 2 months ago

I've got a scene and a renderer. In order to draw models, my scene iterates all its high-level models, materials, etc. and compiles them into a structure (my render queue), which is sent off to the renderer, which draws it. It has all it needs to draw the scene.


    struct RenderableMesh
    {
        MeshID mMeshID;
        Mat4 mWVPMatrix;
        Mat4 mWorldMatrix;
    };
    
    typedef std::vector<RenderableMesh> RenderableMeshes;

    struct RenderableMaterial
    {
        TextureID mDiffuseTextureID;
        TextureID mNormalTextureID;
        float mSpecularFactor;
    };

    struct RenderableModel
    {
        RenderableMesh mMesh;
        RenderableMaterial mMaterial;
        float mTextureTilingFactor;
    };

    typedef std::vector<RenderableModel> RenderableModels;

    struct RenderableCamera
    {
        Mat4 mCameraViewMatrix;
        Mat4 mCameraProjectionMatrix;
        Vec3 mCameraPosition;
        float mFOV;

        RenderableModels mModels;
    };

    struct RenderableDirLight
    {
        Vec4 mLightColor;
        Vec3 mLightDirection;

        RenderableMeshes mMeshes;
    };
    
    typedef std::vector<RenderableDirLight> RenderableDirLights;

    struct RenderablePointLight
    {
        enum POINT_LIGHT_DIR
        {
            POINT_LIGHT_DIR_POS_X = 0,
            POINT_LIGHT_DIR_NEG_X,
            POINT_LIGHT_DIR_POS_Y,
            POINT_LIGHT_DIR_NEG_Y,
            POINT_LIGHT_DIR_NEG_Z,
            POINT_LIGHT_DIR_POS_Z,
            POINT_LIGHT_DIR_COUNT
        };

        Mat4 mWVPMatrix;
        Vec4 mLightColor;
        Vec3 mLightPosition;

        float mLightIntensity;
        float mMaxDistance;

        std::array<RenderableMeshes, POINT_LIGHT_DIR_COUNT> mMeshes;
        std::array<Mat4, POINT_LIGHT_DIR_COUNT> mFaceWVPMatrices;
    };
    
    typedef std::vector<RenderablePointLight> RenderablePointLights;

    struct RenderQueue
    {
        void Clear();

        Vec4 mAmbientLight;

        RenderableCamera mCamera;
        RenderableDirLights mDirectionalLights;
        RenderablePointLights mPointLights;
    };

I can identify the following problems:

  • The render queue is quite big. The worst offenders are probably the transforms.
  • Vectors of vectors, i.e. lots of different locations in memory.
  • Duplication of data; the color of a light, for example, also exists in the high-level scene objects.

I can think of a few alternatives, like keeping all the transforms, regardless of node hierarchy, in contiguous storage with stable indices and passing around just the indices instead.
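That contiguous-storage idea could look roughly like this; a hypothetical sketch, with `TransformPool` and its handle type being illustrative names rather than anything from the post:

```cpp
#include <cstdint>
#include <vector>

struct Mat4 { float m[16]; }; // placeholder for your math type

// Hypothetical sketch: world transforms live in one contiguous pool with
// stable handles, so a render queue can carry a 4-byte index instead of
// two 64-byte matrices per entry.
struct TransformPool
{
    using Handle = uint32_t;

    Handle Allocate()
    {
        if (!mFreeList.empty())
        {
            Handle h = mFreeList.back();
            mFreeList.pop_back();
            return h;
        }
        mWorld.push_back(Mat4{});
        return static_cast<Handle>(mWorld.size() - 1);
    }

    // Freed slots go on a free list and are reused, so other handles
    // never shift -- that's what keeps the indices stable.
    void Free(Handle h) { mFreeList.push_back(h); }

    Mat4&       operator[](Handle h)       { return mWorld[h]; }
    const Mat4& operator[](Handle h) const { return mWorld[h]; }

private:
    std::vector<Mat4>   mWorld;
    std::vector<Handle> mFreeList;
};
```

The render queue would then store `TransformPool::Handle` values, and the renderer resolves them against the pool when building constant buffers.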

I'm mostly just looking for ideas to improve it, or for different solutions. I'd like to keep the two systems as separate as possible, with the render queue as the bridge between them.

Your transforms can be smaller. The actual math you're doing on the GPU certainly doesn't need a 4x4 matrix.
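For instance, the fourth row of an affine world matrix is always (0, 0, 0, 1), so 12 floats are enough, which is 25% less per-instance data than a full 4x4. A minimal sketch (type names are illustrative):

```cpp
#include <array>

// An affine transform never uses its fourth row (always 0,0,0,1),
// so 12 floats suffice: 25% less data per instance than a full 4x4.
struct Mat3x4
{
    std::array<float, 12> m; // row-major, 3 rows x 4 columns
};

struct Mat4x4
{
    std::array<float, 16> m;
};

// Expand back to a full 4x4 only at the point an API requires one.
inline Mat4x4 ToMat4x4(const Mat3x4& a)
{
    Mat4x4 r{}; // zero-initialized
    for (int i = 0; i < 12; ++i)
        r.m[i] = a.m[i];
    r.m[15] = 1.0f; // last row becomes (0, 0, 0, 1)
    return r;
}
```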

I don't see any vectors of vectors in your code. The lights are fatter than they need to be, but a vector of things with arrays in them is fine, assuming you actually need the data in the array (I can't figure out what yours are for or why you want them).

The data duplication is a feature, not a problem. It allows the renderer to copy that data into a queue and then let the scene continue moving or animating while the queue is still being dispatched to the GPU. Copying data is a good way of solving parallelism problems, and it's really, really cheap.

Typical render queues just use the same structure for every queue, but have many of them. A renderable is a mesh, a material, and instance data (like a transform). That's true for an opaque renderable, a translucent renderable, a shadow caster, etc. The scene graph just needs to make it possible to efficiently walk the graph and stick renderables into the appropriate queue. Most of the queues will have some kind of strict ordering to them.

Lights are often walked in a completely separate pass since (a) the path through the graph you walk will be different for the light than for the player view, (b) you may want to cull out lights before finding all their shadow casters, and (c) you probably want to parallelize the work. The code that walks for general rendering might know to toss the mesh into buckets like opaque or translucent_ordered or translucent_unordered, while the code that walks for a light is just tossing things into opaque_shadow or translucent_shadow or the like, and tosses in the shadow material instead of the regular material.
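A minimal sketch of that shape; the pass names and id types here are illustrative, not prescriptive:

```cpp
#include <cstdint>
#include <vector>

using MeshId     = uint32_t;
using MaterialId = uint32_t;
using XformId    = uint32_t;

// One renderable shape reused by every queue: mesh + material +
// instance data. Shadow queues just get a shadow material instead.
struct Renderable
{
    MeshId     mesh;
    MaterialId material;
    XformId    transform;
};

enum Pass
{
    PASS_OPAQUE,
    PASS_TRANSLUCENT_ORDERED,
    PASS_TRANSLUCENT_UNORDERED,
    PASS_OPAQUE_SHADOW,
    PASS_TRANSLUCENT_SHADOW,
    PASS_COUNT
};

// Many queues, same element type; the scene walk decides the bucket.
struct RenderQueues
{
    std::vector<Renderable> queues[PASS_COUNT];

    void Push(Pass p, const Renderable& r) { queues[p].push_back(r); }
    void Clear() { for (auto& q : queues) q.clear(); }
};
```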

Sean Middleditch – Game Systems Engineer – Join my team!

Is it a sound idea to do view frustum culling for all 6 faces of a point light? For example, my RenderablePointLight has a collection of meshes for each face.

Is this about a shadow-casting point light which renders a shadow map for each face?

If your culling code has to walk the entire scene, or a hierarchical acceleration structure (such as a quadtree or octree), it will likely be faster to first do one spherical culling query to get all the objects associated with any face of the point light, then test those against the individual face frustums. Profiling will reveal if that's the case.

If it's not a shadow-casting light, you shouldn't need to bother with faces, but just do a single spherical culling query to find the lit objects.

Just some pointers on what I do.

- a scene has mesh instances and meshes
- a mesh instance points to a mesh
- a mesh instance has a transform matrix and a flag "dynamic"
- a mesh instance consists of renderables
- a renderable has a transform matrix, material id and also a flag "dynamic"

Transformations:
- loop through ALL mesh instances and update transforms if the dynamic flag is set (true)
- loop through renderables within instances and update only if dynamic
- if the parent is dynamic and its renderables aren't, then I don't update the renderables' transforms, because they are transforms relative to the parent (mesh instance)
- a renderable has a vector of ints with the point light ids that affect it (updated if objects and/or lights are dynamic)
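The update rules above could be sketched like this; the types are hypothetical, and the counters merely stand in for the real matrix recomputation:

```cpp
#include <vector>

struct Mat4 { float m[16]; }; // placeholder matrix type

struct Renderable
{
    Mat4 localTransform;   // stored relative to the owning mesh instance
    bool dynamic = false;
    int  updates = 0;      // instrumentation for this sketch only
};

struct MeshInstance
{
    Mat4 transform;
    bool dynamic = false;
    int  updates = 0;      // instrumentation for this sketch only
    std::vector<Renderable> renderables;
};

// Dynamic instances refresh their own transform, and dynamic renderables
// refresh their parent-relative transform. A static renderable under a
// dynamic parent is left alone: its matrix is relative to the parent,
// so the combined world matrix (parent * local) is formed at draw time.
void UpdateTransforms(std::vector<MeshInstance>& instances)
{
    for (auto& inst : instances)
    {
        if (inst.dynamic)
            ++inst.updates; // stand-in for recomputing inst.transform
        for (auto& r : inst.renderables)
            if (r.dynamic)
                ++r.updates; // stand-in for recomputing r.localTransform
    }
}
```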

Queue:
- I defined a renderable struct, keeping track for each renderable of: mesh id, instance id, renderable id, material id, dist to camera
- this is stored as a bitkey (uint64) for easy sorting as you want
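For illustration, one possible way to pack such a bit key; the field widths and ordering below are an arbitrary choice, not the poster's actual layout:

```cpp
#include <cstdint>

// Pack the sort fields into one uint64 so a single integer compare
// orders draws: here material varies slowest and quantized camera
// distance fastest, so draws batch by material first.
inline uint64_t MakeSortKey(uint32_t materialId, uint32_t meshId,
                            uint32_t instanceId, uint32_t depth16)
{
    return (uint64_t(materialId & 0xFFFF) << 48)
         | (uint64_t(meshId     & 0xFFFF) << 32)
         | (uint64_t(instanceId & 0xFFFF) << 16)
         |  uint64_t(depth16    & 0xFFFF);
}
```

Sorting a vector of these keys (or key/index pairs) with `std::sort` then yields the desired draw order.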

Rendering:
- cull all parents and if intersect also cull children (renderables)
- cull lights and update everything
- loop through all renderables and if visible add their id to a simple int vector
- simply render all renderables pointed to in the int (index) vector

This gives great flexibility for sorting; for example, I have two int index vectors, one for blended objects, where I sort the bit keys based on distance to camera (back to front).

Hope this helps.

Crealysm game & engine development: http://www.crealysm.com

Looking for a passionate, disciplined and structured producer? PM me

Is this about a shadow-casting point light which renders a shadow map for each face?

Yeah, it's for shadow mapping. How do you do a spherical culling query?

I've got a scene and a renderer. In order to draw models, my scene iterates all its high-level models, materials, etc. and compiles them into a structure (my render queue), which is sent off to the renderer, which draws it. It has all it needs to draw the scene.

struct RenderableMesh
    struct RenderableMaterial
    struct RenderableModel
    struct RenderableCamera
...

All those structures are very specific, removing flexibility from what can possibly be drawn using the back-end.
E.g. the per-mesh constant data, and the number and names of the per-material textures, are all hard-coded.
If later you're implementing a technique where a mesh needs more values, or its own look-up-table texture, or the same for a material, etc., you're stuck... Ideally you want the data that makes up a material (the number of textures, their names, etc.) to be completely data-driven instead of being hard-coded.
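For instance, a data-driven material might be little more than named lists loaded from a file; all the names and types below are hypothetical:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Instead of hard-coded "diffuse"/"normal" members, the material owns
// arbitrary named bindings; adding a look-up-table texture later is
// just one more entry in the data file, not a struct change.
struct TextureBinding
{
    std::string name;      // matches a shader sampler, e.g. "lookupTable"
    uint32_t    textureId;
};

struct FloatParam
{
    std::string name;      // e.g. "specularFactor"
    float       value;
};

struct MaterialDesc
{
    std::string                 shaderName;
    std::vector<TextureBinding> textures;
    std::vector<FloatParam>     floats;
};
```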

I prefer to keep my back-end completely generic and unaware of what the high level data is. No hardcoding of materials, lights, shader inputs, etc...
//Device has resource pools, so we can use small ids instead of pointers
ResourceListId: u16
TextureId : u16
CBufferId: u16
SamplerId: u16

//n.b. variable sized structure
ResourceList: count, TextureId[count]

DrawState: u64 - containing BlendId, DepthStencilId, RasterId, ShaderId, DrawType (indexed, instanced, etc) NumTexLists, NumSamplers, NumCbuffers.

//n.b. variable sized structure
InputAssemblerState: IaLayoutId, IndexBufferId, NumBuffers, BufferId[NumBuffers]

DrawCall: Primitive, count, vbOffset, ibOffset, StencilRef. 

//n.b. variable sized structure
DrawItem: DrawState, InputAssemblerState*, TexListId[NumTexLists], SamplerId[NumSamplers], CbufferId[NumCbuffers], DrawCall 

DrawList: vector<DrawItem*>
A generic model/mesh class can then have a standard way of generating its internal DrawItems. A terrain class can generate its DrawItems in a completely different way.
Debug UIs can freely generate DrawItems on demand for immediate-mode style usage, etc...

I build sort keys by combining some data from the DrawState(u64) with a hash of the DrawItem's resource arrays.
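A sketch of how such a key might be combined; the FNV-1a hash and the particular bit split are my assumptions, not necessarily Hodgman's exact scheme:

```cpp
#include <cstddef>
#include <cstdint>

// Hash a DrawItem's resource id arrays (FNV-1a over u16 ids) so items
// with identical bindings produce identical low bits.
inline uint32_t HashIds(const uint16_t* ids, size_t count)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < count; ++i)
    {
        h ^= ids[i];
        h *= 16777619u;
    }
    return h;
}

// Keep the high state bits of the u64 DrawState (shader/blend/raster
// ids) and break ties with the resource hash, so items with equal state
// and equal bindings sort adjacently.
inline uint64_t MakeDrawSortKey(uint64_t drawState, uint32_t resourceHash)
{
    return (drawState & 0xFFFFFFFF00000000ull) | resourceHash;
}
```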

Yeah, it's for shadow mapping. How do you do a spherical culling query?

If your renderables are represented by AABBs for culling, you'd do a sphere-AABB intersection (just replace frustum with sphere.)
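That sphere-vs-AABB test is standard (clamp the sphere center to the box, compare squared distances); a sketch with placeholder math types:

```cpp
#include <algorithm>

struct Vec3   { float x, y, z; };
struct AABB   { Vec3 min, max; };
struct Sphere { Vec3 center; float radius; };

// Clamp the sphere center to the box to find the closest point on the
// AABB, then compare squared distance against squared radius. Running
// this over your renderables' AABBs is the spherical culling query.
inline bool Intersects(const Sphere& s, const AABB& b)
{
    const float c[3]  = { s.center.x, s.center.y, s.center.z };
    const float mn[3] = { b.min.x, b.min.y, b.min.z };
    const float mx[3] = { b.max.x, b.max.y, b.max.z };

    float d2 = 0.0f;
    for (int i = 0; i < 3; ++i)
    {
        float v = std::min(std::max(c[i], mn[i]), mx[i]); // closest point on box
        float d = c[i] - v;
        d2 += d * d;
    }
    return d2 <= s.radius * s.radius;
}
```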

I prefer to keep my back-end completely generic and unaware of what the high level data is. No hardcoding of materials, lights, shader inputs, etc...

How do you differentiate in rendering code, for example, between normal and diffuse textures?

You typically don't. All that matters to the program is to which texture unit each is bound.
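In other words, the renderer only knows numbered texture units; "diffuse" versus "normal" is purely a matter of which slot the shader samples. The slot assignments below are an arbitrary example of such a convention, not a standard:

```cpp
#include <cstdint>

// The renderer binds by slot number; the shader declares its samplers
// on matching registers. Which slot means what is just a convention
// shared between the two.
enum TextureSlot : uint32_t
{
    SLOT_DIFFUSE = 0, // HLSL side: Texture2D diffuseMap : register(t0);
    SLOT_NORMAL  = 1, // HLSL side: Texture2D normalMap  : register(t1);
};
```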

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]
