
Topics I've Started

[Design] Meshes and Vertex buffers

02 February 2016 - 12:18 PM



I'm currently making a stateful, render-queue based, graphics-API-independent system. Here are the basics:


Given a Scene with some renderable meshes (later with space partitioning). The scene itself is just a "container" for the renderables and other types (cameras, sounds, etc.). There is also a Renderer base class; DeferredRenderer, ForwardRenderer, etc. inherit from it. Each renderer instance can read the scene data but is not allowed to modify it. The renderer is the one that makes the actual graphics calls, like set shader, set parameters, draw call, etc. These calls are stored in a simple linear list and sent to the graphics API for rendering.


So for each frame:

1) The scene collects all visible meshes into a list (built from scratch each frame, but with some pooling to avoid memory allocation/deallocation)

1.1) There are at least two lists: one for opaque and one for transparent meshes.

2) Each renderer has its own list of visible meshes, acquired from the scene

2.1) The renderer sorts that list based on its own needs

2.2) For example, the transparent renderer sorts the list based on distance only, while the DeferredRenderer sorts based on material, etc.

3) Then the renderer sets global graphics states (like the GBuffer shader and its parameters) <-- this, for example, is why the system is not stateless (there's a nice article here on a stateless renderer)

4) The renderer iterates over the sorted list and inserts the graphics calls into a RenderQueue

5) The render queue is sent to the graphics API.


This could probably be done better, but I like this approach, and I will see whether it's viable or not.


The Actual Question


However, my main problem is with the actual meshes and vertex/index buffers. A long time ago I created a vertex buffer for each mesh (where a mesh means a collection of vertices (array of structs) and indices) and that was it. But static (and dynamic) batching sounds cool, and one buffer per mesh is not the best solution anyway.


For now I have:

A MeshVertex struct which contains every possible vertex attribute (position, normal, texcoord, etc.)

A Mesh class with the following members:

- list of vertices (MeshVertex)

- list of indices

- flags for each vertex attribute: an attribute can be marked as NotUsed or Used, and some of them (the normals and tangents) can be calculated


So my problems are:

- somehow I have to build vertex and index buffers <-- static and dynamic batching, but also handling multiple instances of the same mesh (without duplicating the vertex buffer) and removal of a mesh.

- the buffers depend on the actual vertex data and the usage flags (I'm using interleaved arrays for vertex buffers) <-- the actual data stored in GPU memory is filled from the mesh data based on the usage flags. A VertexDeclaration is also created (and cached) which describes the attributes (offset, size, type, etc.)

- however, the shader determines the required vertex attributes <-- I'm not sure what happens when the shader tries to read an attribute which is not currently bound and set properly.

- some renderers (like a ShadowMapRenderer) don't need any attribute except the position, and a 2D renderer doesn't necessarily need positions as 3D vectors. <-- but if I create only one buffer (batched or not) for the meshes, every renderer has to use the same buffer(s).


I know it's a bit of a long story, but I hope you can help me. :)

UE4-like shaders from materials (theory)

05 March 2015 - 01:09 AM



I've just checked out the new UE4 (Unreal Engine 4) and its material editor, and I'm wondering how it works.

Is a new shader compiled for each material? If yes, don't all the shader switches hurt performance? How can it be optimized?



According to this paper, the material is compiled into shader code.


This node graph specifies inputs (textures, constants), operations, and outputs, which are compiled into shader code

Scene Graph + Visibility Culling + Rendering

11 November 2014 - 04:49 AM

I wanted to improve my rendering with visibility culling, so I've redesigned the rendering system.
Sorry for the long post, the real question will be short. :)
A Mesh contains information about its parent container (called Drawable, which holds the other components, like the animation controller), its buffer (VBO) and its material, so everything necessary for rendering.
In the previous system, when I added a new Drawable to the scene, I sorted the meshes by shader and material and used that sorted order for rendering.
However, the problem comes with visibility culling. I wanted a simple uniform grid (I'm working on a top-down game) which makes the culling faster and easier: if a grid node is visible, add its meshes to the render queue. This means that I can't sort the meshes when I add them to the scene; I have to do it per frame.
So here's what I'm doing right now:
- clear render queue
- get visible nodes
- for all nodes
    - for all drawables in the node:
        - check if the meshes are already in the render queue
        - if not, add them to the proper part of the render queue
For the render queue, I'm using an array of std::set. The array is indexed by shader (more precisely, by shading type: diffuse, normal, etc.), and each set contains the meshes sorted by texture. It might be faster to use a vector and then sort it, I don't know.
Is there any better solution for the storage, or is the overhead of the clearing and inserting not too bad? How are you doing it?

Deferred Point Lights position error [SOLVED]

26 July 2014 - 04:21 PM



Another topic from me. :)


I've just discovered that the position reconstruction is not completely correct in my deferred renderer. Here are the steps I'm doing:

- render gbuffer -> albedo, normals, viewspace depth (32 bit float texture)

- for each point light

    - render a sphere geometry (which is a little bit larger than the radius of the light)

    - the cull mode is always CW, so I'm drawing the backfaces only

    - position reconstruction is done in the fragment shader by: eyePosition + eyeRay * depth

    - eyeRay is computed in the vertex shader using an eye correction (described here)


The problem comes when the camera intersects the sphere geometry. The reconstructed position inside the sphere is okay, but at the edge I get something weird. This means the attenuation calculation is wrong, and objects outside the sphere get lit.


Here is the shader:

// in the vertex shader

varying     vec3    EyeRay;

uniform     vec3    eyePosition;
uniform     float   farPlane;
uniform     vec3    cameraFront;

// worldPos is the world-space position of the light-sphere vertex
vec3 eyeRay = normalize(worldPos.xyz - eyePosition);
// scale the ray so its projection onto the view direction equals farPlane
float eyeCorrection = farPlane / dot(eyeRay, cameraFront);
EyeRay = eyeRay * eyeCorrection;

// in the fragment shader

// ScreenPos is the clip-space position passed from the vertex shader
vec2 screenPos = ScreenPos.xy / ScreenPos.w;
vec2 Texcoord = 0.5 * (screenPos + 1.0);

// linear view-space depth sampled from the G-buffer
float depthVal = texture2D(textDepth, Texcoord).r;

// reconstruct the world position along the interpolated eye ray
vec3 wPosition = eyePosition + EyeRay * depthVal;

gl_FragData[0] = vec4(wPosition, 1.0);

The result is attached. You can see the green area at the edge of the sphere. I don't know whether the stencil optimization would resolve this problem, but it should work without it.

[CSM] Cascaded Shadow Maps split selection

25 July 2014 - 11:14 AM



I'm working on CSM and I don't know which way I should choose. I'm using a geometry prepass which gives me a depth map from the camera's view, so I'm using a separate fullscreen pass to compute the shadows (which means I can't use the clip planes as a solution).



Option 1:

- create [numSplits] render targets, render each shadow map into the right buffer

- switch to shadow calculation pass

- bind every texture to the shader

- in the shader, use dynamic branching, like

if (dist < split1) { texture2D(shadowmap1, texcoord); ... }



Option 2:

- create only one render target and draw the shadow maps as a texture atlas (top-left is the first split, top-right is the second, etc.)

- switch to shadow calculation pass

- bind the single atlas texture

- in the shader, use dynamic branching to calculate the texcoords where the shadow map should be sampled.


And here come my problems with both ways. The target platform is OpenGL 2.0 (think SM2).



If I understand correctly, dynamic branching in a shader below SM3 is a "fake" solution: it computes every branch and selects afterwards. It won't be fast to compute shadows for every split and then make the decision later. Especially since I'm using PCF, and in SM2 the instruction count is limited. :)



With 4 splits and 1024x1024 shadow maps, the atlas texture would be 2048x2048, and maybe that's the best case... imagine 2048x2048 shadow maps, which would need a 4096x4096 texture.


However, the 2nd solution still looks more viable. But I'm not sure about texture arrays in OpenGL 2: are they available?