Understanding glDrawElementsInstanced

Started by
2 comments, last by calioranged 4 years, 11 months ago

I am learning how to use glDrawElementsInstanced() based on this tutorial:

The geometry renders without any problems:

[Image: Geom.png]

But I am struggling to understand how this function actually works. 

Questions:

1). When calling glVertexAttribPointer(), the transformation matrix data is sent down 4 floats at a time (because each attribute can contain no more than 4 components), meaning that the matrix data is spread across 4 attributes (2,3,4,5, with 1 column of the matrix per attribute). The stride is set to sizeof(glm::mat4), which is 64 bytes. But given that in this case 4 floats make up 1 attribute, shouldn't the stride for attributes 2,3,4,5 be set to (sizeof(float) * 4)?

2). Following on from the previous question; in the vertex shader, the attribute transform_matrix is said to be at location 2, rather than spread across 2,3,4,5. That being the case:

a). How does GLSL interpret this? The attribute is received in the shader as a mat4 at location 2, but as previously mentioned, the attributes are actually spread across the locations 2,3,4,5 as floats so how is it that GLSL was able to condense these 4 floats across 4 attributes to just 1 mat4 across 1 attribute?

b). How does the shader know when to switch from the first transformation matrix (index 0 of transform_matrix array) to the second transformation matrix (index 1 of transform_matrix array) in order to render the two geometries in different places (as shown in the image above)? I assume this has something to do with the call to glDrawElementsInstanced(), but I'm not sure exactly how this would work in the shader. Maybe someone can provide some detail on how this process works?

See the relevant source code below: 


// Main.cpp:

std::array<glm::mat4, 2> transform_matrix =
{
  MVP.projection.matrix * glm::translate(glm::mat4(1.0F), glm::vec3(-0.8F, 0.0F, -3.0F)) // 0
    * glm::rotate(glm::mat4(1.0F), glm::radians(24.0F), glm::vec3(1.0F, 0.0F, 0.0F)),

  MVP.projection.matrix * glm::translate(glm::mat4(1.0F), glm::vec3(2.2F, 0.0F, -3.8F))  // 1
    * glm::rotate(glm::mat4(1.0F), glm::radians(115.0F), glm::vec3(0.0F, 1.0F, 0.0F))
};

unsigned int transform_vbo;
glGenBuffers(1, &transform_vbo);
glBindBuffer(GL_ARRAY_BUFFER,transform_vbo);
glNamedBufferData(transform_vbo, sizeof(transform_matrix), &transform_matrix, GL_STATIC_DRAW);

//  positions (0) and colour (1) attributes already sent to OpenGL:
for (int i = 2; i < 6; i++)
{
  glEnableVertexAttribArray(i);
  
  glVertexAttribPointer(i, 4, GL_FLOAT, GL_FALSE, sizeof(glm::mat4), 
  (const void*)(sizeof(float) * ((i - 2)*4)));
  
  glVertexAttribDivisor(i, 1);
}

Cube_VAO.Bind();

while (!glfw.WindowShouldClose())  
{
  shader.ClearBuffers(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
  glfw.ResizeWindow(MVP);		
  glDrawElementsInstanced(GL_TRIANGLES, CubeIndices.size(), GL_UNSIGNED_SHORT, 0, 2);
  glfw.SwapBuffers(); 
}

// Shader.glsl:

#vertex shader
#version 330 core

layout(location = 0) in vec4 position;
layout(location = 1) in vec4 colours;
layout(location = 2) in mat4 transform_matrix;

out vec4 colour_transfer;

void main()
{
	gl_Position = transform_matrix * position;
	colour_transfer = colours;
}

#fragment shader
#version 330 core

in vec4 colour_transfer;
out vec4 colour;

void main()
{
	colour = colour_transfer;
}


This is related to how explicit locations work. The wiki describes them for uniforms here, and vertex shader inputs such as your transform_matrix follow a similar slot model:

https://www.khronos.org/opengl/wiki/Uniform_(GLSL)

The key point here is that in general a location slot is 16 bytes wide, i.e. one vec4. If you use smaller types then they are essentially padded. (This is not an issue for plain uniforms or attributes, but as soon as you start working with constant buffers or shader storage buffers it becomes a thing. For more information see std140 vs std430, for example here: https://www.khronos.org/opengl/wiki/Interface_Block_(GLSL) or read it from the Khronos specs https://www.khronos.org/registry/OpenGL/specs/gl/.)

This implies that your mat4 actually takes up 4 slots of vec4, one per column. The compiler just marks location = 2 as type mat4 and marks locations 3, 4 and 5 as used so you can't use them again (you'll get an error, much as the Uniform article describes under "This also means that explicit uniform location ranges cannot overlap."). A side note here is that in OpenGL the shader reflection checks that you call the proper glUniform function for the data type in the shader. E.g. glUniform4fv cannot be called on a shader variable of type vec2. Note that in other APIs, e.g. DX9-DX12 or Vulkan, this is not so much an issue.
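To illustrate how four vec4 slots end up as one mat4, here is a minimal CPU-side sketch. It is not OpenGL code and not what the driver literally does; Vec4, Mat4 and assemble_matrix are hypothetical names made up for illustration, and the buffer is assumed to hold tightly packed column-major matrices like the std::array<glm::mat4, 2> in the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

using Vec4 = std::array<float, 4>;  // one attribute slot: 4 floats
using Mat4 = std::array<Vec4, 4>;   // 4 columns, column-major like glm::mat4

static_assert(sizeof(Vec4) == 16 && sizeof(Mat4) == 64, "sketch assumes packed floats");

// Hypothetical helper: gathers the four vec4 attributes (locations 2..5)
// for one instance from the raw buffer bytes and presents them as one matrix.
Mat4 assemble_matrix(const unsigned char* buffer, std::size_t instance)
{
    assert(buffer != nullptr);
    Mat4 matrix{};
    for (std::size_t column = 0; column < 4; ++column) {
        // Column `column` is what the attribute at location 2 + column reads.
        const std::size_t offset = instance * sizeof(Mat4) + column * sizeof(Vec4);
        std::memcpy(matrix[column].data(), buffer + offset, sizeof(Vec4));
    }
    return matrix;
}
```

So from the shader's point of view, location 2 is simply the first column of the matrix, and locations 3, 4 and 5 supply the remaining columns.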

A mat4 is essentially vec4[4] or float[4][4], so its size is 4 * 4 * sizeof(float), which is 64 bytes. The stride of the 4 attributes you pass in still needs to be these 64 bytes, because if you had laid out 2 matrices in memory, let's call them A and B, they would be stored as:


Offset	Element
0	A[0]
16 	A[1]
32	A[2]
48	A[3]
64	B[0]
80	B[1]
96	B[2]
112	B[3]

And the stride you pass to glVertexAttribPointer, i.e. the distance from A[0] to B[0], is still 64 bytes, not 16 bytes, as that would make the second instance read A[1] as opposed to B[0].
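The arithmetic can be sketched on the CPU side as follows. This is only a model of the address calculation, assuming a divisor of 1 and the per-column pointer offsets from the question; fetch_offset and the two constants are hypothetical names, not OpenGL functions.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kColumnBytes = 4 * sizeof(float);  // one vec4 column
constexpr std::size_t kMatrixBytes = 4 * kColumnBytes;   // one whole mat4

// Hypothetical helper: byte offset read for `column` of the matrix that
// belongs to `instance`, given the stride passed to glVertexAttribPointer.
std::size_t fetch_offset(std::size_t instance, std::size_t column, std::size_t stride)
{
    assert(column < 4);
    // Matches the "pointer" argument in the question: sizeof(float) * ((i - 2) * 4)
    const std::size_t pointer_offset = column * kColumnBytes;
    return pointer_offset + instance * stride;
}
```

With a stride of kMatrixBytes, column 0 of instance 1 is read at byte 64, which is B[0]. With a stride of kColumnBytes it would be read at byte 16, which is A[1].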

Quote

 

How does the shader know when to switch from the first transformation matrix (index 0 of transform_matrix array) to the second transformation matrix (index 1 of transform_matrix array) in order to render the two geometries in different places (as shown in the image above)?


 

This is controlled by glVertexAttribDivisor, but it is better illustrated with an example. Let's say that you draw a single quad (2 triangles) with indices 0, 1, 2, 0, 2, 3. With glDrawElementsInstanced(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, indices, 2), what basically happens is that the shader uses an instance index that increments once for every 'full set of indices'. So:

Instance Index = 0, Vertex indices 0, 1, 2 - 0, 2, 3

Instance Index = 1, Vertex indices 0, 1, 2 - 0, 2, 3

It basically functions like the pseudo code outlined in: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glDrawElementsInstanced.xhtml


glDrawElementsInstanced has the same effect as:

    if (mode, count, or type is invalid )
        generate appropriate error
    else {
        for (int i = 0; i < instancecount ; i++) {
            instanceID = i;
            glDrawElements(mode, count, type, indices);
        }
        instanceID = 0;
    }

Here instanceID is what the shader sees as gl_InstanceID. But since you only set glVertexAttribDivisor for the streams that have instance data, the position and colours streams will just repeat for each instance.
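As a rough CPU-side model of that loop (a sketch only; the real work happens on the GPU, which may also reuse results for repeated indices), the following lists the (instance ID, vertex index) pairs the vertex shader is conceptually invoked with. expand_instanced_draw is a hypothetical name made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of one instanced indexed draw: every instance
// walks the complete index list again with a new instance ID.
std::vector<std::pair<int, std::uint16_t>>
expand_instanced_draw(const std::vector<std::uint16_t>& indices, int instance_count)
{
    assert(instance_count >= 0);
    std::vector<std::pair<int, std::uint16_t>> invocations;
    for (int instance = 0; instance < instance_count; ++instance) {
        for (std::uint16_t index : indices) {
            invocations.emplace_back(instance, index);
        }
    }
    return invocations;
}
```

For the quad indices 0, 1, 2, 0, 2, 3 and an instance count of 2, this yields 12 pairs: the six indices paired with instance 0, followed by the same six indices paired with instance 1.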

From: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glVertexAttribDivisor.xhtml
 

Quote

 

glVertexAttribDivisor modifies the rate at which generic vertex attributes advance when rendering multiple instances of primitives in a single draw call. If divisor is zero, the attribute at slot index advances once per vertex. If divisor is non-zero, the attribute advances once per divisor instances of the set(s) of vertices being rendered.


 

Let's say that you only had 1 triangle, drawn as 2 instances with just position and matrix. Then:


Instance 0
vertex 0.position, 0.matrix
vertex 1.position, 0.matrix
vertex 2.position, 0.matrix

Instance 1
vertex 0.position, 1.matrix
vertex 1.position, 1.matrix
vertex 2.position, 1.matrix

This is because for position you don't explicitly set glVertexAttribDivisor (so the divisor stays at its default of 0 and the attribute advances for every index), but glVertexAttribDivisor is set to 1 for the matrix stream, which means it only advances once per instance, up to the instance count you passed in to glDrawElementsInstanced.
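The divisor rule quoted above can be written down as a small sketch. This is a simplified model that ignores base vertex and base instance offsets, and fetched_element is a hypothetical name, not part of OpenGL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: which element of an attribute's array is read
// for a given vertex index and instance ID.
std::size_t fetched_element(unsigned divisor, std::size_t vertex_index, std::size_t instance_id)
{
    if (divisor == 0) {
        return vertex_index;        // per-vertex data: follows the index buffer
    }
    return instance_id / divisor;   // per-instance data: advances every `divisor` instances
}
```

For vertex 2 of instance 1, the position stream (divisor 0) reads element 2 while the matrix stream (divisor 1) reads element 1, which matches the listing above.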

I hope this helps.

 

45 minutes ago, deadc0deh said:

I hope this helps.

Thanks a lot for that clarification!
