# OpenGL Understanding glDrawElementsInstanced


I am learning how to use glDrawElementsInstanced() based on this tutorial:

The geometry renders without any problems:

But I am struggling to understand how this function actually works.

Questions:

1). When calling glVertexAttribPointer(), the data from the transformation matrices is sent down 4 floats at a time (because each attribute can contain no more than 4 components), meaning that the transformation matrix data is spread across 4 attributes (2,3,4,5 - 1 column of the matrix per attribute). The stride is set to sizeof(glm::mat4), which is 64 bytes. But given that, in this case, 4 floats make up 1 attribute, shouldn't the stride for attributes 2,3,4,5 be set to (sizeof(float) * 4)?

2). Following on from the previous question; in the vertex shader, the attribute transform_matrix is said to be at location 2, rather than spread across 2,3,4,5. That being the case:

a). How does GLSL interpret this? The attribute is received in the shader as a mat4 at location 2, but as previously mentioned, the attributes are actually spread across the locations 2,3,4,5 as floats so how is it that GLSL was able to condense these 4 floats across 4 attributes to just 1 mat4 across 1 attribute?

b). How does the shader know when to switch from the first transformation matrix (index 0 of transform_matrix array) to the second transformation matrix (index 1 of transform_matrix array) in order to render the two geometries in different places (as shown in the image above)? I assume this has something to do with the call to glDrawElementsInstanced(), but I'm not sure how exactly this would work in the shader. Maybe someone can provide some detail on how this process works?

See the relevant source code below:

// Main.cpp:

std::array<glm::mat4, 2> transform_matrix =
{
    MVP.projection.matrix * glm::translate(glm::mat4(1.0F), glm::vec3(-0.8F, 0.0F, -3.0F)) // 0
                          * glm::rotate(glm::mat4(1.0F), glm::radians(24.0F), glm::vec3(1.0F, 0.0F, 0.0F)),

    MVP.projection.matrix * glm::translate(glm::mat4(1.0F), glm::vec3(2.2F, 0.0F, -3.8F))  // 1
                          * glm::rotate(glm::mat4(1.0F), glm::radians(115.0F), glm::vec3(0.0F, 1.0F, 0.0F))
};

unsigned int transform_vbo;
glGenBuffers(1, &transform_vbo);
glBindBuffer(GL_ARRAY_BUFFER, transform_vbo);
glNamedBufferData(transform_vbo, sizeof(transform_matrix), &transform_matrix, GL_STATIC_DRAW);

// positions (0) and colour (1) attributes already sent to OpenGL:
for (int i = 2; i < 6; i++)
{
    glEnableVertexAttribArray(i);

    glVertexAttribPointer(i, 4, GL_FLOAT, GL_FALSE, sizeof(glm::mat4),
                          (const void*)(sizeof(float) * ((i - 2) * 4)));

    glVertexAttribDivisor(i, 1);
}

Cube_VAO.Bind();

while (!glfw.WindowShouldClose())
{
    glfw.ResizeWindow(MVP);
    glDrawElementsInstanced(GL_TRIANGLES, CubeIndices.size(), GL_UNSIGNED_SHORT, 0, 2);
    glfw.SwapBuffers();
}

// Vertex shader:

#version 330 core

layout(location = 0) in vec4 position;
layout(location = 1) in vec4 colours;
layout(location = 2) in mat4 transform_matrix;

out vec4 colour_transfer;

void main()
{
    gl_Position = transform_matrix * position;
    colour_transfer = colours;
}

// Fragment shader:

#version 330 core

in vec4 colour_transfer;
out vec4 colour;

void main()
{
    colour = colour_transfer;
}


Edited by calioranged


This is related to how the explicit uniform locations work as described here:

The key point here is that, in general, each location is a slot of 16 bytes, i.e. room for one vec4. If you use smaller types then they are essentially padded. (This is not an issue for plain uniforms, but as soon as you start working with constant (uniform) buffers or shader storage buffers it becomes a thing. For more information see std140 vs std430, for example here: https://www.khronos.org/opengl/wiki/Interface_Block_(GLSL) or read it in the Khronos specs: https://www.khronos.org/registry/OpenGL/specs/gl/.)

This implies that your mat4 vertex attribute actually takes up 4 slots of vec4, one per column. The compiler marks location = 2 as type mat4 and also marks locations 3, 4 and 5 as used, so you can't assign them to another attribute (you'll get an error, much like the rule described in the Uniform article under "This also means that explicit uniform location ranges cannot overlap."). A side note here is that OpenGL checks that the glUniform function you call matches the data type declared in the shader. E.g. glUniform4fv cannot be called on a shader variable of type vec2. Note that in other APIs, e.g. DX9-DX12 or Vulkan, this is not so much an issue.
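To picture how the four locations end up as one mat4 in the shader, here is a minimal CPU-side sketch in plain C++. It is not what the driver literally does; `Vec4`, `Mat4`, `fetchAttribute` and `assembleMat4` are hypothetical names made up for illustration, and the offsets and stride are assumed to match the glVertexAttribPointer calls from the question (offset 0, 16, 32, 48 and a stride of 64 bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the GLSL types; not part of OpenGL or glm.
struct Vec4 { float v[4]; };
struct Mat4 { Vec4 column[4]; };

// Reads the vec4 that one attribute location would receive for a given instance.
// 'offset' and 'stride' mirror the last two arguments of glVertexAttribPointer.
Vec4 fetchAttribute(const unsigned char* buffer, std::size_t offset,
                    std::size_t stride, std::size_t instance)
{
    assert(buffer != nullptr);
    Vec4 out;
    std::memcpy(&out, buffer + offset + instance * stride, sizeof(Vec4));
    return out;
}

// Models `layout(location = 2) in mat4`: locations 2, 3, 4 and 5 each
// deliver one column, and together they form the matrix the shader sees.
Mat4 assembleMat4(const unsigned char* buffer, std::size_t instance)
{
    Mat4 m;
    for (std::size_t column = 0; column < 4; ++column) {
        const std::size_t offset = column * sizeof(Vec4); // 0, 16, 32, 48
        m.column[column] = fetchAttribute(buffer, offset, sizeof(Mat4), instance);
    }
    return m;
}
```

Calling `assembleMat4(bytes, 1)` on a buffer holding two tightly packed matrices returns the second matrix, with each column coming from its own "location".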

A mat4 is essentially vec4[4] or float[4][4], so its size is 4 * 4 * sizeof(float), which is 64 bytes. The stride of the 4 attributes you pass in still needs to be these 64 bytes, because if you had laid out 2 matrices in memory, let's call them A and B, then they would be stored as:

| Offset (bytes) | Element |
|---|---|
| 0 | A[0] |
| 16 | A[1] |
| 32 | A[2] |
| 48 | A[3] |
| 64 | B[0] |
| 80 | B[1] |
| 96 | B[2] |
| 112 | B[3] |

So the distance glVertexAttribPointer has to step from A[0] to B[0] (the stride) is still 64 bytes, not 16 bytes, as a stride of 16 would make the second instance read A[1] instead of B[0].
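The arithmetic behind this can be checked with a small sketch. `fetchByte` is a hypothetical helper, not an OpenGL call; it just evaluates "pointer offset + instance * stride" for one column, and it assumes sizeof(float) is 4 and that the two matrices are tightly packed as in the layout above.

```cpp
#include <cassert>
#include <cstddef>

// Assumes 4-byte floats, as on typical desktop platforms.
constexpr std::size_t kVec4Bytes = 4 * sizeof(float); // 16
constexpr std::size_t kMat4Bytes = 4 * kVec4Bytes;    // 64

// Byte position read for one column of one instance:
// (column offset inside the matrix) + instance * stride.
constexpr std::size_t fetchByte(std::size_t column, std::size_t instance,
                                std::size_t stride)
{
    return column * kVec4Bytes + instance * stride;
}

// Stride of 64: instance 1, column 0 lands on byte 64, which is B[0].
static_assert(fetchByte(0, 1, kMat4Bytes) == 64, "expected B[0]");
// Stride of 16: instance 1, column 0 lands on byte 16, which is A[1].
static_assert(fetchByte(0, 1, kVec4Bytes) == 16, "expected A[1]");
```

With the 64-byte stride, every column of instance 1 lands in B (bytes 64, 80, 96, 112), which is exactly the second half of the layout.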

Quote

How does the shader know when to switch from the first transformation matrix (index 0 of transform_matrix array) to the second transformation matrix (index 1 of transform_matrix array) in order to render the two geometries in different places (as shown in the image above)?

This is controlled by glVertexAttribDivisor, but is better illustrated with an example. Let's say that you draw a single quad (2 triangles) with indices 0, 1, 2, 0, 2, 3. With glDrawElementsInstanced(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, indices, 2), what basically happens is that the shader uses an instance index that increments once for every 'full set of indices'. So:

Instance Index = 0, Vertex indices 0, 1, 2 - 0, 2, 3

Instance Index = 1, Vertex indices 0, 1, 2 - 0, 2, 3

It basically functions like the pseudo code outlined in: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glDrawElementsInstanced.xhtml

glDrawElementsInstanced has the same effect as:

if (mode, count, or type is invalid)
    generate appropriate error
else {
    for (int i = 0; i < instancecount; i++) {
        instanceID = i;
        glDrawElements(mode, count, type, indices);
    }
    instanceID = 0;
}

Here instanceID is the value used to fetch the per-instance data (the shader can also read it as gl_InstanceID). Since you only set glVertexAttribDivisor for the streams that hold instance data, the position and colours streams simply repeat for every instance.
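The pseudocode can be turned into a small CPU-side model that lists which (instance, vertex index) pair each vertex shader invocation would see. This is only a sketch of the behaviour described on the reference page; `Invocation` and `simulateDrawElementsInstanced` are hypothetical names, and nothing here talks to OpenGL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One vertex shader invocation: which instance and which vertex it sees.
struct Invocation {
    std::size_t instanceID;
    std::uint16_t vertexIndex;
};

// CPU-side model of glDrawElementsInstanced: the whole index list is
// replayed once per instance, and only the instance counter changes.
std::vector<Invocation> simulateDrawElementsInstanced(
    const std::vector<std::uint16_t>& indices, std::size_t instanceCount)
{
    std::vector<Invocation> invocations;
    invocations.reserve(indices.size() * instanceCount);
    for (std::size_t instance = 0; instance < instanceCount; ++instance) {
        for (std::uint16_t index : indices) {
            invocations.push_back({instance, index});
        }
    }
    return invocations;
}
```

For the quad indices 0, 1, 2, 0, 2, 3 and an instance count of 2, this yields 12 invocations: the six indices with instance 0, followed by the same six indices with instance 1.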

Quote

glVertexAttribDivisor modifies the rate at which generic vertex attributes advance when rendering multiple instances of primitives in a single draw call. If divisor is zero, the attribute at slot index advances once per vertex. If divisor is non-zero, the attribute advances once per divisor instances of the set(s) of vertices being rendered.

Let's say that you only had 1 triangle, drawn as 2 instances, with just position and matrix. Then:

Instance 0
vertex 0.position, 0.matrix
vertex 1.position, 0.matrix
vertex 2.position, 0.matrix

Instance 1
vertex 0.position, 1.matrix
vertex 1.position, 1.matrix
vertex 2.position, 1.matrix

This is because for position you don't explicitly set glVertexAttribDivisor, so it keeps the default divisor of 0 and advances with every vertex index. For the matrix stream glVertexAttribDivisor is set to 1, which means its index only advances once per instance, up to the instance count you passed in to glDrawElementsInstanced.
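Put as a formula, the element an attribute reads depends only on its divisor. The sketch below is a simplified model (it assumes a base instance of 0, and `attributeElement` is a hypothetical helper, not an OpenGL function):

```cpp
#include <cassert>
#include <cstddef>

// Which element of an attribute's array a shader invocation reads.
// divisor == 0: per-vertex attribute, selected by the vertex index.
// divisor  > 0: per-instance attribute, advances once every `divisor` instances.
constexpr std::size_t attributeElement(std::size_t vertexIndex,
                                       std::size_t instanceID,
                                       std::size_t divisor)
{
    return divisor == 0 ? vertexIndex : instanceID / divisor;
}

// Position (divisor 0): vertex 2 of instance 1 still reads position 2.
static_assert(attributeElement(2, 1, 0) == 2, "position follows the vertex index");
// Matrix (divisor 1): every vertex of instance 1 reads matrix 1.
static_assert(attributeElement(2, 1, 1) == 1, "matrix follows the instance");
// Divisor 2: instances 2 and 3 would share element 1.
static_assert(attributeElement(2, 3, 2) == 1, "one element per two instances");
```

This reproduces the listing above: within an instance the position element changes per vertex while the matrix element stays fixed.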

I hope this helps.



Thanks a lot for that clarification!

Edited by calioranged
