I need help understanding ARB_instanced_arrays.


Started by MarkS_ January 03, 2013 06:26 PM

9 comments, last by MarkS_ 11 years, 3 months ago

3,509

Author

January 03, 2013 06:26 PM

I'm looking at the tutorial here: http://sol.gfxile.net/instancing.html It seems to be the only and/or best tutorial dealing with ARB_instanced_arrays. However, not much code is presented. I've asked the author, but he seems reluctant to release his code, so I'm hoping I can get an explanation.


int pos = glGetAttribLocation(shader_instancedarrays.program, "transformmatrix");
int pos1 = pos + 0; // <- What is going on here..
int pos2 = pos + 1; // ... and here...
int pos3 = pos + 2; // ... and here...
int pos4 = pos + 3; // ... and here?
glEnableVertexAttribArray(pos1);
glEnableVertexAttribArray(pos2);
glEnableVertexAttribArray(pos3);
glEnableVertexAttribArray(pos4);
glBindBuffer(GL_ARRAY_BUFFER, VBO_containing_matrices); // <- The part I really do not understand
glVertexAttribPointer(pos1, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(0));
glVertexAttribPointer(pos2, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 4));
glVertexAttribPointer(pos3, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 8));
glVertexAttribPointer(pos4, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 12));
glVertexAttribDivisor(pos1, 1);
glVertexAttribDivisor(pos2, 1);
glVertexAttribDivisor(pos3, 1);
glVertexAttribDivisor(pos4, 1);

From what I can surmise, the code gets the location of the transformation matrix in the shader program. (Guessing here...) It then creates four positions, corresponding to four instances. Then a VBO is bound. I'm guessing that the VBO contains an array of vertices, each defined by four floats, grouped into four representing a 4x4 matrix, and each representing a position of the corresponding instance. So, the VBO in this example contains 16 vertices, with vertex 1-4 representing a matrix, 5-8 representing another matrix, etc.

Is my understanding correct? The tutorial really isn't a tutorial, but a description of the various methods to accomplish instancing, and thus, is missing important details.

If my understanding of the VBO is correct, it brings up another question. Can I assume that the matrices stored in the VBO are represented as:


|0 4  8 12| = vertex 1, 5, 9, etc.
|1 5  9 13| = vertex 2, 6, 10, etc.
|2 6 10 14| = vertex 3, 7, 11, etc.
|3 7 11 15| = vertex 4, 8, 12, etc.

I'm surprised that there is so little information available on instancing implementation.

MarkS_

3,509

Author

January 03, 2013 10:05 PM

From the lack of views, posts and Google results, I get the feeling this is a taboo topic... Like talking about God in the Lounge...

Yours3!f

1,534

January 04, 2013 12:08 AM


I'm pretty sure it's not a taboo topic. The issue might be that this extension is designed to deliver good rendering performance when rendering the same object many times (say, a million) with different transformation matrices. So it tends to be used only in large projects that need it (i.e. games, and there aren't many OpenGL games).

I guess people who needed it got it right over time, and nobody made a decent tutorial.

Despite doing advanced stuff (mostly post-processing), I still haven't used it myself. Plus, by the time you need this extension you usually already have a scene graph set up, etc.

But here's one of the extensions you may need:

http://www.opengl.org/registry/specs/EXT/draw_instanced.txt

a tutorial:

http://ogldev.atspace.co.uk/www/tutorial33/tutorial33.html

plus you have the docs and the specs. Read all these, and I think you'll get it right :)

Blog:

http://extremeistan.wordpress.com/

Stuff I wrote:

https://github.com/Yours3lf/libmymath

https://github.com/Yours3lf/linux_gl_fps

https://github.com/Yours3lf/instanced_font_rendering

http://youtu.be/k8PYkihyGXA

https://github.com/scrawl/smaa-opengl

https://github.com/Yours3lf/gl_browser_gui

Follow me on twitter:

https://twitter.com/0martint

MarkS_

3,509

Author

January 04, 2013 02:19 AM


The "taboo topic" comment was a joke. It seems to be a slow day around here and I was bored.

Anyway, I did find that tutorial you posted, but I forgot one key point: I want to stay away from GL 4.x for the time being. I'm working on something that I want to be able to port to GL ES and the Mac, and I've heard that Apple, for one, doesn't yet support 4.x, and I'm not too sure about ES. The tutorial I linked to is based on 3.3. There seem to be significant differences between the two tutorials.

My main question is about how the matrices are encoded. I'm not clear on how that is done.

21st Century Moose

13,459

January 04, 2013 10:52 AM

First up, note that if you want to also support ES, you should be aware that instancing is only available in ES3, which may rule everything else out for you.

Now, presumably the original author of this specified the vertex attrib as a 4x4 matrix (mat4) in their shader code. glVertexAttribPointer doesn't let you set up a 4x4 matrix with one call; you must use 4 calls instead, one for each vec4 slice of the matrix (strictly these are the columns of the GLSL mat4, though they are often loosely called rows). Knowing that the matrix takes 4 consecutive attrib slots, we get the location of the attrib, which is also the location of the first slot, and the other 3 slots follow on from it; hence pos + 1, + 2, + 3.

The stride param of the glVertexAttribPointer calls specifies 16 floats, so the VBO contains 16 floats per-instance (with the "1" in glVertexAttribDivisor specifying that each instance drawn advances the VBO by 1 - i.e. 16 floats). So this setup is for drawing an arbitrary number of instances (the exact number drawn will depend on the params to the glDraw*Instanced call) and specifying the transformation matrix to be applied to each instance, which is stored in the VBO.
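To make the slot arithmetic concrete, here is a minimal C sketch that computes the four attribute descriptions without touching OpenGL. The `AttribSlot` struct and the `mat4_attrib_slots` function are hypothetical names invented for illustration; the values simply mirror the arguments passed to glVertexAttribPointer and glVertexAttribDivisor in the snippet under discussion, assuming tightly packed matrices of 16 floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: one entry per vec4 slot of a mat4 attribute. */
typedef struct {
    int      location;   /* attribute location: base + 0..3               */
    int      components; /* floats per slot (one vec4)                    */
    size_t   stride;     /* bytes from one instance's matrix to the next  */
    size_t   offset;     /* byte offset of this slot inside one matrix    */
    unsigned divisor;    /* 1 = advance once per instance                 */
} AttribSlot;

/* Fill out[0..3] with the values the four glVertexAttribPointer /
   glVertexAttribDivisor calls would receive for a tightly packed mat4. */
static void mat4_attrib_slots(int base_location, AttribSlot out[4])
{
    assert(base_location >= 0); /* glGetAttribLocation returns -1 on failure */
    for (int c = 0; c < 4; ++c) {
        out[c].location   = base_location + c;
        out[c].components = 4;
        out[c].stride     = sizeof(float) * 16;
        out[c].offset     = sizeof(float) * 4 * (size_t)c;
        out[c].divisor    = 1;
    }
}
```

With a GL context available, each entry `s` would presumably be fed to glEnableVertexAttribArray(s.location), glVertexAttribPointer(s.location, s.components, GL_FLOAT, GL_FALSE, (GLsizei)s.stride, (void*)s.offset) and glVertexAttribDivisor(s.location, s.divisor), which is the same sequence as the original twelve lines written as a loop.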

Instancing, by the way, is also great for particle systems - faster than geometry shaders while still letting you send only one vert per particle.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

MarkS_

3,509

Author

January 04, 2013 06:15 PM

So, from what you said, that example only covers one instance?

MarkS_

3,509

Author

January 04, 2013 07:50 PM

OK, I still cannot wrap my head around this.

int pos = glGetAttribLocation(shader_instancedarrays.program, "transformmatrix");
int pos1 = pos + 0; // <- According to mhagain, this gets the first row of the matrix specified in "transformmatrix".
int pos2 = pos + 1; // ... second row...
int pos3 = pos + 2; // ... third row...
int pos4 = pos + 3; // ... fourth row
glEnableVertexAttribArray(pos1);
glEnableVertexAttribArray(pos2);
glEnableVertexAttribArray(pos3);
glEnableVertexAttribArray(pos4);
// I get the feeling that this line is the key. It is all important and it has the least amount of information available.
glBindBuffer(GL_ARRAY_BUFFER, VBO_containing_matrices); // <- Again, no idea how this is set up...
// Now I am really confused. If I understand what mhagain wrote, each of these specify one row of a matrix.
// BUT...
// Each is setting the stride to be 16 floats! Is this a typo? What is happening here?
glVertexAttribPointer(pos1, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(0));
glVertexAttribPointer(pos2, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 4));
glVertexAttribPointer(pos3, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 8));
glVertexAttribPointer(pos4, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 12));
// This part I get...
glVertexAttribDivisor(pos1, 1);
glVertexAttribDivisor(pos2, 1);
glVertexAttribDivisor(pos3, 1);
glVertexAttribDivisor(pos4, 1);

I am thinking that the VBO does contain the list of matrices for the number of instances and that the rest of the code is merely setting up how to access that. What I need is the missing code. How is that VBO set up? Do the four glVertexAttribPointer calls tell GL how to access every matrix in the VBO, or do they only specify how to access the first, and thus need to be repeated and adjusted for each instance? I get that the group of four specifies each row in the matrix, but the stride is throwing me off, as is whether these four need to be repeated for each instance. The stride tells me no, but the pointer tells me yes.

So confusing!

beans222

1,248

January 04, 2013 09:12 PM

The VBO is just a normal VBO. It just contains matrices. The glVertexAttribPointer calls set up the attribute for the shader. More than one is needed because GLSL exposes a mat4 attribute as 4 vec4s at consecutive locations; the stride is a full 4x4 matrix (16 floats), so each slot steps from one matrix to the next. glVertexAttribDivisor with a divisor of 1 tells GL that the attribute advances once per instance drawn, rather than once per vertex (a divisor of 0).
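The addressing rule described here can be sketched on the CPU. This is only an illustrative model, not driver code: `attrib_element_index` and `fetch_matrix_column` are hypothetical names, the buffer is assumed to hold tightly packed 16-float matrices, and any base-instance offset is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Which array element an attribute reads.
   divisor == 0: the element index is the vertex index.
   divisor  > 0: the element index is instance / divisor. */
static size_t attrib_element_index(unsigned divisor, size_t vertex, size_t instance)
{
    return divisor == 0 ? vertex : instance / divisor;
}

/* Address of vec4 slot `column` of the matrix seen by a given instance. */
static const float *fetch_matrix_column(const float *buffer, int column,
                                        unsigned divisor,
                                        size_t vertex, size_t instance)
{
    assert(column >= 0 && column < 4);
    const size_t stride_floats = 16;                 /* the stride argument, in floats  */
    const size_t offset_floats = 4 * (size_t)column; /* the pointer argument, in floats */
    size_t element = attrib_element_index(divisor, vertex, instance);
    return buffer + element * stride_floats + offset_floats;
}
```

With a divisor of 1 the vertex index drops out entirely, so every vertex of instance 2 reads the third matrix in the buffer. That is why the four calls are made once and not repeated per instance.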

After setting all that up, and assuming your shader uses the attribute for the world transform instead of a uniform, you just call glDrawElementsInstanced. Its last parameter is the number of instances, which should match (or at least not exceed) the number of matrices in your VBO.
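As a rough picture of what the instanced draw ends up computing, the following C sketch applies one matrix per instance to a shared list of vertices on the CPU. It is an emulation under stated assumptions, not what the driver does: `mat4_mul_vec4` and `emulate_instanced_draw` are hypothetical names, matrices are assumed column-major, and a real vertex shader would usually also apply a view/projection matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply a column-major 4x4 matrix by a vec4: out = m * v. */
static void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
    for (int row = 0; row < 4; ++row) {
        out[row] = m[0 + row] * v[0] + m[4 + row] * v[1]
                 + m[8 + row] * v[2] + m[12 + row] * v[3];
    }
}

/* CPU stand-in for an instanced draw: every instance re-uses the same
   vertices but reads its own matrix. `out` must have room for
   instance_count * vertex_count * 4 floats. */
static void emulate_instanced_draw(const float *vertices, size_t vertex_count,
                                   const float *matrices, size_t instance_count,
                                   float *out)
{
    assert(vertices && matrices && out);
    for (size_t i = 0; i < instance_count; ++i)
        for (size_t v = 0; v < vertex_count; ++v)
            mat4_mul_vec4(matrices + 16 * i, vertices + 4 * v,
                          out + 4 * (i * vertex_count + v));
}
```

The instance count passed to the emulation plays the same role as the last argument of glDrawElementsInstanced: it decides how many matrices are consumed from the buffer.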

New C/C++ Build Tool 'Stir' (doesn't just generate Makefiles, it does the build): https://github.com/space222/stir

Yours3!f

1,534

January 04, 2013 11:07 PM


GLuint vbo; //vertex buffer object containing the matrices
glGenBuffers( 1, &vbo ); //create the vbo

//bind it; the glVertexAttribPointer calls below refer to whichever buffer is bound to GL_ARRAY_BUFFER at the time
glBindBuffer(GL_ARRAY_BUFFER, vbo);

mat4 matrices[4]; //one matrix per instance (mat4 is assumed to be 16 tightly packed floats)

/*
fill matrices here...
*/

//fill the vbo with data
glBufferData( GL_ARRAY_BUFFER, sizeof( float ) * 16 * 4, &matrices[0][0], GL_STATIC_DRAW );

//now that the matrices are on the GPU, lets go ahead and use them

//get the location of the vertex attribute that will read from the vbo
int pos = glGetAttribLocation(shader_instancedarrays.program, "transformmatrix");
int pos1 = pos + 0; //first vec4 slot of the mat4 attribute
int pos2 = pos + 1; //second slot
int pos3 = pos + 2; //third slot
int pos4 = pos + 3; //fourth slot
glEnableVertexAttribArray(pos1);
glEnableVertexAttribArray(pos2);
glEnableVertexAttribArray(pos3);
glEnableVertexAttribArray(pos4);

//the stride (16 floats) is the distance from one matrix to the next, so it is the same for all four slots
//the offset selects the vec4 within each matrix:
//ie. pos1 --> floats 0-3 of every matrix
//pos2 --> floats 4-7 of every matrix etc...
glVertexAttribPointer(pos1, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(0));
glVertexAttribPointer(pos2, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 4));
glVertexAttribPointer(pos3, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 8));
glVertexAttribPointer(pos4, 4, GL_FLOAT, GL_FALSE, sizeof(GLfloat) * 4 * 4, (void*)(sizeof(float) * 12));

//a divisor of 1 makes each slot advance once per instance instead of once per vertex
glVertexAttribDivisor(pos1, 1);
glVertexAttribDivisor(pos2, 1);
glVertexAttribDivisor(pos3, 1);
glVertexAttribDivisor(pos4, 1);

I think this should work... haven't tested it though.

EDIT:
stride
Specifies the byte offset between consecutive generic vertex attributes. If stride is 0, the generic vertex attributes are understood to be tightly packed in the array. The initial value is 0.
i.e. this tells the API how far apart consecutive elements are; here that is the size of one matrix (16 floats), which is why all four calls use the same stride.

pointer
Specifies an offset of the first component of the first generic vertex attribute in the array in the data store of the buffer currently bound to the GL_ARRAY_BUFFER target. The initial value is 0.
i.e. this tells the API where the attribute's data starts inside the VBO; here that is the start of each vec4 within the first matrix (0, 4, 8 and 12 floats in).
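The "fill matrices here" step is left open in the code above. One possible way to fill the array, shown only as a sketch: the helper names `mat4_translation` and `fill_instance_matrices` are invented for illustration, each matrix is assumed to be 16 tightly packed floats in the usual column-major order, and the instances are simply spaced along X.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a column-major translation matrix into m[0..15].
   Indices 12, 13 and 14 hold the translation, which corresponds to the
   |0 4 8 12| index layout drawn in the first post. */
static void mat4_translation(float m[16], float x, float y, float z)
{
    memset(m, 0, sizeof(float) * 16);
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    m[12] = x;
    m[13] = y;
    m[14] = z;
}

/* Fill `count` tightly packed matrices, spacing the instances along X.
   Returns the size in bytes, i.e. the value to pass to glBufferData. */
static size_t fill_instance_matrices(float *dst, size_t count, float spacing)
{
    assert(dst != NULL);
    for (size_t i = 0; i < count; ++i)
        mat4_translation(dst + 16 * i, spacing * (float)i, 0.0f, 0.0f);
    return count * 16 * sizeof(float);
}
```

For four instances the returned size equals the sizeof( float ) * 16 * 4 used in the glBufferData call above, and matrix i starts 16 * i floats into the buffer, which is exactly what the 16-float stride steps over.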


Yours3!f