Member Since 02 May 2013
Offline Last Active Yesterday, 11:32 AM

#5253424 Bad performance when rendering medium amount of meshes

Posted by Silverlan on 22 September 2015 - 05:27 AM

Well, I've run into another impasse.

I've decided to add the indices to the same buffer as the vertex data, so the structure of the global buffer now looks like this:



This works just fine.


However, some meshes require additional vertex data besides the positions, normals and UV coordinates. All vertices in the global buffer need to have the same structure, otherwise I run into problems when rendering shadows (which skip the normal and UV data and, except in a few special cases, don't need to know about the additional data).


My initial idea was to keep the format of the global buffer (positions, normals, UVs and indices) and create a separate buffer for each mesh that requires additional data. This would result in more buffer changes during rendering, but since these types of meshes are far less common than regular meshes, it wouldn't be a problem.


So, basically all regular vertex data is still stored in the global buffer.

All meshes with additional data have an additional buffer, which contains said data.


This is fine in theory, however the last parameter of "glDrawElementsBaseVertex" basically makes that impossible from what I can tell.

I'd need the basevertex to only affect the global buffer, but not the additional buffer (Because the additional buffer only contains data for the mesh that is currently being rendered). Is that in any way possible?


If not, what are my options?

Do I have to separate these types of meshes from the global buffer altogether, and just use my old method?

#5252367 Bad performance when rendering medium amount of meshes

Posted by Silverlan on 15 September 2015 - 10:15 AM

Thank you, but I'm still unclear on a couple of things.


I've switched the data order to:



But what about the indices? Is it not possible to just append them to the same buffer (i.e. V1|N1|UV1|V2|N2|UV2|V3|N3|UV3|I1|I2|I3|I4|I5|I6), or is an element buffer absolutely required?


Either way, I've created a test scenario with just one object and no VAO.

There are two buffers, the vbo with the data as described above, and the element buffer with the vertex indices.


During rendering I then use:

glBindBuffer(GL_ARRAY_BUFFER,dataBuffer); // vbo
// Vertex Data
	3, // 3 Floats
	sizeof(float) *5, // Offset between vertices is sizeof(normal) +sizeof(uv)
	(void*)0 // First vertex starts at the beginning

// Normal Data
	3, // 3 Floats
	sizeof(float) *5, // Offset between normals is sizeof(uv) +sizeof(vertex)
	(void*)(sizeof(float) *3) // First normal starts after first vertex

// UV Data
	2, // 2 Floats
	sizeof(float) *6, // Offset between uvs is sizeof(vertex) +sizeof(normal)
	(void*)(sizeof(float) *6) // First uv starts after first normal
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,indexBuffer); // index/element buffer
    (void*)0, // For testing purposes; Index buffer contains only one mesh, which starts at index 0
    0 // Not sure about this one? VBO vertex #0 is located at position 0 in the data buffer

(I know this isn't efficient code; I'm doing it this way to help me understand. I'll optimize it once I've got it working.)


The mesh is rendered, however not correctly (Vertices, normals and uv coordinates are wrong).
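One possible cause, offered as a sketch rather than a confirmed diagnosis: in glVertexAttribPointer the stride is the byte distance from the start of one vertex to the start of the next, not the size of the data that sits between two attributes. For an interleaved position/normal/UV layout, that means the same stride of 8 floats for all three attributes, with only the starting offset differing. The constant and function names below are made up for illustration and are not from the post.

```cpp
#include <cstddef>

// Interleaved layout assumed from the post:
// position (3 floats) | normal (3 floats) | uv (2 floats), repeated per vertex.
constexpr std::size_t kPositionFloats = 3;
constexpr std::size_t kNormalFloats = 3;
constexpr std::size_t kUvFloats = 2;

// Stride: bytes from the start of one vertex to the start of the next.
// Every attribute stored in the same interleaved buffer shares this value.
constexpr std::size_t kStride =
    (kPositionFloats + kNormalFloats + kUvFloats) * sizeof(float);

// Offset of each attribute inside a single vertex, in bytes.
constexpr std::size_t kPositionOffset = 0;
constexpr std::size_t kNormalOffset = kPositionOffset + kPositionFloats * sizeof(float);
constexpr std::size_t kUvOffset = kNormalOffset + kNormalFloats * sizeof(float);

// Byte position of one attribute of vertex `vertexIndex` in the buffer.
constexpr std::size_t attributeByteOffset(std::size_t vertexIndex,
                                          std::size_t attributeOffset) {
    return vertexIndex * kStride + attributeOffset;
}

// The stride covers all 8 floats of a vertex, not 5 or 6.
static_assert(kStride == 8 * sizeof(float), "stride spans the whole vertex");
```

Under these assumptions, kStride would be passed as the stride argument for all three attributes, and kPositionOffset, kNormalOffset and kUvOffset (cast to a pointer) as the respective last arguments.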

#5252345 Bad performance when rendering medium amount of meshes

Posted by Silverlan on 15 September 2015 - 06:01 AM

You don't need one buffer per attribute, you can put them all in the same buffer (either interleaved or separate).

Hm... I don't think I understand how that's supposed to work.

So, I create a single buffer, and push all of my vertex, normal, uv and index data into that buffer:


V = Vertex

N = Normal

I = Index

|x| = 4 Bytes

Buffer Data: ...|V1|V1|V1|V2|V2|V2|V3|V3|V3|V4|V4|V4|N1|N1|N1|N2|N2|N2|N3|N3|N3|N4|N4|N4|UV1|UV1|UV2|UV2|UV3|UV3|UV4|UV4|I1|I2|I3|I4|I5|I6|...


Then, during rendering, I can use glDrawElementsBaseVertex to point it to the first index (I1) and draw the mesh:

offsetToFirstIndex = grabOffset()



But what about the normals and uv coordinates? I'd still have to use glVertexAttribPointer for both to specify their respective offsets, which means I'd still need a VAO for each mesh.


What am I missing?

#5252330 Bad performance when rendering medium amount of meshes

Posted by Silverlan on 15 September 2015 - 04:43 AM

That way you can just pack all your static meshes in one big buffer, managing the offsets yourself (which is fun) and have only a couple VAO switches. Since you're essentially doing memory management there, you need to have in mind things like memory fragmentation (i.e., what happens if you pack 500 meshes then remove 200 randomly from the same buffer, things get fragmented), so beware.


So, basically I need 3 "global" buffers (1 for vertices, 1 for normals, 1 for uv coordinates), then pack all static (Why just static? My dynamic meshes have the same format, can't I just include them as well?) mesh data in those three. During rendering I then just bind these three buffers once at the beginning (=1 vao switch) and use glDrawElementsBaseVertex for each mesh with the appropriate offset.

Is that about right?
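The bookkeeping behind this idea can be sketched on the CPU side without any OpenGL calls. The example below is only an illustration under stated assumptions: it packs interleaved vertices (8 floats each) into one array instead of three separate ones, uses 32-bit indices, and the names PackedMeshes, MeshRange and append are hypothetical. The base vertex of a mesh is simply the number of vertices packed before it, which is the same regardless of whether the attributes are interleaved or kept in separate buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: position (3) + normal (3) + uv (2) floats per vertex.
constexpr std::size_t kFloatsPerVertex = 8;

// What a draw call would need to know about one packed mesh.
struct MeshRange {
    std::size_t indexCount;       // number of indices to draw
    std::size_t indexByteOffset;  // byte offset of the first index in the shared index data
    std::int32_t baseVertex;      // added to every index before the vertex is fetched
};

// CPU-side staging for one shared vertex array and one shared index array.
struct PackedMeshes {
    std::vector<float> vertices;
    std::vector<std::uint32_t> indices;

    // Appends a mesh whose indices are relative to its own first vertex.
    MeshRange append(const std::vector<float>& meshVertices,
                     const std::vector<std::uint32_t>& meshIndices) {
        assert(meshVertices.size() % kFloatsPerVertex == 0);
        MeshRange range{};
        range.indexCount = meshIndices.size();
        range.indexByteOffset = indices.size() * sizeof(std::uint32_t);
        range.baseVertex =
            static_cast<std::int32_t>(vertices.size() / kFloatsPerVertex);
        vertices.insert(vertices.end(), meshVertices.begin(), meshVertices.end());
        indices.insert(indices.end(), meshIndices.begin(), meshIndices.end());
        return range;
    }
};
```

With ranges like these, each mesh could be drawn with glDrawElementsBaseVertex by passing indexCount as the count, indexByteOffset (cast to a pointer) as the indices argument, and baseVertex as the last argument, while the buffers stay bound. Removing meshes again is not handled here, which is where the fragmentation concern from the quote comes in.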




How are you measuring time? I'm guessing that's total CPU per frame?

No, it's just the time for the render loop (The pseudo code). I've used std::chrono::high_resolution_clock to measure it, so it's just the CPU time. I'll give ARB_timer_query a try.

According to the profiler "Very Sleepy", the main CPU bottleneck is with "DrvPresentBuffers". I'm not sure if that means it's the GPU itself, or the synchronization/data transfer from CPU to GPU.
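For reference, the kind of CPU-side measurement described above can be sketched as follows. This is a minimal illustration, not the code used in the post: the helper name averageMilliseconds is hypothetical, and it uses std::chrono::steady_clock instead of high_resolution_clock because steady_clock is guaranteed to be monotonic. Since OpenGL commands are queued and executed asynchronously, a timer like this only captures the time the CPU spends issuing the calls, not the time the GPU needs to execute them.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `iterations` times and returns the average wall-clock time
// per run in milliseconds, as seen by the CPU.
template <typename Work>
double averageMilliseconds(Work&& work, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / iterations;
}
```

Averaging over several frames smooths out scheduling noise. GPU-side timings would need something like the timer queries mentioned above.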


If your problem is that your GPU time per frame is the bottleneck, then you'll have to optimize your shaders / data formats / overdraw / etc.
If your problem is that your CPU time per frame is the bottleneck, then it's a more traditional optimization problem. Measure your CPU-side code to see where the time is going.

I'm pretty sure the shader isn't the problem: the FPS stays the same even if I simply discard all fragments and deactivate the vertex shader.

Changing the resolution also changes nothing (I've tried switching between 640x480 and 1920x1080, fps is the same), so I think I can also throw out overdraw as a possible candidate?

#5130598 GPU Gems 3 - Samples and source code?

Posted by Silverlan on 11 February 2014 - 01:34 PM

The book is available for free on the NVIDIA website. Many of the chapters refer to samples and source code on the accompanying DVD, but I can't for the life of me find a download for that.


The book, including the DVD, is available for purchase, but the price is ludicrous (466 Euro on the German Amazon, and that's not a typo). The Kindle version, which is a whole lot cheaper, does not include the DVD, so I'm somewhat stumped.


Maybe I'm just blind. Does anyone know if the DVD content is available for download on the NVIDIA website as well?

If not, does anyone know a place within Germany where it can be purchased for a reasonable price?


I need the DVD content because a lot of the articles are somewhat difficult to follow without the source code at hand.