OpenGL Creating new triangles using GL_TRIANGLE_STRIP

jmfurlott    105

Hello all,


I'm rendering a model using GL_TRIANGLE_STRIP and it's working fine. However, at certain arbitrary points I want to break off the triangle strip and essentially start a new set of triangle strips (that are completely separate from the original strips). How should I go about doing this? Multiple VBOs? I currently have only one VBO that holds all my vertices.

(OpenGL ES 2.0 in particular, but this surely applies to OpenGL as a whole.)

 


Thanks!

C0lumbo    4411

You can do it with multiple draw calls, starting each one at a different offset into your index buffer; there is no need to use multiple VBOs at all. It's usually much more efficient, though, to use degenerate triangles to 'stitch' strips together. For illustrative purposes (your strips are probably longer) imagine two quads and the following triangle strip indices to render them: 0, 1, 2, 3, 3, 4, 4, 5, 6, 7.

That's 10 indices, so 8 triangles. But triangles (2, 3, 3), (3, 3, 4), (3, 4, 4) and (4, 4, 5) are degenerate, that is, they have zero area, so they do not contribute to the scene at all. So only 4 triangles get rendered: (0, 1, 2), (1, 2, 3), (4, 5, 6) and (5, 6, 7), which are just the two quads we wanted.
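To make the stitching concrete, here is a minimal C++ sketch that builds the index list above on the CPU. The names `appendStrip` and `countVisibleTriangles` are hypothetical helpers, not part of OpenGL, and the sketch assumes 16-bit indices. It also adds one extra repeated index when the strip built so far has an odd length, so that the winding order of the next strip is not flipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Index = std::uint16_t;  // assumption: indices fit in GL_UNSIGNED_SHORT

// Append `next` to the combined strip `out`, joining them with
// zero-area (degenerate) triangles so both can go in one draw call.
void appendStrip(std::vector<Index>& out, const std::vector<Index>& next) {
    if (next.empty()) return;
    if (!out.empty()) {
        out.push_back(out.back());    // repeat last index of the previous strip
        out.push_back(next.front());  // repeat first index of the next strip
        // If the combined length is odd, the next strip would start on an
        // odd triangle and its winding would flip; one more repeat fixes that.
        if (out.size() % 2 != 0) out.push_back(next.front());
        assert(out.size() % 2 == 0);
    }
    out.insert(out.end(), next.begin(), next.end());
}

// Count the triangles in a strip that have three distinct indices,
// i.e. the ones that can actually cover pixels.
std::size_t countVisibleTriangles(const std::vector<Index>& strip) {
    std::size_t visible = 0;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        const Index a = strip[i], b = strip[i + 1], c = strip[i + 2];
        if (a != b && b != c && a != c) ++visible;
    }
    return visible;
}

// With the index buffer bound, the whole thing is then a single call:
//   glDrawElements(GL_TRIANGLE_STRIP, (GLsizei)indices.size(),
//                  GL_UNSIGNED_SHORT, 0);
```

Appending {0, 1, 2, 3} and then {4, 5, 6, 7} to an empty vector gives exactly the 10 indices from the example, of which 4 triangles are visible.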

L. Spiro    25619

You may want to also test if triangle strips are really best for you in the first place.

In my experience, when I started with triangle lists and added triangle stripping later, the performance actually decreased.  At my office we got the same results (our triangle-strippers are not related and use different algorithms).

Neither of us did a thorough investigation into why our triangle strips were slower than triangle lists, but the candidates are obvious: poor caching and extra triangles.

 

Degenerate triangles get culled early, but not early enough.  You take a hit for them.

And while there is less bandwidth due to smaller index buffers, it is not enough to make up for the better cache performance offered by triangle lists.

 

As a general rule of thumb, you should only use triangle strips when the index buffer’s size is below 60% of its triangle-list form.  When the index buffer is able to decrease by this much it means fewer degenerate triangles were generated and bandwidth is so significantly reduced that it can make up for the poorer cache.
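The rule of thumb is easy to check before benchmarking. The sketch below is only an illustration: `stripToListRatio` and `stripWorthBenchmarking` are hypothetical names, and the 60% threshold is the figure from this post, not a measured constant.

```cpp
#include <cassert>
#include <cstddef>

// Ratio of the strip's index count to the index count of the same mesh
// as a triangle list (3 indices per real triangle).
double stripToListRatio(std::size_t stripIndexCount, std::size_t triangleCount) {
    assert(triangleCount > 0);
    return static_cast<double>(stripIndexCount) /
           static_cast<double>(3 * triangleCount);
}

// True when the stripped index buffer is small enough, by the rule of
// thumb above, that strips are worth testing against lists.
bool stripWorthBenchmarking(std::size_t stripIndexCount,
                            std::size_t triangleCount,
                            double threshold = 0.60) {
    return stripToListRatio(stripIndexCount, triangleCount) < threshold;
}
```

For the two stitched quads discussed earlier (10 strip indices, 4 real triangles, so 12 list indices) the ratio is about 0.83, so strips would not pass. One long strip of 100 triangles (102 indices against 300) gives 0.34 and would.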

 

Also, your mobile device uses a unified memory model, which means bandwidth is irrelevant, which means that by using triangle strips you only increase the triangle count and decrease the cache performance.  The only way in which triangle strips help you is in decreasing bandwidth, but on mobile devices there is no bandwidth issue in the first place, which means triangle strips are really just a way of shooting yourself in the foot.  You get all of the bad and none of the good.

 

Never blindly go with what is rumored to be the better way.  Always do your own testing.  You will likely find that you will have better performance with triangle lists than with triangle strips.

 

 

L. Spiro

mhagain    13430

Depending on the hardware being targeted, strips may yet be the best choice; mobile hardware (this is GL ES after all) is particularly known for flying in the face of what works better elsewhere.  If you do want to retain strip order, then using indexing will let you do that and will nicely cover the case where you need to join two strips; this will be cheaper than adding extra verts to make degenerates as indices are smaller than vertices.  Indexing can also cover cases where you need to add free triangles into your mesh, or even add in some fans, and all without any messing and with just one draw call per mesh.

 

Primitive restart can also do that, but since you're on ES2 you don't have primitive restart available, so do at least try a benchmark with just indexed triangles as in general terms (and as L Spiro says) they are the general-case fastest path nowadays; if you run into performance problems with those (or if you are on a class of mobile hardware where you know for absolute certain that strips are preferred) then is the time to start considering strips, not before.
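If you already have strip data and want to benchmark plain indexed triangles, the strip indices can be expanded into a triangle list. This is a rough sketch, assuming 16-bit indices; `stripToList` is a hypothetical helper name. It drops degenerate triangles and swaps two indices on every odd triangle so the winding stays consistent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Index = std::uint16_t;  // assumption: indices fit in GL_UNSIGNED_SHORT

// Expand triangle-strip indices into triangle-list indices.
std::vector<Index> stripToList(const std::vector<Index>& strip) {
    std::vector<Index> list;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        Index a = strip[i], b = strip[i + 1], c = strip[i + 2];
        if (a == b || b == c || a == c) continue;  // skip zero-area triangles
        if (i % 2 == 1) std::swap(a, b);           // odd triangles are wound backwards
        list.push_back(a);
        list.push_back(b);
        list.push_back(c);
    }
    assert(list.size() % 3 == 0);
    return list;
}

// The result is drawn with GL_TRIANGLES instead of GL_TRIANGLE_STRIP:
//   glDrawElements(GL_TRIANGLES, (GLsizei)list.size(), GL_UNSIGNED_SHORT, 0);
```

For the strip 0, 1, 2, 3, 3, 4, 4, 5, 6, 7 this yields 0, 1, 2, 2, 1, 3, 4, 5, 6, 6, 5, 7: four triangles and no degenerates.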

 

Essential reading here: http://hacksoflife.blogspot.ie/2010/01/to-strip-or-not-to-strip.html

jmfurlott    105

You may want to also test if triangle strips are really best for you in the first place.

 

 

Also, your mobile device uses a unified memory model, which means bandwidth is irrelevant, which means that by using triangle strips you only increase the triangle count and decrease the cache performance.  The only way in which triangle strips help you is in decreasing bandwidth, but on mobile devices there is no bandwidth issue in the first place, which means triangle strips are really just a way of shooting yourself in the foot.  You get all of the bad and none of the good.

 

Whoa, that is incredible. Thank you so much. I was getting such terrible performance using strips, and this could be why.  I will reconstruct my data using just standard GL_TRIANGLES first.

 

Do you have any more information about this bandwidth issue?

 

Thank you!

 

-jmfurlott

jmfurlott    105

Depending on the hardware being targeted, strips may yet be the best choice; mobile hardware (this is GL ES after all) is particularly known for flying in the face of what works better elsewhere.  If you do want to retain strip order, then using indexing will let you do that and will nicely cover the case where you need to join two strips; this will be cheaper than adding extra verts to make degenerates as indices are smaller than vertices.  Indexing can also cover cases where you need to add free triangles into your mesh, or even add in some fans, and all without any messing and with just one draw call per mesh.

 

Primitive restart can also do that, but since you're on ES2 you don't have primitive restart available, so do at least try a benchmark with just indexed triangles as in general terms (and as L Spiro says) they are the general-case fastest path nowadays; if you run into performance problems with those (or if you are on a class of mobile hardware where you know for absolute certain that strips are preferred) then is the time to start considering strips, not before.

 

Essential reading here: http://hacksoflife.blogspot.ie/2010/01/to-strip-or-not-to-strip.html

 

Ah, my question was already answered (I didn't see this before replying).  Thank you guys... it seems like normal triangles are the way to go for now.  I don't think indices are good for my model because (based on input) they are continually changing.

L. Spiro    25619

Do you have any more information about this bandwidth issue?

Nothing I could cite but you can read about unified memory models (UMM) in general online.
What it means for mobile devices is that the GPU and CPU share the same memory, unlike in desktops where they each have their own memory.
When a GPU has its own memory it can only access that memory, so whatever you want to draw has to, at some point, be transferred across the bus to the GPU RAM from the CPU RAM. How much and how fast you can transfer is “bandwidth”.

So for desktops your index and vertex buffers have to be copied, thus smaller is better.

For UMM, no copy has to take place since the GPU can access the vertex/index buffers directly wherever they are in “normal” RAM.
Smaller is still better, but not as significantly.
And there are still things that can cause a copy to take place by the driver (though it is just “normal” RAM to “normal” RAM, literally via memcpy()).
If you are not using a VBO, the entire vertex buffer will be copied.
If you are not using an IBO, the entire index buffer will be copied.
If your vertex-buffer elements are poorly aligned (for example using 6 bytes for positions) the entire vertex buffer will be copied, and slowly since it also realigns the vertex data.
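As an illustration of the alignment point, here is a small C++ sketch. It assumes the commonly quoted guideline that every attribute offset and the vertex stride should be a multiple of 4 bytes; the exact requirement depends on the driver and hardware. The struct names and the `alignUp` and `isAligned` helpers are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `bytes` up to the next multiple of `alignment` (a power of two).
std::size_t alignUp(std::size_t bytes, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (bytes + alignment - 1) & ~(alignment - 1);
}

// 6-byte positions: the colour attribute that follows starts at offset 6.
struct PackedVertex {
    std::int16_t position[3];
    std::uint8_t color[4];
};

// Positions padded to 4 shorts (8 bytes): colour starts at offset 8.
// The fourth component is padding; store 1 there if it is read as w.
struct PaddedVertex {
    std::int16_t position[4];
    std::uint8_t color[4];
};

// Check one attribute against the assumed 4-byte guideline.
bool isAligned(std::size_t offset, std::size_t stride, std::size_t alignment = 4) {
    return offset % alignment == 0 && stride % alignment == 0;
}

// The padded layout would be described to GL along these lines:
//   glVertexAttribPointer(positionLoc, 4, GL_SHORT, GL_FALSE,
//                         sizeof(PaddedVertex),
//                         (const void*)offsetof(PaddedVertex, position));
```

The padded layout costs 2 extra bytes per vertex (`alignUp(6, 4)` is 8), which is usually a small price if it avoids a realigning copy in the driver.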


L. Spiro

SuperVGA    1132

Under some circumstances, you can do the trick where you alpha out at the end of one strip and alpha back in at the start of the next.

(Still in the same draw call; it's the same GL_TRIANGLE_STRIP, basically.)

I realize this seems very dirty, but I made it work. It's easy and performs well when used with caution.

Make sure the triangles fading to transparent and the one connecting the strips have zero area too, and preferably hide them by applying high (enough) depth values; there's no need to process more fragments than necessary.

 

I used it for patches in a typical heightmap terrain, and there are probably many places where it won't perform well.

L. Spiro    25619

I don't think indices are good for my model because (based on input) they are continually changing.

I didn’t see this before but now that I have I also recall you saying your performance was lower than you expected.

If you don’t update your VBOs properly it will have a huge impact on your performance (literally halving it).
Be sure you are updating your VBOs properly, either by using ring buffers or by orphaning your buffers before changing them.
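One way to combine the two ideas is sketched below: a small CPU-side allocator hands out offsets into one streaming VBO, and when it wraps around, the caller orphans the buffer before writing again. `RingAllocator` is a hypothetical helper, not an OpenGL type, and the GL calls are shown only as comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>

// Hands out byte offsets into a fixed-size buffer, wrapping to the
// start when the next allocation would not fit.
class RingAllocator {
public:
    explicit RingAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset at which `size` bytes may be written.
    // `wrapped` is set to true when the allocator went back to offset 0,
    // which is the moment to orphan the VBO.
    std::size_t allocate(std::size_t size, bool& wrapped) {
        assert(size <= capacity_);
        wrapped = head_ + size > capacity_;
        if (wrapped) head_ = 0;
        const std::size_t offset = head_;
        head_ += size;
        return offset;
    }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};

// Per update, with the VBO bound to GL_ARRAY_BUFFER:
//   bool wrapped = false;
//   std::size_t offset = ring.allocate(byteCount, wrapped);
//   if (wrapped)  // orphan: request fresh storage of the same size
//       glBufferData(GL_ARRAY_BUFFER, capacity, NULL, GL_STREAM_DRAW);
//   glBufferSubData(GL_ARRAY_BUFFER, offset, byteCount, vertexData);
```

Orphaning on wrap means the driver can keep the old storage alive for draws still in flight while you fill the new one. Whether that is faster than simply orphaning every frame is something to measure on your own device.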


L. Spiro
