
#Actualmhagain

Posted 18 March 2013 - 01:15 PM

Unless you have the GL_ARB_vertex_attrib_64bit extension available, you can safely assume that any GLdouble vertex input is going to be software emulated: in immediate mode by casting your glVertex3d parameters down to float, and with VBOs by potentially running your entire per-vertex pipeline in software.  That would result in VBOs being slower than immediate mode code, which is exactly what you've observed.
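
As a rough sketch of how you might test for that extension on a GL2.1 context: the helper below (hasExtension is a made-up name, not part of GL) looks for a whole, space-delimited token in an extension string. In a real program the string would come from glGetString(GL_EXTENSIONS) after a context is current; here it is just a parameter so the code stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if `name` appears as a whole,
// space-delimited token in `extensions`. A plain strstr() is not enough,
// because one extension name can be a prefix of another.
bool hasExtension(const char* extensions, const char* name)
{
    assert(name != nullptr && *name != '\0');
    if (extensions == nullptr)
        return false;

    const std::size_t len = std::strlen(name);
    const char* p = extensions;
    while ((p = std::strstr(p, name)) != nullptr) {
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk = (p[len] == ' ') || (p[len] == '\0');
        if (startOk && endOk)
            return true;
        p += len; // skip this partial match and keep searching
    }
    return false;
}
```

If hasExtension(..., "GL_ARB_vertex_attrib_64bit") comes back false, the safe assumption is that GLdouble attributes will not be handled in hardware.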

 

Double-precision support in hardware is still relatively new (requiring SM5 hardware IIRC), and the moral of the story is that even if the GL spec allows it for a particular call, don't always assume that it means your hardware supports it (especially if it's from an older part of GL that pre-dates hardware T&L).
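
One way to sidestep the problem, assuming your source data is held in doubles and float precision is enough for rendering, is to narrow it once on the CPU before filling the VBO. This is only a sketch; toFloatVertices is a hypothetical name.

```cpp
#include <vector>

// Hypothetical helper: narrows double-precision vertex data to float once,
// on the CPU, so the buffer handed to GL contains GLfloat-compatible data.
std::vector<float> toFloatVertices(const std::vector<double>& src)
{
    std::vector<float> dst;
    dst.reserve(src.size());
    for (double v : src)
        dst.push_back(static_cast<float>(v)); // precision is lost here, by design
    return dst;
}
```

The result would then be uploaded with glBufferData and described with GL_FLOAT (rather than GL_DOUBLE) in your gl*Pointer calls.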

 

How can you go faster still?  If you're not already doing it, then consider interleaving your vertex struct.  I.e. instead of:

 

GLfloat positions[ARBITRARY_NUMBER];
GLfloat normals[ARBITRARY_NUMBER];
GLfloat texcoords[ARBITRARY_NUMBER];

 

Use:

 

struct myVertex
{
    GLfloat position[3];
    GLfloat normal[3];
    GLfloat texcoord[2];
};

myVertex vertices[ARBITRARY_NUMBER];

 

This can be a win as it means that your GPU can now read all data for each vertex together, and without having to do any random jumping around in memory.
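
With an interleaved struct, every attribute shares one stride and differs only in its byte offset. A minimal sketch of working those values out follows; the GLfloat typedef is a stand-in so it compiles without GL headers, and VertexLayout/describeLayout are hypothetical names.

```cpp
#include <cstddef>

typedef float GLfloat; // stand-in for the GL typedef (assumption: no GL headers here)

struct myVertex
{
    GLfloat position[3];
    GLfloat normal[3];
    GLfloat texcoord[2];
};

// Hypothetical record of the values the gl*Pointer calls need.
struct VertexLayout
{
    std::size_t stride;         // bytes from one vertex to the next
    std::size_t positionOffset; // byte offset of each attribute within a vertex
    std::size_t normalOffset;
    std::size_t texcoordOffset;
};

VertexLayout describeLayout()
{
    VertexLayout layout;
    layout.stride = sizeof(myVertex);
    layout.positionOffset = offsetof(myVertex, position);
    layout.normalOffset = offsetof(myVertex, normal);
    layout.texcoordOffset = offsetof(myVertex, texcoord);
    return layout;
}
```

With the VBO bound, these would feed calls along the lines of glVertexPointer(3, GL_FLOAT, stride, (const GLvoid *) positionOffset), glNormalPointer(GL_FLOAT, stride, (const GLvoid *) normalOffset) and glTexCoordPointer(2, GL_FLOAT, stride, (const GLvoid *) texcoordOffset).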

 

Another thing you can do, since you have an older GPU that only supports GL2.1, is to use GL_UNSIGNED_SHORT instead of GL_UNSIGNED_INT for your GL_ELEMENT_ARRAY_BUFFER, if you can get away with it (i.e. if no index in the buffer is larger than 65535).  This is also coming back to the "don't always assume that hardware supports what the GL spec allows" point, and a GPU that old may not support 32-bit indices very well (or at all!).
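
A small sketch of the "if you can get away with it" check, assuming your indices currently live in a vector of 32-bit values (narrowIndices is a hypothetical name):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies `wide` into `narrow` as 16-bit indices and
// returns true, or returns false (leaving `narrow` empty) if any index
// does not fit in an unsigned short.
bool narrowIndices(const std::vector<std::uint32_t>& wide,
                   std::vector<std::uint16_t>& narrow)
{
    narrow.clear();
    narrow.reserve(wide.size());
    for (std::size_t i = 0; i < wide.size(); ++i) {
        if (wide[i] > 0xFFFFu) {
            narrow.clear();
            return false; // this mesh needs GL_UNSIGNED_INT
        }
        narrow.push_back(static_cast<std::uint16_t>(wide[i]));
    }
    return true;
}
```

When it returns true, the narrow vector is what you would upload to the GL_ELEMENT_ARRAY_BUFFER and draw with GL_UNSIGNED_SHORT as the type argument to glDrawElements.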

 

Finally, and to cut down on buffer changes, you could consider packing all objects into a single VBO (rather than 150 * 2 separate VBOs) and using the parameters to your glDrawArrays/glDrawElements calls (or the offsets specified by your gl*Pointer calls) to select which object gets drawn.  This may or may not be compatible with using 16-bit indices however, so you'll need to profile both approaches and see which works best for you.
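
To make the single-VBO idea concrete, here is a rough sketch for the non-indexed (glDrawArrays) case. It assumes each object is a flat array of floats with the same number of floats per vertex; DrawRange, PackedMeshes and packMeshes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The first/count pair that would later be passed to glDrawArrays.
struct DrawRange
{
    int first; // index of the object's first vertex in the shared buffer
    int count; // number of vertices belonging to the object
};

struct PackedMeshes
{
    std::vector<float> vertices;   // contents of the one shared VBO
    std::vector<DrawRange> ranges; // one entry per object, in input order
};

PackedMeshes packMeshes(const std::vector<std::vector<float> >& meshes,
                        std::size_t floatsPerVertex)
{
    assert(floatsPerVertex > 0);
    PackedMeshes packed;
    for (std::size_t m = 0; m < meshes.size(); ++m) {
        assert(meshes[m].size() % floatsPerVertex == 0);
        DrawRange range;
        range.first = static_cast<int>(packed.vertices.size() / floatsPerVertex);
        range.count = static_cast<int>(meshes[m].size() / floatsPerVertex);
        packed.ranges.push_back(range);
        packed.vertices.insert(packed.vertices.end(),
                               meshes[m].begin(), meshes[m].end());
    }
    return packed;
}
```

You would upload packed.vertices once, bind that one VBO, and draw object i with glDrawArrays(mode, ranges[i].first, ranges[i].count).  For glDrawElements on GL2.1 there is no base-vertex parameter, so each object's indices would need its first-vertex value added to them (or the gl*Pointer offsets changed per object), which is where a large shared buffer can push indices past the 16-bit range.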



