GL 3.3 glDrawElements VAO issue...AMD bug or my mistake?

2 comments, last by Kaptein 9 years, 8 months ago

Hello,

I'm running out of ideas trying to debug an issue with a basic line render in the form of a 'world axis' model.

The idea is simple:

I create a VAO with float line vertices (3 floats per vertex), int indices (1 per vertex), and unsigned byte colors (3 bytes per vertex).

I allocate room and pack the arrays so that the first 12 vertices/indices/colors are the uniquely colored lines representing my +/- world axes, followed by a bunch of lines forming a 2D grid across the XZ plane.
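To give a clearer picture, the attribute setup is conceptually along these lines (a minimal sketch with placeholder buffer handles and attribute locations, not my actual code):

    // Sketch of the VAO setup: handle names and attribute locations are placeholders
    glBindVertexArray(m_VAO);

    // positions: 3 floats per vertex
    glBindBuffer(GL_ARRAY_BUFFER, m_PositionVBO);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (GLvoid*)0);
    glEnableVertexAttribArray(0);

    // colors: 3 unsigned bytes per vertex, normalized to 0..1 in the shader
    glBindBuffer(GL_ARRAY_BUFFER, m_ColorVBO);
    glVertexAttribPointer(1, 3, GL_UNSIGNED_BYTE, GL_TRUE, 0, (GLvoid*)0);
    glEnableVertexAttribArray(1);

    // indices: unsigned ints, stored in the VAO's element array buffer
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, m_IndexVBO);

    glBindVertexArray(0);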

Once the data is loaded, I render by binding my VAO, activating a basic shader, then drawing the model in two stages: one glDrawElements call for the axis lines after glLineWidth is set to 2, and a separate glDrawElements for the grid with thinner lines.

Whenever I draw this way, the last 6 lines of my grid (i.e. the end of the VAO array) show up with random colors, although the lines themselves are correctly positioned. If I instead make a single glDrawElements call for all lines (i.e. world axis and grid lines at once), the entire model appears as expected with correct colors everywhere.

This is only an issue on some ATI cards (e.g. a Radeon Mobility 5650); it works on NVIDIA with no problem.

I can't see what I would have done wrong if the lines are positioned correctly (i.e. my index counts/offsets must be fine for glDrawElements), and I don't see how I could be packing the data into the VAO incorrectly if everything appears correctly with a single glDrawElements call instead of two calls separated by a glLineWidth() change.

Any suggestions? glGetError, etc. return no problems at all...

Here is some example render code, although I know it is just a small piece of the overall picture. This causes the problem:


    TFloGLSL_IndexArray *tmpIndexArray = m_VAOAttribs->GetIndexArray();

    //The first AXIS_MODEL_BASE_INDEX_COUNT elements are for the base axis..draw these thicker
    glLineWidth(2.0f);
    glDrawElements(GL_LINES, AXIS_MODEL_BASE_INDEX_COUNT, 
                   GL_UNSIGNED_INT,  (GLvoid*)tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte);

    //The remaining elements are for the grid..draw these thin
    int gridLinesElementCount = m_VAOAttribs->GetIndexCount() - AXIS_MODEL_BASE_INDEX_COUNT;
    if(gridLinesElementCount > 0)
    {
        glLineWidth(1.0f);

        glDrawElements(GL_LINES, gridLinesElementCount, GL_UNSIGNED_INT,
                       (GLvoid*)(tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte + (AXIS_MODEL_BASE_INDEX_COUNT * sizeof(GLuint))));
    }

This works just fine:


    glDrawElements(GL_LINES, m_VAOAttribs->GetIndexCount(), GL_UNSIGNED_INT, 
                    (GLvoid*)tmpIndexArray->m_VidBuffRef.m_GLBufferStartByte);
Update:
I still haven't figured out the root of the issue, but as a test I switched to floats for my color attribute instead of GL_UNSIGNED_BYTE. The unsigned byte colors were passed in the range 0..255 with normalized set to GL_TRUE; the floats are passed as 0..1.0 with normalized set to GL_FALSE. Without changing anything else, the problem goes away completely, so I am really suspicious of the ATI driver...
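Concretely, the test just swaps the color attribute pointer from normalized unsigned bytes to plain floats, something like this (placeholder attribute location again):

    // Before (problematic on the ATI card): 3 normalized unsigned bytes, 0..255
    //glVertexAttribPointer(1, 3, GL_UNSIGNED_BYTE, GL_TRUE, 0, (GLvoid*)0);

    // After (works): 3 floats already in 0..1, so normalized is GL_FALSE
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, (GLvoid*)0);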
Is anyone else seeing issues with multiple glDrawElements calls from a single bound VAO containing unsigned-byte color vertex attributes?

Update #2: Problem solved!

For anyone encountering similar issues: it turns out some older ATI cards (and maybe newer ones) do NOT like vertex attributes that are not aligned to 4-byte boundaries. I changed my color array to pass 4 unsigned bytes instead of 3, updated my shader to accept a vec4 instead of a vec3 for that attribute, and everything now works as intended.
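In other words, the fix is simply to pad the color attribute to 4 bytes per vertex so every element starts on a 4-byte boundary (sketch, placeholder location/names):

    // 4 normalized unsigned bytes per color (RGBA) -> each color starts on a
    // 4-byte boundary
    glVertexAttribPointer(1, 4, GL_UNSIGNED_BYTE, GL_TRUE, 0, (GLvoid*)0);

    // ...and the matching vertex shader input becomes a vec4 instead of a vec3:
    //     layout(location = 1) in vec4 inColor;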

Kind of a silly issue... but that is what I get for trying to cut corners on memory bandwidth, etc. =P


No need to cut corners there unless you have actual bandwidth issues. I think it's suspect even then, since 3-byte attributes must be aligned to 4 bytes even on modern hardware; otherwise they just turn into extra instructions.

On modern cards misaligned accesses have only a minor impact, so packing the data more tightly makes more sense there, I guess.

Very strange that it would break like that, though. Did you pack the struct so it didn't naturally align to a 4-byte boundary? You can only align to 1, 2, 4, 8, etc., by the way.

While we're talking about performance: the only thing that really matters is that you don't split your data up into several separate structures, like positions in one buffer and normals in another. The GPU will take a significant speed hit from that.

See effects of strided access here: http://www.ncsa.illinois.edu/People/kindr/projects/hpca/files/singapore_p3.pdf

As long as the structs are packed, and the data can be read sequentially in memory, you are good.
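Something like this is what I mean by keeping it packed and sequential (just an illustration, your names will differ):

    // One interleaved struct per vertex instead of separate buffers for
    // position / normal / color; the total size is a multiple of 4 bytes.
    // offsetof comes from <cstddef>.
    struct Vertex
    {
        float   pos[3];     // 12 bytes
        float   normal[3];  // 12 bytes
        GLubyte color[4];   //  4 bytes -> 28 bytes per vertex
    };

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*)offsetof(Vertex, pos));
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*)offsetof(Vertex, normal));
    glVertexAttribPointer(2, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(Vertex), (GLvoid*)offsetof(Vertex, color));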

