OpenGL Optimization of many small glDrawElements() calls


So I realize there are many better ways to do this in modern OpenGL. But here's what I have: 

class World
....
function render() {
   foreach(chunk)
        if (chunk.isReady && chunk.isVisible) {
              chunk.render()
        }
}

class WorldChunk
public int[][][] blocks;
...
function render() {
    if (DISPLAY_LIST != 0) {
         glCallList(DISPLAY_LIST);
    } else {
         buildDisplayList();
    }
}
function buildDisplayList() {
        DISPLAY_LIST = GL11.glGenLists(1);
        GL11.glNewList(DISPLAY_LIST, GL11.GL_COMPILE);
        pushVertexData();
        GL11.glEndList();
}
function pushVertexData() {
        glPushMatrix();
        glTranslatef(x,y,z);

        glBindBuffer(GL_ARRAY_BUFFER, Block.vboVertexHandle);  //From a single static vbo stored in Block class - VBO built once on startup
        glVertexPointer(3, GL_FLOAT, 0, 0L);
        glBindBuffer(GL_ARRAY_BUFFER, Block.vboNormalHandle);  //From a single static vbo stored in Block class - VBO built once on startup
        glNormalPointer(GL_FLOAT, 0, 0L);
        
        GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
        GL11.glEnableClientState(GL11.GL_NORMAL_ARRAY);

        for (int i = 0; i < sizeX; i++) {
            for (int j = 0; j < sizeY; j++) {
                for (int k = 0; k < sizeZ; k++) {
                    //Determine exposed faces: a boolean[6], true for each face that is not hidden by a neighbor
                    EXPOSED_FACES = determineExposedFaces(i,j,k);
                    Block.render(i, j, k, blocks[i][j][k], EXPOSED_FACES);
                }
            }
        }
        glDisableClientState(GL_NORMAL_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
        glPopMatrix();
}


class Block
...
function render(int x, int y, int z, int type, boolean[] faces) {
if (faces[0] || faces[1] || faces[2] || faces[3] || faces[4] || faces[5]) {
            glPushMatrix();                                        
            glTranslatef(x + size, y + size, z + size);     
            if (faces[0]) {
                glDrawElements(GL_TRIANGLES, frontIndicies);
            }
            if (faces[1]) {
                glDrawElements(GL_TRIANGLES, rightIndicies);
            }
            if (faces[2]) {
                glDrawElements(GL_TRIANGLES, topIndicies);
            }
            if (faces[3]) {
                glDrawElements(GL_TRIANGLES, leftIndicies);
            }
            if (faces[4]) {
                glDrawElements(GL_TRIANGLES, bottomIndicies);
            }
            if (faces[5]) {
                glDrawElements(GL_TRIANGLES, backIndicies);
            }

            glPopMatrix();
        }
}
 

As you can see, I'm basically just storing a single VBO for the cube and using indices to decide which faces to actually draw. The problem is that I know this is not an optimal way to do things. I am currently:

 

1) Rendering only those chunks within +/- X, +/- Y units of the camera

2) Doing efficient frustum culling to decide which chunks are outside the viewing area

3) Using display lists to "bake" a chunk's geometry

4) Using a single VBO for all cubes

 

In any given scene I might have only 20k-40k faces of actual on-GPU geometry being drawn, but even that is fairly slow. Even on a GTX 690, I can only sustain about 80 fps with about 50k faces per frame.

 

I know I should switch to shaders and do this the true modern way, but is there a fundamental concept I'm abusing that's causing such poor performance? Based on profiling the app, I know that performance is not substantially consumed elsewhere in the code. For example, I hold at 1650 fps if I just comment out each of the glDrawElements() calls above so nothing is drawn. I must be overwhelming the card with poorly scheduled draw calls on tiny vertex arrays...

Edited by voodoodrul


Making too many draw calls is not a good thing; it would be much better to create a single vertex buffer from all of those cubes. Maybe use an update function that collects all the needed visible faces, the same way you're deciding which faces to draw now, but instead of drawing them immediately you push them into a vertex buffer. Then you could draw the whole vertex buffer in your render function with just one draw call.
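Something like this, as a minimal sketch (assuming LWJGL 2-style bindings; chunkVertexData would be a hypothetical flipped FloatBuffer holding the positions of every visible face in the chunk, and vertexCount its number of vertices):

//Build (or rebuild) the chunk's buffer whenever its contents change
int vboHandle = GL15.glGenBuffers();
GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vboHandle);
GL15.glBufferData(GL15.GL_ARRAY_BUFFER, chunkVertexData, GL15.GL_STATIC_DRAW);
GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);

//Per frame: draw the whole chunk with a single call
GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vboHandle);
GL11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0L);
GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, vertexCount);
GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);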


Thanks Sponji. I guess I knew this, but I had read that it can be less efficient to issue megalithic draws (entire scene graphs, for example) than smaller draw calls. What you suggest makes all kinds of sense.

 

I will revamp it to push the geometry into a buffer and render each chunk in one go. What do you suggest for pushing arbitrary amounts of data into the buffer? A dynamic array that I keep pushing the ordered vertices onto one at a time? A fixed-size FloatBuffer of the "worst-case" chunk size where I populate only what I need? Or is there a nicer way to push vertex data onto a vertex buffer as you encounter it? It seems the buffers are fixed size.


I'm not really sure which way would be the best, but usually for chunks like that I've done it just by uploading the whole buffer again, and it has been working quite nicely. Of course it would be good to spend some time profiling the different methods, at least if it becomes a problem.
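For example, each time a chunk's blocks change you could just respecify the whole buffer (a sketch, assuming LWJGL 2, a hypothetical vboVertexHandle, and a vbuffer FloatBuffer that has already been refilled and flipped with the chunk's visible-face data):

GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vboVertexHandle);
//One call replaces the entire buffer store; GL_DYNAMIC_DRAW hints that it will be respecified repeatedly
GL15.glBufferData(GL15.GL_ARRAY_BUFFER, vbuffer, GL15.GL_DYNAMIC_DRAW);
GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);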


So I have started the process of moving all my rendering out to a single VBO on WorldChunk. 

 

However, I'm pretty baffled as to how I can push all the vertex data into a single FloatBuffer. It's a pretty basic question about OpenGL, but maybe if we talk through it, it will make more sense.

 

The way I was rendering my scene before relied on glTranslatef() to do all the heavy lifting to get my vertex data in position. World.render() would iterate over the WorldChunks and call render() on each. WorldChunk.render() would glTranslatef(x,y,z) into the chunk's position. Then WorldChunk.render() would iterate over the Blocks, and in turn each block would glTranslatef(x,y,z) to get the block into place. The nice thing about this is that I could draw faces without worrying about vertex data from an earlier face draw.

 

Now I need to replicate the same process on raw vertex data stored in a buffer. What I do now is let WorldChunk.generate() occur in its own thread - not using any OpenGL calls during this compile pass. The function needs to draw only those faces that are exposed and visible. I do this by determining which faces are exposed and pushing *only those faces* onto an ArrayList<FloatBuffer>. Once the list is done, I push all those individual FloatBuffers onto a single new FloatBuffer vbuffer. 

 

WorldChunk

--------------------

private void generate() {
     .....
     buildMesh();
}

 

 

public void buildMesh() {
        this.dynamicVertexData = new ArrayList<FloatBuffer>();

        for (int i = 0; i < sizeX; i++) {
            for (int j = 0; j < sizeY; j++) {
                for (int k = 0; k < sizeZ; k++) {
                    if (blocks[i][j][k] != 0) {
                        EXPOSED_FACES = determineExposedFaces(i, j, k);
                        Block.render(i, j, k, 1, EXPOSED_FACES, wireframe, this);  //Writes each exposed face as its own FloatBuffer into dynamicVertexData
                    }
                }
            }
        }

        //Convert all the individual FloatBuffers into one megalithic FloatBuffer (18 floats per face: 6 vertices * x,y,z)
        this.vbuffer = Util.getFloatBuffer(this.dynamicVertexData.size() * 18);
        for (int i = 0; i < this.dynamicVertexData.size(); i++) {
            this.vbuffer.put(this.dynamicVertexData.get(i));
        }
        this.vbuffer.flip();
}


    private void buildVBO() {
        vboVertexHandle = GL15.glGenBuffers();
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vboVertexHandle);
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, this.vbuffer, GL15.GL_STATIC_DRAW);
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
    }


    public void drawMesh(boolean wireframe) {
        GL11.glPushMatrix();
        GL11.glTranslatef(this.worldPositionX, 0, this.worldPositionZ);
        GL11.glPolygonMode(GL11.GL_FRONT, GL11.GL_FILL);

        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, this.vboVertexHandle);
        GL11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0L);

        GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);

        //Draw now
        if (this.dynamicVertexData.size() > 0) {
            GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, this.dynamicVertexData.size() * 6);
        }
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
        GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
        GL11.glPopMatrix();
    }

Block
---------------------------


    public static FloatBuffer verticiesFront(float x, float y, float z) {
        return Util.getFloatBuffer(new float[]{
                    (x * size), (y * size), (z * size), -(x * size), (y * size), (z * size), -(x * size), -(y * size), (z * size), // v0-v1-v2 (front)
                    -(x * size), -(y * size), (z * size), (x * size), -(y * size), (z * size), (x * size), (y * size), (z * size), // v2-v3-v0
                });
    }
.....

render(int x, int y, int z, int type, boolean[] faces, boolean wireframe, WorldChunk chunk) {
        if (faces[0] || faces[1] || faces[2] || faces[3] || faces[4] || faces[5]) {
            if (faces[0]) {
                chunk.dynamicVertexData.add(verticiesFront(x, y, z));
            }
            if (faces[1]) {
                chunk.dynamicVertexData.add(verticiesRight(x, y, z));
            }
            if (faces[2]) {
                chunk.dynamicVertexData.add(verticiesTop(x, y, z));
            }
            if (faces[3]) {
                chunk.dynamicVertexData.add(verticiesLeft(x, y, z));
            }
            if (faces[4]) {
                chunk.dynamicVertexData.add(verticiesBottom(x, y, z));
            }
            if (faces[5]) {
                chunk.dynamicVertexData.add(verticiesBack(x, y, z));
            }
        }
}

 

Essentially I will use the same process of pushing points x,y,z into the buffer, one at a time, but I need to make sure that those x,y,z are effectively translated as they would be with glTranslatef(x,y,z).

 

Do I just need to find a way to translate the points or will I have bigger problems? 

 

For example, imagine a 16x16x16 chunk with only 5 exposed faces for some odd reason, scattered throughout the chunk. When I declare a face in the vertex buffer/mesh, I need it to be entirely self-contained - the preceding point from one face should not break the next face to draw. How can I accomplish this?

 

When you draw the points, you can never "pick up your pencil" really. I need to draw a face, stop drawing, and then draw again somewhere else, all the while keeping that information in the vertex buffer. 

 

Maybe I am greatly overcomplicating this. Can I draw using my old method and somehow extract the raw vertex data from OpenGL *after* I have done all the glTranslatef() calls and my mesh is complete inside OpenGL's current frame? It would be slower on the first pass, due to all the push/translate/pop, but once everything is in place, I'd love to just snap a copy of it.

Edited by voodoodrul


I think your verticiesFront is totally wrong. Just think about the values: they run from -size*x to +size*x. I would just do size*x and size*(x+1). Or, if you really want the tile's center to be at zero: x*size - halfSize and x*size + halfSize.

 

Those positions are inside the chunk. And when you're rendering, you can translate each chunk by tileSize * numberOfTiles (it seems you're already doing that). It probably also helps readability if you keep your tile size at 1.

 

Here is some pseudo code:

// Update chunk
void update(chunk) {
    vertices[];
    foreach(tile) {
        if(face_front) {
            // I keep the tile size as 1 in this case, so the tile's start is at 0 and end is at 1
            // Let's create a quad for the current tile
            Vertex v0(tile.x,   tile.y,   tile.z+1);
            Vertex v1(tile.x+1, tile.y,   tile.z+1);
            Vertex v2(tile.x+1, tile.y+1, tile.z+1);
            Vertex v3(tile.x,   tile.y+1, tile.z+1);
            // And add it into the array
            vertices.add(v0);
            vertices.add(v1);
            vertices.add(v2);
            vertices.add(v3);
        }
    }
    // And finally, create a vertexbuffer from those vertices
    chunk.vbo = create_vbo(vertices);
}

void render(chunk) {
    // chunk.size means the number of tiles per one axis
    translate(chunk.position * chunk.size);
    render(chunk.vbo);
}

 


Can I draw using my old method and somehow extract the raw vertex data from OpenGL *after* I have done all the glTranslatef() calls and my mesh is complete inside OpenGL's current frame?

Yes, but I wouldn't suggest that, because you should use your own matrices. But just in case, it goes something like this in C:

float matrix[16]; 
glGetFloatv(GL_MODELVIEW_MATRIX, matrix);

The translation part is in the last column: matrix[12], matrix[13], matrix[14].
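In LWJGL 2 terms that would be roughly (a sketch for illustration only):

FloatBuffer modelview = BufferUtils.createFloatBuffer(16);
GL11.glGetFloat(GL11.GL_MODELVIEW_MATRIX, modelview);  //column-major, same layout as the C version
float tx = modelview.get(12);
float ty = modelview.get(13);
float tz = modelview.get(14);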

 

Btw, plural of vertex is vertices.

Edited by Sponji


Thanks Sponji. You've been a big help. 

 

After spending most of the night being frustrated, I just took a step back. The solution is now working well. I am using interleaved vertex, normal, color, and texcoord arrays, but I'm having trouble dealing with the degenerate vertices.

 

WorldChunk.buildMesh() calls 

 

verticies.add(Block.generate(i, j, k, EXPOSED_FACES, new float[]{0.2f, 1.0f, 0.2f}));

 

Block.generate(x,y,z,faces, colors) produces a vertex array including degenerates, like this: 

 

public static FloatBuffer generate(float x, float y, float z, boolean[] faces, float[] color) {
        float[][] cubeFaces = new float[][]{
            //Front face
            new float[]{
                //Vertex                Normals  Colors                        Texcoord
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f,
                (x), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 0f,
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f, // v0-v1-v2 (front)
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 1f,
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f, // v2-v3-v0
                (x+1), (y+1),(z+1),0,0,0,0,0,0,0,0
            },
            ...
            //back face
            new float[]{
                (x+1), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 1f,
                (x), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 1f,
                (x), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 0f,// v4-v7-v6 (back)
                (x), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 0f,
                (x+1), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 0f,
                (x+1), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y), (z),0,0,0,0,0,0,0,0
            }
        };
        int faceCount = 0;
        for (int i = 0; i < faces.length; i++) {
            if (faces[i] == true) {
                faceCount++;
            }
        }

        float[] values = new float[faceCount * 11 * 7];

        int ptr = 0;
        float[] degenerate = new float[11];   //store the previous vertex from some other face draw to use as our next degenerate
        boolean degen = false;                    //Have we processed an earlier face and created a degenerate ?
        for (int i = 0; i < faces.length; i++) {  //foreach face
            if (faces[i] == true) {                     //if this face is to be drawn
                float[] face = cubeFaces[i];     //get the vertex data for the face 
                
                for (int j = 0; j < face.length; j++) {    //Copy the vertex data into the return array
                    if (degen && j < 11) {                    //prepend the previous degenerate vertex for the next draw 
                        values[ptr] = degenerate[j];
                    } else {
                        values[ptr] = face[j];                //Otherwise just copy the face vertex data as-is
                    }
                    ptr++;
                    if (j > 66) {                                    //Store a degenerate vertex by copying the last interleaved vertex data from this draw
                        degenerate[j-66] = face[j];
                        degen = true;
                    }
                }
            }
        }
        return Util.getFloatBuffer(values);      //Return the final floatbuffer
    }

 
Are my degenerate vertices all wrong? I thought all I needed to do was declare the same x,y,z from a previous draw, so I could avoid the hacky stuff here like remembering the last vertex from an earlier face.
 
Can't I just stick the degenerate in the vertex data like this? 
 

            //Front face
            new float[]{
                //Vertex                Normals  Colors                        Texcoord
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f,
                (x), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 0f,
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f, // v0-v1-v2 (front)
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 1f,
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f, // v2-v3-v0
                (x+1), (y+1),(z+1),0,0,0,0,0,0,0,0    //degenerate
            },
Edited by voodoodrul


Okay, so I have tweaked this a bit: 
 

//Interleaved array - vertex3f, normal3f, color3f, u,v
            //bottom face
            new float[]{
                x, y, z, 0, 0, 0, 0, 0, 0, 0, 0,
                x, y, z, 0, -1, 0, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 0, -1, 0, color[0], color[1], color[2], 1, 1,
                x + 1, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 1, 0,// v7-v4-v3bottom
                x + 1, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 1, 0,
                x, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 0, 0,
                x, y, z, 0, -1, 0, color[0], color[1], color[2], 0, 1,// v3-v2-v7
                x, y, z, 0, 0, 0, 0, 0, 0, 0, 0
            },
            //back face
            new float[]{
                x + 1, y, z, 0, 0, 0, 0, 0, 0, 0, 0,
                x + 1, y, z, 0, 0, -1, color[0], color[1], color[2], 0, 1,
                x, y, z, 0, 0, -1, color[0], color[1], color[2], 1, 1,
                x, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 1, 0,// v4-v7-v6back
                x, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 1, 0,
                x + 1, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 0, 0,
                x + 1, y, z, 0, 0, -1, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 0, 0, 0, 0, 0, 0, 0, 0
            }

 
As you can see, each face begins with a degenerate and ends with one. I basically stick these faces together in any order, so I start drawing a chunk and draw block1->face1, block1->face3, block2->face2, ..., blockN->faceN.
 
I start drawing at offset one

GL11.glDrawArrays(GL11.GL_TRIANGLES, 1, this.numVerts);

 since I want to draw this array but I probably shouldn't start off by drawing a degenerate. 
 
The problem is that my degenerates are still wrong, or at least they are being toggled in ways I'm not expecting. They seem to be toggling on and off, which results in this trippy scene.
 

Screen_Shot_2013_07_03_at_6_opt.png

Edited by voodoodrul


This "degeneration" seems way too complicated; I'm not even sure what you're trying to achieve with it. Are you trying to copy the needed faces to another array, or what?
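One thing worth checking: with plain GL_TRIANGLES, every consecutive group of three vertices is its own independent triangle, so disjoint faces can just be appended one after another with no connecting vertices at all. If the goal is only to batch the visible faces into one buffer, a sketch like this should be enough (hypothetical names; 11 floats per interleaved vertex as in your arrays):

//visibleFaces is a hypothetical List<FloatBuffer>, one 66-float buffer per exposed face, built without degenerates
FloatBuffer mesh = Util.getFloatBuffer(visibleFaces.size() * 6 * 11);  //6 vertices per face, 11 floats each
for (FloatBuffer face : visibleFaces) {
    mesh.put(face);
}
mesh.flip();
//Upload with GL15.glBufferData(...) and draw with:
//GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, visibleFaces.size() * 6);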


Without the degeneration, I would have visible triangles drawn between disconnected faces of a block. If I have 3 blocks to draw, like this:

 

block1 -> front + back + right

block2 -> right + back + top

block3 -> front + left

 

Imagine trying to compute a closed, single connected triangle fan from that. It would be tricky. You'd need to start your first face on block 1, then stop drawing (degenerate vertex) and move the vertex array "pointer" to a back face corner and start drawing the back face. When you are done with that, you'll need to stop drawing (with a degenerate) and get a vertex pointer over to a corner on block2's right face (another degenerate), then commit to actually start drawing again with yet another degenerate vertex on block2's right face. 

 

It sounds overly complicated, but I don't believe it to be - I think this is how optimal meshes are output, and even Blender, when I draw a mesh like this, will output a similar mesh. Computing truly optimal meshes (optimal triangle stripping), however, is proven to be an NP-complete problem.

 

I did get it functional with something that looks like this:

            new float[]{
                //Vertex         Normals      Colors                            Texcoord
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0, //Degenerate reset
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0,
                x, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 0,
                x, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 1, // v0-v1-v2front
                x, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 1,
                x + 1, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 1,
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0, // v2-v3-v0
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0, //End this face
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0,}, //Degenerate reset
            //Right face
            new float[]{
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0, //Degenerate reset
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0,
                x + 1, y, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 1, 0, 0, color[0], color[1], color[2], 1, 1, // v0-v3-v4right
                x + 1, y, z, 1, 0, 0, color[0], color[1], color[2], 1, 1,
                x + 1, y + 1, z, 1, 0, 0, color[0], color[1], color[2], 1, 0,
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0, // v4-v5-v0
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0,//End this face
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0,}, //Degenerate reset

As you can see, there are several degens in there. Every time a face is drawn, I move the vertex "pointer" from a known bogus vertex - (Float.NaN, Float.NaN, Float.NaN) - and start drawing a new face with a real x,y,z. At the end of each face I restore the degenerate back to (Float.NaN, Float.NaN, Float.NaN). I need to ensure that no matter where I am in drawing blocks, I won't send OpenGL a bunch of duplicate values. If I used 0,0,0, for example, and I tried to draw a block actually at 0,0,0, this scheme would not work, because OpenGL would be fed even more degenerates and toggle drawing of those vertices off. The easiest thing to do is start and stop each face with degenerate vertices that you know will never actually need to be drawn and that won't collide with any actual drawn vertex in your mesh.

 

This all works fine at the moment, with a few caveats. Performance is excellent - I get over 300 fps on an Intel HD 4000 integrated card and over 2000 fps on my GTX 690 with about 20 million blocks in the scene. Of course those 20 million blocks become relatively few faces to actually draw, since most are completely concealed - maybe a couple hundred thousand faces actually being drawn.

 

 

I have one lingering problem that only occurs on Radeon cards. Notice these rogue faces:

radeon_solid.png

 

 

Each chunk is outlined in the red wires. Notice the "thrashing" of garbage vertex data near the origin of the chunk:

Radeon_wire.png

 

 

Whereas Intel and Nvidia cards render the scene as expected:  

 

nvidia_and_intel_1.png

 

nvidia_and_intel_2.png

Edited by voodoodrul

