voodoodrul

OpenGL
Optimization of many small glDrawElements() calls

10 posts in this topic

So I realize there are many better ways to do this in modern OpenGL. But here's what I have: 

class World
....
function render() {
   foreach(chunk)
        if (chunk.isReady && chunk.isVisible) {
              chunk.render()
        }
}

class WorldChunk
public int[][][] blocks;
...
function render() {
    if (DISPLAY_LIST != 0) {
         glCallList(DISPLAY_LIST);
    } else {
         buildDisplayList();
    }
}
function buildDisplayList() {
        DISPLAY_LIST = GL11.glGenLists(1);   //Store the handle so render() can call the list on later frames
        GL11.glNewList(DISPLAY_LIST, GL11.GL_COMPILE);
        pushVertexData();
        GL11.glEndList();  
}
function pushVertexData() {
        glPushMatrix();
        glTranslatef(x,y,z);

        glBindBuffer(GL_ARRAY_BUFFER, Block.vboVertexHandle);  //From a single static vbo stored in Block class - VBO built once on startup
        glVertexPointer(3, GL_FLOAT, 0, 0L);
        glBindBuffer(GL_ARRAY_BUFFER, Block.vboNormalHandle);  //From a single static vbo stored in Block class - VBO built once on startup
        glNormalPointer(GL_FLOAT, 0, 0L);
        
        GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
        GL11.glEnableClientState(GL11.GL_NORMAL_ARRAY);

        for (int i = 0; i < sizeX; i++) {
            for (int j = 0; j < sizeY; j++) {
                for (int k = 0; k < sizeZ; k++) {
                    //Determine exposed faces..
                    EXPOSED_FACES = determineExposedFaces(i,j,k);
                    //Contains boolean array [true, true, true, true, true, true] of faces to draw if they are not hidden
                    //
                    Block.render(i,j,k, EXPOSED_FACES);
                }
            }
        }
        glDisableClientState(GL_NORMAL_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
        glPopMatrix();
}


class Block
...
function render(int x, int y, int z, int type, boolean[] faces) {
if (faces[0] || faces[1] || faces[2] || faces[3] || faces[4] || faces[5]) {
            glPushMatrix();                                        
            glTranslatef(x + size, y + size, z + size);     
            if (faces[0]) {
                glDrawElements(GL_TRIANGLES, frontIndicies);
            }
            if (faces[1]) {
                glDrawElements(GL_TRIANGLES, rightIndicies);
            }
            if (faces[2]) {
                glDrawElements(GL_TRIANGLES, topIndicies);
            }
            if (faces[3]) {
                glDrawElements(GL_TRIANGLES, leftIndicies);
            }
            if (faces[4]) {
                glDrawElements(GL_TRIANGLES, bottomIndicies);
            }
            if (faces[5]) {
                glDrawElements(GL_TRIANGLES, backIndicies);
            }

            glPopMatrix();
        }
}
 

As you can see, I'm basically just storing a single VBO of the cube and using indices to decide which faces to actually draw. I know this is not an optimal way to do things. I am currently:

 

1) Rendering only those chunks within +/- X, +/- Y units of the camera

2) Doing efficient frustum culling to decide which chunks are outside the viewing area

3) Using display lists to "bake" a chunk's geometry

4) Using a single VBO for all cubes
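For reference, the hidden-face test that determineExposedFaces() performs in the listing above might look roughly like the sketch below. It is not the code from the post. It assumes that a block id of 0 means "empty", that anything outside the chunk counts as empty, and that faces are ordered front (+z), right (+x), top (+y), left (-x), bottom (-y), back (-z), as in Block.render(). The class and helper names are made up for illustration.

```java
import java.util.Arrays;

final class ExposedFaces {
    // Neighbour offsets in the assumed face order:
    // front (+z), right (+x), top (+y), left (-x), bottom (-y), back (-z).
    private static final int[][] NEIGHBOURS = {
        {0, 0, 1}, {1, 0, 0}, {0, 1, 0}, {-1, 0, 0}, {0, -1, 0}, {0, 0, -1}
    };

    // Assumption: id 0 is an empty cell, and cells outside the chunk are empty.
    static boolean isSolid(int[][][] blocks, int x, int y, int z) {
        if (x < 0 || y < 0 || z < 0) return false;
        if (x >= blocks.length || y >= blocks[0].length || z >= blocks[0][0].length) return false;
        return blocks[x][y][z] != 0;
    }

    // A face is exposed when the block is solid and the neighbour on that side is not.
    static boolean[] determineExposedFaces(int[][][] blocks, int x, int y, int z) {
        boolean[] faces = new boolean[6];
        if (!isSolid(blocks, x, y, z)) return faces; // empty cells draw nothing
        for (int f = 0; f < 6; f++) {
            int[] n = NEIGHBOURS[f];
            faces[f] = !isSolid(blocks, x + n[0], y + n[1], z + n[2]);
        }
        return faces;
    }

    public static void main(String[] args) {
        // Two solid blocks side by side along x: the shared face is hidden.
        int[][][] blocks = new int[2][1][1];
        blocks[0][0][0] = 1;
        blocks[1][0][0] = 1;
        System.out.println(Arrays.toString(determineExposedFaces(blocks, 0, 0, 0)));
    }
}
```

Treating out-of-chunk neighbours as empty means faces on chunk borders are always drawn; checking the adjacent chunk instead would hide more faces at the cost of extra lookups.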

 

In any given scene I might have only 20k-40k faces being drawn in actual on-GPU geometry, but even this is fairly slow. Even on a GTX 690, I can only sustain about 80fps with about 50k faces in the frame.

 

I know I should switch to shaders and do this the true modern way, but is there a fundamental concept I'm abusing and causing such poor performance? I know that performance is not substantially consumed elsewhere in the code based on profiling the app. For example, I hold at 1650 fps if I just comment out each of the glDrawElements() calls above so nothing is drawn. I must be overwhelming the card with poorly scheduled draw calls on tiny vertex arrays...

Edited by voodoodrul

Making too many draw calls is not a good thing; it would be much better to create a single vertex buffer from all of those cubes. Maybe use an update function that collects all the visible faces, the same way you're drawing them now, but pushes them into a vertex buffer instead. Then you could draw the whole vertex buffer in your render function with just one draw call.


Quote: Making too many draw calls is not a good thing; it would be much better to create a single vertex buffer from all of those cubes. Maybe use an update function that collects all the visible faces, the same way you're drawing them now, but pushes them into a vertex buffer instead. Then you could draw the whole vertex buffer in your render function with just one draw call.

Thanks Sponji. I guess I knew this, but I had read that one monolithic draw (an entire scene graph, for example) can be less efficient than smaller draw calls. What you suggest makes all kinds of sense.

 

I will revamp it to push the geometry into a buffer and render each chunk in one go. What do you suggest for pushing arbitrary amounts of data into the buffer? A dynamic array, pushing the ordered vertices one at a time? A fixed-size FloatBuffer of the "worst-case" chunk size, populating only what you need? Or is there a nicer way to push vertex data onto a vertex buffer as you encounter it? It seems the buffers are fixed size.
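One way to handle the "arbitrary amount of data" problem is to collect floats in a growable array on the CPU and only allocate the fixed-size FloatBuffer once the final count is known. The sketch below uses only java.nio; the FloatList class is hypothetical and not part of LWJGL. It produces a direct, native-ordered buffer, which is what LWJGL expects for uploads such as glBufferData.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Arrays;

final class FloatList {
    private float[] data = new float[1024];
    private int size = 0;

    // Appends any number of floats, doubling the backing array when it runs out.
    void add(float... values) {
        if (size + values.length > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, size + values.length));
        }
        System.arraycopy(values, 0, data, size, values.length);
        size += values.length;
    }

    int size() {
        return size;
    }

    // Copies the collected floats into a direct buffer sized exactly to the data,
    // flipped so that position is 0 and limit is the number of floats written.
    FloatBuffer toBuffer() {
        FloatBuffer buf = ByteBuffer.allocateDirect(size * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        buf.put(data, 0, size);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        FloatList vertices = new FloatList();
        vertices.add(0f, 0f, 1f,  1f, 0f, 1f,  1f, 1f, 1f); // one triangle
        System.out.println(vertices.toBuffer().remaining() + " floats ready for upload");
    }
}
```

Compared with a worst-case preallocated buffer, this trades a few array copies during mesh building for not having to reserve memory for faces that are never exposed.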


I'm not really sure which way would be best, but for chunks like these I've usually just uploaded the whole buffer again, and it has worked quite nicely. Of course it would be good to spend some time profiling the different methods, at least if it becomes a problem.


So I have started the process of moving all my rendering out to a single VBO on WorldChunk. 

 

However, I'm pretty baffled as to how I can push all the vertex data into a single FloatBuffer. It's a pretty basic question about OpenGL, but maybe if we talk through it it will make more sense. 

 

The way I was rendering my scene before relied on glTranslatef() to do all the heavy lifting to get my vertex data in position. World.render() would iterate over the WorldChunks and call render() on those. WorldChunk.render() would glTranslatef(x,y,z) into the chunk's position. Then WorldChunk.render() would iterate over the Blocks, and in turn each block would glTranslatef(x,y,z) to get the block into place. The nice thing about this is that I could draw faces without worrying about vertex data from an earlier face draw.

 

Now I need to replicate the same process on raw vertex data stored in a buffer. What I do now is let WorldChunk.generate() occur in its own thread - not using any OpenGL calls during this compile pass. The function needs to draw only those faces that are exposed and visible. I do this by determining which faces are exposed and pushing *only those faces* onto an ArrayList<FloatBuffer>. Once the list is done, I push all those individual FloatBuffers onto a single new FloatBuffer vbuffer. 

 

WorldChunk

--------------------

private void generate() {

     .....

     buildMesh();

}

 

 

public void buildMesh() {
        this.dynamicVertexData = new ArrayList<FloatBuffer>();

        for (int i = 0; i < sizeX; i++) {
            for (int j = 0; j < sizeY; j++) {
                for (int k = 0; k < sizeZ; k++) {
                    if (blocks[i][j][k] != 0) {
                            Block.render(i, j, k, 1, EXPOSED_FACES, wireframe, this);  //Writes each face as its own FloatBuffer in dynamicVertexData
                    }
                }
            }
        }

        this.vbuffer = Util.getFloatBuffer(this.dynamicVertexData.size() * 18);
        //Concatenate all the individual FloatBuffers into one big FloatBuffer
        for (int i = 0; i < this.dynamicVertexData.size(); i++) {
            this.vbuffer.put(this.dynamicVertexData.get(i));
        }
        this.vbuffer.flip();
}


    private void buildVBO() {
        vboVertexHandle = GL15.glGenBuffers();
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vboVertexHandle);
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, this.vbuffer, GL15.GL_STATIC_DRAW);
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
    }


    public void drawMesh(boolean wireframe) {
        GL11.glPushMatrix();
        GL11.glTranslatef(this.worldPositionX, 0, this.worldPositionZ);
        GL11.glPolygonMode(GL11.GL_FRONT, GL11.GL_FILL);

        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, this.vboVertexHandle);
        GL11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0L);

        GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);

        //Draw now
        if (this.dynamicVertexData.size() > 0) {
            GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, this.dynamicVertexData.size() * 6);
        }
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
        GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
        GL11.glPopMatrix();
    }

Block
---------------------------


    public static FloatBuffer verticiesFront(float x, float y, float z) {
        return Util.getFloatBuffer(new float[]{
                    (x * size), (y * size), (z * size), -(x * size), (y * size), (z * size), -(x * size), -(y * size), (z * size), // v0-v1-v2 (front)
                    -(x * size), -(y * size), (z * size), (x * size), -(y * size), (z * size), (x * size), (y * size), (z * size), // v2-v3-v0
                });
    }
.....

render(int x, int y, int z, int type, boolean[] faces, WorldChunk chunk) 

if (faces[0] || faces[1] || faces[2] || faces[3] || faces[4] || faces[5]) {
            if (faces[0]) {
                chunk.dynamicVertexData.add(verticiesFront(x, y, z));
            }
            if (faces[1]) {
                chunk.dynamicVertexData.add(verticiesRight(x, y, z));
            }
            if (faces[2]) {
                chunk.dynamicVertexData.add(verticiesTop(x, y, z));
            }
            if (faces[3]) {
                chunk.dynamicVertexData.add(verticiesLeft(x, y, z));
            }
            if (faces[4]) {
                chunk.dynamicVertexData.add(verticiesBottom(x, y, z));
            }
            if (faces[5]) {
                chunk.dynamicVertexData.add(verticiesBack(x, y, z));
            }
        }

 

Essentially I will use the same process of pushing points x,y,z into the buffer, one at a time, but I need to make sure that those x,y,z are effectively translated as they would be with glTranslatef(x,y,z).

 

Do I just need to find a way to translate the points or will I have bigger problems? 

 

For example, imagine a 16x16x16 chunk with only 5 exposed faces for some odd reason. They are scattered throughout the chunk. When I declare a face in the vertex buffer/mesh, I need it to be entirely self-contained - the last point of one face should not break the next face to draw. How can I accomplish this?

 

When you draw the points, you can never "pick up your pencil" really. I need to draw a face, stop drawing, and then draw again somewhere else, all the while keeping that information in the vertex buffer. 
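It may help to note that the "pencil" only stays down for strips and fans. In GL_TRIANGLES mode, which is what the glDrawArrays call above uses, every consecutive group of three vertices is an independent triangle, so two faces can sit anywhere in the buffer without being connected. The sketch below only illustrates that grouping rule on vertex indices; the class name is made up and no OpenGL calls are involved.

```java
import java.util.Arrays;

final class TriangleList {
    // With GL_TRIANGLES, triangle t is made of vertices 3t, 3t+1 and 3t+2.
    // Nothing is shared with the previous or next triangle.
    static int[][] triangles(int vertexCount) {
        int[][] tris = new int[vertexCount / 3][];
        for (int t = 0; t < tris.length; t++) {
            tris[t] = new int[]{3 * t, 3 * t + 1, 3 * t + 2};
        }
        return tris;
    }

    public static void main(String[] args) {
        int verticesPerFace = 6; // two triangles per quad face
        int faces = 2;           // e.g. two faces at opposite corners of a chunk
        for (int[] tri : triangles(faces * verticesPerFace)) {
            int faceOfFirst = tri[0] / verticesPerFace;
            int faceOfLast = tri[2] / verticesPerFace;
            String verdict = faceOfFirst == faceOfLast
                    ? "stays within face " + faceOfFirst
                    : "bridges two faces";
            System.out.println(Arrays.toString(tri) + " " + verdict);
        }
    }
}
```

As long as each face contributes exactly six vertices, no triangle ever spans two faces, so no separator vertices are needed between them.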

 

Maybe I am greatly overcomplicating this. Can I draw using my old method and somehow extract the raw vertex data from OpenGL *after* I have done all the glTranslatef() calls and my mesh is complete inside OpenGL's current frame? It would be slower on the first pass, due to all the push/translate/pop, but once everything is in place, I'd love to just snap a copy of it.

Edited by voodoodrul

I think your verticiesFront is totally wrong. Just think about the values: they range from -size*x to +size*x, so the face gets wider as the block index grows instead of staying one block wide. I would just use size*x and size*(x+1). Or, if you really want the tile's center to be at zero: x*size - halfSize and x*size + halfSize.

 

Those positions are inside the chunk. When you're rendering, you can translate each chunk by tileSize * numberOfTiles (it seems that you're already doing that). It probably also helps readability if you keep your tile size as 1.

 

Here is some pseudo code:

// Update chunk
void update(chunk) {
    vertices[];
    foreach(tile) {
        if(face_front) {
            // I keep the tile size as 1 in this case, so the tile's start is at 0 and end is at 1
            // Let's create a quad for the current tile
            Vertex v0(tile.x,   tile.y,   tile.z+1);
            Vertex v1(tile.x+1, tile.y,   tile.z+1);
            Vertex v2(tile.x+1, tile.y+1, tile.z+1);
            Vertex v3(tile.x,   tile.y+1, tile.z+1);
            // And add it into the array
            vertices.add(v0);
            vertices.add(v1);
            vertices.add(v2);
            vertices.add(v3);
        }
    }
    // And finally, create a vertexbuffer from those vertices
    chunk.vbo = create_vbo(vertices);
}

void render(chunk) {
    // chunk.size means the number of tiles per one axis
    translate(chunk.position * chunk.size);
    render(chunk.vbo);
}
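The pseudocode above emits four vertices per quad. Since the Java code earlier in the thread draws with GL_TRIANGLES, a version for that mode would emit six vertices (two triangles) per face. The following is a minimal sketch of the front face only, using the size*x to size*(x+1) range suggested above; FaceBuilder and frontFace are hypothetical names, and the winding is assumed to be counter-clockwise when viewed from +z.

```java
import java.util.Arrays;

final class FaceBuilder {
    // Front (+z) face of the block at grid cell (x, y, z), as two triangles.
    // Positions are relative to the chunk origin, so the chunk still needs
    // one translation at draw time, but individual blocks do not.
    static float[] frontFace(int x, int y, int z, float size) {
        float x0 = x * size, x1 = (x + 1) * size;
        float y0 = y * size, y1 = (y + 1) * size;
        float zf = (z + 1) * size;
        return new float[]{
            x0, y0, zf,   x1, y0, zf,   x1, y1, zf,   // v0-v1-v2
            x1, y1, zf,   x0, y1, zf,   x0, y0, zf    // v2-v3-v0
        };
    }

    public static void main(String[] args) {
        // Block at cell (2, 0, 5) with tile size 1.
        System.out.println(Arrays.toString(frontFace(2, 0, 5, 1f)));
    }
}
```

The other five faces follow the same pattern with a different fixed axis and winding order.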

 


Quote: Can I draw using my old method and somehow extract the raw vertex data from OpenGL *after* I have done all the glTranslatef() calls and my mesh is complete inside OpenGL's current frame?

Yes, but I wouldn't suggest that, because you should use your own matrices. But just in case, it goes something like this in C:

float matrix[16]; 
glGetFloatv(GL_MODELVIEW_MATRIX, matrix);

Translation part is in the last column, matrix[12], matrix[13], matrix[14].
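To make the indexing concrete: OpenGL hands the matrix back in column-major order, so element (row, col) lives at index col * 4 + row, and the fourth column occupies indices 12 to 15. The sketch below builds a translation matrix in that layout and applies it to a point in plain Java. The class and method names are illustrative only and no OpenGL calls are made.

```java
import java.util.Arrays;

final class ColumnMajor {
    // Builds a 4x4 translation matrix in column-major order.
    static float[] translation(float tx, float ty, float tz) {
        float[] m = new float[16];
        m[0] = m[5] = m[10] = m[15] = 1f;   // identity diagonal
        m[12] = tx;                          // fourth column holds the translation
        m[13] = ty;
        m[14] = tz;
        return m;
    }

    // Element at (row, col) of a column-major 4x4 matrix.
    static float at(float[] m, int row, int col) {
        return m[col * 4 + row];
    }

    // Applies m to the point (x, y, z, 1) and returns the transformed x, y, z.
    static float[] transformPoint(float[] m, float x, float y, float z) {
        float[] out = new float[3];
        for (int r = 0; r < 3; r++) {
            out[r] = at(m, r, 0) * x + at(m, r, 1) * y + at(m, r, 2) * z + at(m, r, 3);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] chunkOffset = translation(16f, 0f, 32f);
        System.out.println(Arrays.toString(transformPoint(chunkOffset, 1f, 2f, 3f)));
    }
}
```

Doing this multiplication on the CPU while building the mesh is the "use your own matrices" route: the translated positions go straight into the vertex buffer.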

 

Btw, plural of vertex is vertices.

Edited by Sponji

Thanks Sponji. You've been a big help. 

 

After spending most of the night being frustrated, I just took a step back. The solution is now working well. I am using interleaved vertex, normal, color and texcoord arrays, but I'm having trouble dealing with the degenerate vertices.
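For readers following along, the interleaved layout described here (3 position floats, 3 normal floats, 3 color floats, 2 texcoord floats) works out to 11 floats, or 44 bytes, per vertex. The sketch below just spells out that arithmetic; the constant names are made up. The stride and byte offsets are the values that would be passed to glVertexPointer, glNormalPointer, glColorPointer and glTexCoordPointer when all four attributes come from one VBO.

```java
final class InterleavedLayout {
    static final int POSITION_FLOATS = 3;
    static final int NORMAL_FLOATS = 3;
    static final int COLOR_FLOATS = 3;
    static final int TEXCOORD_FLOATS = 2;

    // 3 + 3 + 3 + 2 = 11 floats per vertex.
    static final int FLOATS_PER_VERTEX =
            POSITION_FLOATS + NORMAL_FLOATS + COLOR_FLOATS + TEXCOORD_FLOATS;

    // Distance in bytes from one vertex to the next: 11 * 4 = 44.
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * Float.BYTES;

    // Byte offset of each attribute from the start of a vertex.
    static final long POSITION_OFFSET = 0L;
    static final long NORMAL_OFFSET = POSITION_OFFSET + (long) POSITION_FLOATS * Float.BYTES;
    static final long COLOR_OFFSET = NORMAL_OFFSET + (long) NORMAL_FLOATS * Float.BYTES;
    static final long TEXCOORD_OFFSET = COLOR_OFFSET + (long) COLOR_FLOATS * Float.BYTES;

    public static void main(String[] args) {
        System.out.println("stride   = " + STRIDE_BYTES);
        System.out.println("normal   @ " + NORMAL_OFFSET);
        System.out.println("color    @ " + COLOR_OFFSET);
        System.out.println("texcoord @ " + TEXCOORD_OFFSET);
    }
}
```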

 

WorldChunk.buildMesh() calls 

 

verticies.add(Block.generate(i, j, k, EXPOSED_FACES, new float[]{0.2f, 1.0f, 0.2f}));

 

Block.generate(x,y,z,faces, colors) produces a vertex array including degenerates, like this: 

 

public static FloatBuffer generate(float x, float y, float z, boolean[] faces, float[] color) {
        float[][] cubeFaces = new float[][]{
            //Front face
            new float[]{
                //Vertex                Normals  Colors                        Texcoord
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f,
                (x), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 0f,
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f, // v0-v1-v2 (front)
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 1f,
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f, // v2-v3-v0
                (x+1), (y+1),(z+1),0,0,0,0,0,0,0,0
            },
            ...
            //back face
            new float[]{
                (x+1), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 1f,
                (x), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 1f,
                (x), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 0f,// v4-v7-v6 (back)
                (x), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 1f, 0f,
                (x+1), (y+1), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 0f,
                (x+1), (y), (z), 0f, 0f, -1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y), (z),0,0,0,0,0,0,0,0
            }
        };
        int faceCount = 0;
        for (int i = 0; i < faces.length; i++) {
            if (faces[i] == true) {
                faceCount++;
            }
        }

        float[] values = new float[faceCount * 11 * 7];

        int ptr = 0;
        float[] degenerate = new float[11];   //store the previous vertex from some other face draw to use as our next degenerate
        boolean degen = false;                    //Have we processed an earlier face and created a degenerate ?
        for (int i = 0; i < faces.length; i++) {  //foreach face
            if (faces[i] == true) {                     //if this face is to be drawn
                float[] face = cubeFaces[i];     //get the vertex data for the face 
                
                for (int j = 0; j < face.length; j++) {    //Copy the vertex data into the return array
                    if (degen && j < 11) {                    //prepend the previous degenerate vertex for the next draw 
                        values[ptr] = degenerate[j];
                    } else {
                        values[ptr] = face[j];                //Otherwise just copy the face vertex data as-is
                    }
                    ptr++;
                    if (j > 66) {                                    //Store a degenerate vertex by copying the last interleaved vertex data from this draw
                        degenerate[j-66] = face[j];
                        degen = true;
                    }
                }
            }
        }
        return Util.getFloatBuffer(values);      //Return the final floatbuffer
    }

 
Are my degenerate vertices all wrong? I thought all I needed to do was declare the same x,y,z from a previous draw, so that I could avoid hacky stuff like remembering the last vertex from an earlier face.
 
Can't I just stick the degenerate in the vertex data like this? 
 

            //Front face
            new float[]{
                //Vertex                Normals  Colors                        Texcoord
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f,
                (x), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 0f,
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f, // v0-v1-v2 (front)
                (x), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 0f, 1f,
                (x+1), (y),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 1f,
                (x+1), (y+1),(z+1), 0f, 0f, 1f, color[0], color[1], color[2], 1f, 0f, // v2-v3-v0
                (x+1), (y+1),(z+1),0,0,0,0,0,0,0,0    //degenerate
            },
Edited by voodoodrul

Okay, so I have tweaked this a bit: 
 

//Interleaved array - vertex3f, normal3f, color3f, u,v
            //bottom face
            new float[]{
                x, y, z, 0, 0, 0, 0, 0, 0, 0, 0,
                x, y, z, 0, -1, 0, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 0, -1, 0, color[0], color[1], color[2], 1, 1,
                x + 1, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 1, 0,// v7-v4-v3bottom
                x + 1, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 1, 0,
                x, y, z + 1, 0, -1, 0, color[0], color[1], color[2], 0, 0,
                x, y, z, 0, -1, 0, color[0], color[1], color[2], 0, 1,// v3-v2-v7
                x, y, z, 0, 0, 0, 0, 0, 0, 0, 0
            },
            //back face
            new float[]{
                x + 1, y, z, 0, 0, 0, 0, 0, 0, 0, 0,
                x + 1, y, z, 0, 0, -1, color[0], color[1], color[2], 0, 1,
                x, y, z, 0, 0, -1, color[0], color[1], color[2], 1, 1,
                x, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 1, 0,// v4-v7-v6back
                x, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 1, 0,
                x + 1, y + 1, z, 0, 0, -1, color[0], color[1], color[2], 0, 0,
                x + 1, y, z, 0, 0, -1, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 0, 0, 0, 0, 0, 0, 0, 0
            }

 
As you can see, each face begins with a degenerate and ends with one. I basically stick these faces together in any order, so I start drawing a chunk and draw block1->face1, block1->face 3, block2->face2, ..., blockN->faceN
 
I start drawing at offset one

GL11.glDrawArrays(GL11.GL_TRIANGLES, 1, this.numVerts);

 since I want to draw this array but I probably shouldn't start off by drawing a degenerate. 
 
The problem is that my degenerates are still wrong, or at least they are being toggled in ways I'm not expecting. They seem to be toggling on and off, which results in this trippy scene.
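One possible explanation for the on/off pattern is plain arithmetic. GL_TRIANGLES consumes vertices in groups of three starting at the first index, while each face in the listing above contributes eight vertices (one padding vertex, six real ones, one padding vertex). Eight is not a multiple of three, so the real vertices of most faces no longer start on a group boundary. The sketch below checks which faces line up; it assumes the 8-vertex layout and the starting offset of 1 from the glDrawArrays call, and the class and method names are hypothetical.

```java
final class StrideCheck {
    // True when the first real vertex of the given face starts a GL_TRIANGLES group.
    // Groups begin at first, first + 3, first + 6, and so on.
    static boolean faceAligned(int face, int verticesPerFace, int leadingPadding, int first) {
        int firstRealVertex = face * verticesPerFace + leadingPadding;
        return firstRealVertex >= first && (firstRealVertex - first) % 3 == 0;
    }

    public static void main(String[] args) {
        int verticesPerFace = 8; // 1 padding + 6 real + 1 padding, as in the listing
        int leadingPadding = 1;
        int first = 1;           // second argument of glDrawArrays in the post
        for (int face = 0; face < 6; face++) {
            System.out.println("face " + face + ": "
                    + (faceAligned(face, verticesPerFace, leadingPadding, first)
                        ? "aligned" : "misaligned"));
        }
    }
}
```

Under these assumptions only every third face is grouped correctly, which would look like faces switching on and off. With exactly six vertices per face and a starting offset of 0, every face is aligned.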
 

Screen_Shot_2013_07_03_at_6_opt.png

Edited by voodoodrul

This "degeneration" seems way too complicated; I'm not even sure what you're trying to achieve with it. Are you trying to copy the needed faces to another array, or what?


Quote: This "degeneration" seems way too complicated; I'm not even sure what you're trying to achieve with it. Are you trying to copy the needed faces to another array, or what?

Without the degeneration, I would have visible triangles drawn between disconnected faces of a block. If I have 3 blocks to draw, like this:

 

block1 -> front + back + right

block2 -> right + back + top

block3 -> front + left

 

Imagine trying to compute a closed, single connected triangle fan from that. It would be tricky. You'd need to start your first face on block 1, then stop drawing (degenerate vertex) and move the vertex array "pointer" to a back face corner and start drawing the back face. When you are done with that, you'll need to stop drawing (with a degenerate) and get a vertex pointer over to a corner on block2's right face (another degenerate), then commit to actually start drawing again with yet another degenerate vertex on block2's right face. 

 

It sounds overly complicated, but I don't believe it to be - I think this is how optimal meshes are output and even Blender, when I draw a mesh like this, will output a similar mesh. Computing truly optimal meshes, however, is proven to be an NP-complete problem. 

 

I did get it functional with something that looks like this:

            new float[]{
                //Vertex         Normals      Colors                            Texcoord
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0, //Degenerate reset
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0,
                x, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 0,
                x, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 1, // v0-v1-v2front
                x, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 0, 1,
                x + 1, y, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 1,
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0, // v2-v3-v0
                x + 1, y + 1, z + 1, 0, 0, 1, color[0], color[1], color[2], 1, 0, //End this face
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0,}, //Degenerate reset
            //Right face
            new float[]{
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0, //Degenerate reset
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0,
                x + 1, y, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 1,
                x + 1, y, z, 1, 0, 0, color[0], color[1], color[2], 1, 1, // v0-v3-v4right
                x + 1, y, z, 1, 0, 0, color[0], color[1], color[2], 1, 1,
                x + 1, y + 1, z, 1, 0, 0, color[0], color[1], color[2], 1, 0,
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0, // v4-v5-v0
                x + 1, y + 1, z + 1, 1, 0, 0, color[0], color[1], color[2], 0, 0,//End this face
                Float.NaN, Float.NaN, Float.NaN, 0, 0, 0, 0, 0, 0, 0, 0,}, //Degenerate reset

As you can see, there are several degenerates in there. Every time a face is drawn, I move the vertex "pointer" from a known bogus vertex - (Float.NaN, Float.NaN, Float.NaN) - and start drawing a new face with a real x,y,z. At the end of each face I reset the degenerate back to (Float.NaN, Float.NaN, Float.NaN), so that no matter where I am in drawing blocks, I won't send OpenGL a bunch of duplicate values. If I used 0,0,0 as the bogus vertex, for example, and then tried to draw a block that really is at 0,0,0, this scheme would not work, because OpenGL would be fed even more degenerates and toggle drawing of those vertices off. The easiest thing to do is start and stop each face with degenerate vertices that you know will never actually need to be drawn and that won't collide with any real vertex in your mesh.

 

This all works fine at the moment, with a few caveats. Performance is excellent - I get over 300fps on an Intel HD 4000 integrated GPU and over 2000fps on my GTX 690 with about 20 million blocks in the scene. Of course, those 20 million blocks become relatively few faces to actually draw since most are completely concealed - maybe a couple hundred thousand faces actually being drawn.

 

 

I have one lingering problem that only occurs on Radeon cards. Notice these rogue faces:

radeon_solid.png

 

 

Each chunk is outlined in the red wires. Notice the "thrashing" of garbage vertex data near the origin of the chunk:

Radeon_wire.png

 

 

Whereas Intel and Nvidia cards render the scene as expected:  

 

nvidia_and_intel_1.png

 

nvidia_and_intel_2.png
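A possible cause of the vendor difference, offered as a guess rather than a diagnosis: the OpenGL specification leaves the result of passing NaN to a GL command unspecified, so drivers are free to handle NaN positions differently. Since GL_TRIANGLES needs no separators between faces, one way to sidestep the question is to pack exactly six finite vertices per face and nothing else. The sketch below does that for the 11-float interleaved layout used in this thread; FacePacker and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class FacePacker {
    static final int FLOATS_PER_VERTEX = 11; // position 3, normal 3, color 3, texcoord 2
    static final int VERTICES_PER_FACE = 6;  // two triangles
    static final int FLOATS_PER_FACE = FLOATS_PER_VERTEX * VERTICES_PER_FACE;

    // Concatenates faces into one array, rejecting padded faces and non-finite values.
    static float[] pack(List<float[]> faces) {
        float[] out = new float[faces.size() * FLOATS_PER_FACE];
        int ptr = 0;
        for (float[] face : faces) {
            if (face.length != FLOATS_PER_FACE) {
                throw new IllegalArgumentException("expected " + FLOATS_PER_FACE + " floats per face");
            }
            for (float v : face) {
                if (Float.isNaN(v) || Float.isInfinite(v)) {
                    throw new IllegalArgumentException("non-finite value in vertex data");
                }
                out[ptr++] = v;
            }
        }
        return out;
    }

    // Number of vertices to pass as the count argument of a draw call.
    static int vertexCount(float[] packed) {
        return packed.length / FLOATS_PER_VERTEX;
    }

    public static void main(String[] args) {
        List<float[]> faces = new ArrayList<>();
        faces.add(new float[FLOATS_PER_FACE]);
        faces.add(new float[FLOATS_PER_FACE]);
        float[] packed = pack(faces);
        System.out.println(vertexCount(packed) + " vertices, multiple of 3: "
                + (vertexCount(packed) % 3 == 0));
    }
}
```

The resulting vertex count is always a multiple of three, so drawing from offset 0 keeps every triangle inside a single face.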

Edited by voodoodrul
