OpenGL ES 2.0 Android slow rendering times

17 comments, last by Krypt0n 6 years, 1 month ago

Hi, I am having a problem: I am drawing 4000 squares on screen using VBOs and IBOs, but the framerate on my Huawei P9 is only 24 FPS. Considering it has an 8-core CPU and a pretty powerful GPU, I would expect it to be capable of drawing 4000 textured squares at 60 FPS.

I checked DDMS and found that most of the time was spent in the put() method of the FloatBuffer. The strange thing is that if I draw these squares outside of the view frustum, the FPS increases, and I'm not using frustum culling.

If you have any ideas what could be causing this, please share them with me. Thank you in advance.


This is speculation based on how I've read your post, but hopefully it's helpful.

EDIT: I'm guessing this is the same project as the "Packed VBO w/ indexing" post, so I removed the less relevant stuff. 

Regarding put(): The way I'm interpreting this is that you're filling FloatBuffers and sending new vertices for each object every frame. If that's the case, then you're spending a lot of unnecessary time filling the buffers and sending them to the GPU. A square is a square: the same 1-unit-to-a-side vertex buffer can be created exactly once and then used for all your draws by setting the square's position/scale/rotation in the vertex shader using uniforms. In your case, this also means you'll probably want to tint the squares with a single uniform for color instead of vertex colors (unless having different-colored corners is very important to your project).

For now, this means moving your FloatBuffer usage and glBufferData calls to only happen once during startup if they aren't already. When you're in your draw loop, you can now just call glBindBuffer, glUniform, and glDrawElements. 
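
A minimal sketch of the "build it once" part, in plain Java so it can run without a GL context. The class name UnitQuad and the x, y, u, v layout are assumptions made for illustration, not taken from the original project. The GL calls appear only as comments showing where the one-time upload would go.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.ShortBuffer;

/** Hypothetical helper: geometry for one unit quad, built once at startup. */
final class UnitQuad {
    // Interleaved x, y, u, v for the four corners of a 1x1 quad centred on the origin.
    static final float[] VERTICES = {
        -0.5f, -0.5f, 0f, 0f,
         0.5f, -0.5f, 1f, 0f,
         0.5f,  0.5f, 1f, 1f,
        -0.5f,  0.5f, 0f, 1f,
    };

    // Two triangles sharing the 0-2 diagonal.
    static final short[] INDICES = { 0, 1, 2, 0, 2, 3 };

    static final int FLOATS_PER_VERTEX = 4;
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * Float.BYTES;

    /** Direct, native-order buffer holding the vertex data, rewound to position 0. */
    static FloatBuffer vertexBuffer() {
        FloatBuffer fb = ByteBuffer.allocateDirect(VERTICES.length * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        fb.put(VERTICES);
        fb.position(0);
        return fb;
    }

    /** Direct, native-order buffer holding the index data, rewound to position 0. */
    static ShortBuffer indexBuffer() {
        ShortBuffer sb = ByteBuffer.allocateDirect(INDICES.length * Short.BYTES)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
        sb.put(INDICES);
        sb.position(0);
        return sb;
    }

    // On Android, the upload would then happen once (for example in onSurfaceCreated):
    //   GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, fb.capacity() * 4, fb, GLES20.GL_STATIC_DRAW);
    //   GLES20.glBufferData(GLES20.GL_ELEMENT_ARRAY_BUFFER, sb.capacity() * 2, sb, GLES20.GL_STATIC_DRAW);
}
```

Note that this approach trades buffer uploads for one set of uniform updates and one draw call per square, so whether it is a net win for 4000 squares is something to measure rather than assume.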

3 hours ago, EddieK said:

I checked DDMS and found that most of the time was spent in the put() method of the FloatBuffer

This sounds like you are updating the vertices of the quads on a frame-by-frame basis. Why? What are you trying to accomplish?

3 hours ago, EddieK said:

the strange thing is that if I'm drawing these squares outside of the view frustum, the FPS increases.

This makes it sound as if you are fillrate bound, which is probably caused by having a lot of overlapping quads, causing the same pixels to be rasterised many times.

Rendering performance typically isn't a linear function of the number of triangles rendered. It's also some function of the number of pixels that have to be rasterised, the complexity of shaders, the number of distinct textures/shaders, etc.
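
To get a rough feel for how much overdraw a scene produces, a back-of-the-envelope estimate can help. The sketch below assumes every quad is fully on screen and has the same size. The helper name and the example numbers (64x64 pixel quads on a 1920x1080 screen) are hypothetical, not measurements from the original poster's app.

```java
/** Hypothetical helper for a rough overdraw estimate. */
final class OverdrawEstimate {
    /**
     * Average number of times each screen pixel gets shaded, assuming every
     * quad is fully visible on screen and all quads have the same size.
     */
    static double overdrawFactor(int quadCount, int quadWidthPx, int quadHeightPx,
                                 int screenWidthPx, int screenHeightPx) {
        long shadedPixels = (long) quadCount * quadWidthPx * quadHeightPx;
        long screenPixels = (long) screenWidthPx * screenHeightPx;
        return (double) shadedPixels / screenPixels;
    }
}
```

With those example numbers, overdrawFactor(4000, 64, 64, 1920, 1080) comes out at roughly 7.9, meaning the fragment shader would run about eight times per screen pixel each frame. That is consistent with the FPS going up when the quads are moved off screen, since off-screen quads generate no fragments.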

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

As swiftcoder says, it is most likely you are fillrate bound. You may also be calling OpenGL in a sub-optimal way but that is less likely to be causing the 24fps.

Android (and mobile devices in general) tends to have far lower fill rates than desktop (and also use tiled rendering). This is afaik partly to keep power use low and conserve battery. A lot of thought has to go into optimizing shaders, overdraw, transparency etc to minimize fill. Many techniques that you might take for granted on desktop are not feasible on mobile. A lot of devices just about have the horsepower to fill the screen once, and that's it.

I would highly advise you to get hold of several older, low end devices to develop / test on, as your lowest targets will give you a better idea of bottlenecks as you go along, and certain shaders will malfunction / simply not work on some devices. Precision in shaders in particular becomes a key issue in my experience. You may also have to introduce different codepaths / shaders depending on device caps.

https://www.facebook.com/permalink.php?story_fbid=1923486231219218&id=100006735798590

https://www.gamedev.net/blogs/entry/2264243-android-build-and-performance/

To add to what others have mentioned, the description you gave is lacking the information others would need to suggest a solution.

1. How was timing done? I keep mentioning this in every other beginner post: FPS is NOT a good performance metric. Give us absolute clock time, meaning seconds, milliseconds, or nanoseconds.
2. Where or what is FloatBuffer, and how is it implemented?
3. What do your shaders look like?
4. What does your rendering pass look like?

Too many unknowns, is what I'm trying to get at.
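
As an illustration of the first point, here is a small sketch of measuring frame time in milliseconds instead of FPS. The FrameTimer class is hypothetical. It takes timestamps as parameters so it can be fed System.nanoTime() from wherever the frame callback lives (for example at the top of onDrawFrame).

```java
/** Hypothetical helper: accumulates frame-to-frame intervals in nanoseconds. */
final class FrameTimer {
    private boolean started = false;
    private long lastNanos = 0;
    private long totalNanos = 0;
    private int frames = 0;

    /** Call once per frame, passing System.nanoTime(). */
    void tick(long nowNanos) {
        if (started) {
            totalNanos += nowNanos - lastNanos;
            frames++;
        }
        started = true;
        lastNanos = nowNanos;
    }

    /** Mean frame time in milliseconds over all recorded intervals, or 0 if none yet. */
    double averageMillis() {
        return frames == 0 ? 0.0 : totalNanos / 1e6 / frames;
    }

    /** Converts a frames-per-second figure into milliseconds per frame. */
    static double fpsToMillis(double fps) {
        return 1000.0 / fps;
    }
}
```

For reference, 24 FPS corresponds to about 41.7 ms per frame and 60 FPS to about 16.7 ms, so the gap to close here is roughly 25 ms per frame. A whole-frame number like this does not separate CPU time from GPU time; timing individual sections (buffer filling, upload, draw calls) with the same approach narrows it down further.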

Additionally, I guess your app window is the size of the screen. Depending on whether depth testing is enabled, the fragment shader work could be killing your framerate.

My shaders look like this:

Fragment shader:


precision lowp float;

uniform sampler2D texture;

varying vec4 outColor;
varying vec2 outTexCoords;
varying vec3 outNormal;

void main()
{
    vec4 color = texture2D(texture, outTexCoords) * outColor;
    gl_FragColor = vec4(color.r,color.g,color.b,color.a);
}

Vertex shader:


uniform mat4 MVPMatrix; // model-view-projection matrix
uniform mat4 projectionMatrix;

attribute vec4 position;
attribute vec2 textureCoords;
attribute vec4 color;
attribute vec3 normal;

varying vec4 outColor;
varying vec2 outTexCoords;
varying vec3 outNormal;

void main()
{
    outNormal = normal;
    outTexCoords = textureCoords;
    outColor = color;
    gl_Position = MVPMatrix * position;
}

My rendering code:


    public void bind(){
        int stride = (2 + 3 + 4) * 4;

        vertexBuffer.put(vertexArray, 0, vertexCount);
        indexBuffer.put(indexArray, 0, indexCount);

        this.vertexBuffer.position(0);
        this.indexBuffer.position(0);

        GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, buffers[0]);
        GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, vertexBytesAdded,
                vertexBuffer, GLES20.GL_STATIC_DRAW);

        ShaderAttributes attributes = graphicsSystem.getShader().getAttributes();
        GLES20.glVertexAttribPointer(attributes.getAttributeID(Attribute.Position), dimensions, GLES20.GL_FLOAT, false, stride, 0);
        attributes.enableAttribute(Attribute.Position);

        GLES20.glVertexAttribPointer(attributes.getAttributeID(Attribute.Color), 4, GLES20.GL_FLOAT, false, stride, 3 * 4);
        attributes.enableAttribute(Attribute.Color);

        GLES20.glVertexAttribPointer(attributes.getAttributeID(Attribute.TextureCoords), 2, GLES20.GL_FLOAT, false, stride, (4 + 3) * 4);
        attributes.enableAttribute(Attribute.TextureCoords);

        GLES20.glBindBuffer(GLES20.GL_ELEMENT_ARRAY_BUFFER, buffers[1]);
        GLES20.glBufferData(GLES20.GL_ELEMENT_ARRAY_BUFFER, indexBytesAdded,
                indexBuffer, GLES20.GL_STATIC_DRAW);

        vertexBytesAdded = 0;
        indexBytesAdded = 0;
        vertexCount = 0;
        indexCount = 0;
    }

    public void draw(int mode, int count){
        if(hasIndices){
            GLES20.glDrawElements(GLES20.GL_TRIANGLES, count, GLES20.GL_UNSIGNED_SHORT, 0);
        }else{
            GLES20.glDrawArrays(mode, 0, count);
        }

    }

 

Is there anything I can improve to increase the framerate? I'm really out of ideas here, it seems like I tried everything and I still can't get past 3000 triangles without the framerate dropping below 60. I read somewhere that MALI-400 is capable of 30 million polygons per second and I am getting only around 500,000.

8 minutes ago, EddieK said:

I'm really out of ideas here, it seems like I tried everything

You are still going to have to explain to us why you appear to be rewriting your vertex buffers every frame. That's going to be using a significant chunk of bandwidth, and causing some fun pipeline stalls.

Can you show us a screenshot of a rendered frame? It's hard to talk about this without seeing how much overdraw you have, how much depth testing is going on, etc.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

3 minutes ago, swiftcoder said:

You are still going to have to explain to us why you appear to be rewriting your vertex buffers every frame. That's going to be using a significant chunk of bandwidth, and causing some fun pipeline stalls.

Can you show us a screenshot of a rendered frame? It's hard to talk about this without seeing how much overdraw you have, how much depth testing is going on, etc.

Well, because I need to update the vertex positions and their colors every frame. How else am I supposed to update the vertices? Sorry, but I am a newbie when it comes to OpenGL.

Screenshot_20180222-225508.png

I figured out that I need to use glBufferSubData() instead. Tried it, but it was even slower.
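
If the vertex data really does have to change every frame, there are still a few things that can be taken out of the per-frame path. The sketch below is one possible arrangement, not the original project's code. The QuadBatch class and its method names are hypothetical. It keeps the x, y, z, r, g, b, a, u, v layout implied by the stride in the bind() method above. The index pattern for quads never changes, so it is built once. Vertices are written into a plain float array and copied with a single bulk put. The GL upload is shown only as a comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical batcher: CPU-side staging for many quads, uploaded once per frame. */
final class QuadBatch {
    static final int FLOATS_PER_VERTEX = 3 + 4 + 2; // position, color, texture coords
    static final int FLOATS_PER_QUAD = 4 * FLOATS_PER_VERTEX;

    private final float[] scratch;    // plain array that is written element by element
    private final FloatBuffer buffer; // direct buffer that the GL upload would read
    private int quadCount = 0;

    QuadBatch(int maxQuads) {
        scratch = new float[maxQuads * FLOATS_PER_QUAD];
        buffer = ByteBuffer.allocateDirect(scratch.length * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Index pattern for maxQuads quads. It is fixed, so it can be uploaded once at startup. */
    static short[] buildIndices(int maxQuads) {
        short[] idx = new short[maxQuads * 6];
        for (int q = 0; q < maxQuads; q++) {
            int v = q * 4;
            int i = q * 6;
            idx[i]     = (short) v;
            idx[i + 1] = (short) (v + 1);
            idx[i + 2] = (short) (v + 2);
            idx[i + 3] = (short) v;
            idx[i + 4] = (short) (v + 2);
            idx[i + 5] = (short) (v + 3);
        }
        return idx;
    }

    /** Appends one axis-aligned square centred on (cx, cy) with a single tint color. */
    void addQuad(float cx, float cy, float size, float r, float g, float b, float a) {
        float h = size * 0.5f;
        int o = quadCount * FLOATS_PER_QUAD;
        o = putVertex(o, cx - h, cy - h, r, g, b, a, 0f, 0f);
        o = putVertex(o, cx + h, cy - h, r, g, b, a, 1f, 0f);
        o = putVertex(o, cx + h, cy + h, r, g, b, a, 1f, 1f);
        putVertex(o, cx - h, cy + h, r, g, b, a, 0f, 1f);
        quadCount++;
    }

    private int putVertex(int o, float x, float y,
                          float r, float g, float b, float a, float u, float v) {
        scratch[o++] = x; scratch[o++] = y; scratch[o++] = 0f;
        scratch[o++] = r; scratch[o++] = g; scratch[o++] = b; scratch[o++] = a;
        scratch[o++] = u; scratch[o++] = v;
        return o;
    }

    /** Copies this frame's quads into the direct buffer with one bulk put and resets the batch. */
    FloatBuffer flush() {
        buffer.clear();
        buffer.put(scratch, 0, quadCount * FLOATS_PER_QUAD);
        buffer.flip();
        quadCount = 0;
        // On Android, a single upload would follow here, for example:
        //   GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, buffer.remaining() * 4,
        //                       buffer, GLES20.GL_STREAM_DRAW);
        return buffer;
    }
}
```

Since the data is respecified every frame, GL_STREAM_DRAW or GL_DYNAMIC_DRAW describes the usage more accurately than GL_STATIC_DRAW, although how much a driver acts on that hint varies. None of this reduces fill cost, so if the scene is fillrate bound as suggested earlier in the thread, the gains from faster uploads may be small.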

