OpenGL ES analysis

Started by
6 comments, last by sienaiwun 7 years, 6 months ago


To stress the vertex shader, I'd render a high polygon mesh (if you can't get access to one, then make something in code by subdividing a tetrahedron or icosahedron to make a sphere perhaps). Then render it multiple times, but use the scissor rectangle so that you only rasterize a tiny amount of the screen.
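A minimal sketch of the subdivision idea, in case no high-polygon mesh is at hand. It starts from an icosahedron and splits every triangle into four per pass, pushing the new vertices back onto the unit sphere. The function names (`normalize`, `icosphere`) are illustrative, not from any library, and the output would still need to be packed into a vertex/index buffer for the actual test.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length so it lies on the unit sphere."""
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / length, v[1] / length, v[2] / length)

def icosphere(subdivisions):
    """Return (vertices, faces) for a sphere made by subdividing an icosahedron."""
    t = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio
    raw = [(-1, t, 0), (1, t, 0), (-1, -t, 0), (1, -t, 0),
           (0, -1, t), (0, 1, t), (0, -1, -t), (0, 1, -t),
           (t, 0, -1), (t, 0, 1), (-t, 0, -1), (-t, 0, 1)]
    vertices = [normalize(v) for v in raw]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]

    for _ in range(subdivisions):
        midpoint_cache = {}  # shared edges must reuse the same midpoint vertex

        def midpoint(a, b):
            key = (min(a, b), max(a, b))
            if key not in midpoint_cache:
                va, vb = vertices[a], vertices[b]
                mid = tuple((p + q) / 2.0 for p, q in zip(va, vb))
                vertices.append(normalize(mid))
                midpoint_cache[key] = len(vertices) - 1
            return midpoint_cache[key]

        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # One triangle becomes four: three corners plus the centre.
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces

    return vertices, faces
```

Each pass multiplies the triangle count by four, so a handful of passes is enough to get a mesh with tens of thousands of triangles.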

With your fragment shader test, can I just check that each of your screen quads has transparency? If not, then mobile hardware often has hidden-surface-removal which will invalidate your test as only the top quad will actually be rendered.

I have no idea how you would measure the cost of rasterization separately from the fragment shader, or even whether trying to do so makes any sense.

Thanks @Columbo for your reply.

With my fragment shader test, I render with depth writes and the depth test turned off. I think that is a similar way to get a massive amount of overdraw, like transparent rendering.

With your vertex shader approach, can I instead project the vertices to a single fixed point so that they are not rasterized?

I also need to measure the cost of passing varyings between the vertex shader and the fragment shader, and whether bandwidth is the bottleneck. Are there any ideas for that?

Turning depth writes/tests off is not enough on the tile-based hardware typical of mobile devices. I believe both PowerVR and Mali chips (probably Qualcomm too) perform hidden surface removal in hardware, which will eliminate the hidden layers even if there's no depth buffer at all.

I would imagine that projecting vertices to a tiny point would work.

Measuring the cost of varyings will be hard. Apart from anything else, you're limited to 8 on some devices, so it's difficult to really stress them in isolation.

For any decent analysis, you would need access to the hardware counters for each device. Rendering a quad with a null VS/FS just gives a high-level view of what's going on, so if that's the end goal, there's no need to go any further. Even then, those tests are highly subjective, since there are other variables in play that may skew your results from one architecture to the next.

For isolating vertex shader performance I would suggest:

  • You need a large data set, in a vertex buffer in GPU memory, which can be submitted in a single draw call.
  • Use points rather than triangles to simplify (and therefore minimize the overhead of) primitive assembly, clipping, culling, etc.
  • Transform each point to outside of the view frustum so that it's discarded by the pipeline as soon as possible after the vertex shader, and no subsequent shader stages run.
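The last bullet can be modelled on the CPU to check the arithmetic. This is only a sketch: it assumes the usual OpenGL clip volume (-w <= x, y, z <= w), and `push_beyond_far_plane` is a hypothetical helper showing what a test vertex shader might do to its output, not an existing API. The x, y and w components are left as computed so the transform still feeds the result; a shader compiler may still drop work whose result ends up unused, so the generated code is worth checking where the tools allow it.

```python
def outside_clip_volume(x, y, z, w):
    """True if a clip-space position fails the test -w <= x, y, z <= w."""
    inside = (-w <= x <= w) and (-w <= y <= w) and (-w <= z <= w)
    return not inside

def push_beyond_far_plane(x, y, z, w):
    """Hypothetical helper: keep x, y, w from the real transform but
    replace z so the position always lies past the far clip plane."""
    return (x, y, 2.0 * abs(w) + 1.0, w)

# Example: a position that would normally be visible is moved out of view.
visible = (0.1, -0.2, 0.3, 1.0)
moved = push_beyond_far_plane(*visible)
```

In a real shader the same idea would be applied to `gl_Position` after the normal matrix multiply.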

That will give you a reasonably accurate number, but I'd suggest that the number you get is actually useless. It has no bearing whatsoever on the kind of performance you'll get in a real-world program, and the mention of pipeline stages you optimize or skip above should hint why: because a real-world program won't be optimizing or skipping these stages, and will therefore have extra load on both the CPU and GPU that your test doesn't measure.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Thanks @Columbo for your reply.

I do my fragment shader test with alpha blending on, although that may involve an unnecessary blending operation after the fragment shader.

Thanks @cgrant and @mhagain for your kind replies.

I know it might make little sense to measure a shader in this naive way, and that it is difficult to isolate the different stages. But since the actual performance characteristics are closely guarded secrets of the chip vendors, this is the only way I can come up with. I took the idea from "In-Depth Performance Analyses of DirectX 9 Shading Hardware" in ShaderX3.

I know this kind of measurement is old-fashioned, but are there better ways to give artists a coarse estimate of how fine the geometry can be and the maximum number of characters a mobile device can support? There are several third-party measurement tools, such as GPU ShaderAnalyzer from AMD or ShaderPerf2 from NVIDIA, but they are not friendly to mobile GPUs.
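For the coarse artist budget, the arithmetic could look something like the sketch below. Everything here is an assumption for illustration: the function name, the `safety_factor` parameter, and the example numbers are made up, and the measured throughput from an isolated test will overstate what a real scene achieves, which is why the result is scaled down.

```python
def max_characters(vertices_per_second, target_fps,
                   vertices_per_character, safety_factor=0.5):
    """Rough upper bound on characters per frame from a measured
    vertex throughput. safety_factor (an assumed fudge factor)
    discounts the synthetic measurement for real-world overhead."""
    vertices_per_frame = vertices_per_second * safety_factor / target_fps
    return int(vertices_per_frame // vertices_per_character)

# Made-up inputs: 60M vertices/s measured, 30 fps target, 10k vertices each.
budget = max_characters(60_000_000, 30, 10_000)
```

This only bounds the vertex side; fragment cost, bandwidth and CPU overhead would each need their own estimate.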

This topic is closed to new replies.
