few large polys vs many small polys

Just to clarify, is the problem that when you draw more pixels, it takes more time?
I have one plane being rendered with 2 tris.
Regardless of the texture I use (small or large), I get the same FPS switching between the two.

If I change the width and height to occupy more space on the screen, FPS drops. If I use a large texture on a small area poly, I still get high FPS.
Quote:Original post by chibitotoro0_0
If I change the width and height to occupy more space on the screen, FPS drops. If I use a large texture on a small area poly, I still get high FPS.
Rendering a quad is an O(n^2) operation in its side length n, because the cost scales with the number of pixels it covers. A 100x100 pixel quad is 10,000 times more expensive than a 1x1 pixel quad, and a 1000x1000 pixel quad is 100 times more expensive than the 100x100 quad.
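As a rough back-of-the-envelope check (my own sketch, not from any post in this thread), that scaling can be tabulated directly in C, using just the side lengths mentioned above:

/* Toy illustration: fill cost grows with covered pixels, i.e. O(n^2) in side length. */
#include <stdio.h>

int main(void) {
    int sides[] = { 1, 100, 1000 };
    for (int i = 0; i < 3; ++i) {
        long pixels = (long)sides[i] * sides[i];   /* pixels the quad covers */
        printf("%4d x %-4d quad covers %7ld pixels\n", sides[i], sides[i], pixels);
    }
    return 0;
}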
Quote:Original post by Gorax
The only reason I could understand the increase in performance simply by switching from GL_LINEAR to GL_NEAREST would be if you're not using mipmaps.
I'm guessing this is running on a software-implemented version of GL1.1. Bilinear texture filtering is *really* slow in software compared to nearest-neighbour filtering.

Without running your app on the target hardware, you're not going to be able to get much meaningful performance data.

As it is, assuming you're currently using a software OpenGL renderer, it makes perfect sense that drawing more pixels is slower than drawing fewer pixels, and fetching bilinearly filtered texture samples is way slower than fetching un-filtered samples.

Compare the pseudo-calculations for a large quad with texture filtering:
for( int p=0; p != 786432; ++p ){          // 786432 = 1024*768 pixels covered by the large quad
  w  = tc*tsize;                           // texture coordinate scaled into texel space
  t0 = floor(w);                           // top-left texel of the 2x2 footprint
  w3 = w-t0;                               // fractional position inside that footprint
  w2 = vec2(w3.x,1-w3.y);
  w1 = vec2(1-w3.x,w3.y);
  w0 = 1-w3;                               // the four bilinear weights
  t1 = t0 + vec2(0,1);
  t2 = t0 + vec2(1,0);
  t3 = t0 + vec2(1,1);                     // the other three texels of the footprint
  out = tex[t0]*w0.x*w0.y + tex[t1]*w1.x*w1.y + tex[t2]*w2.x*w2.y + tex[t3]*w3.x*w3.y;
}
...with the pseudo-calculations for a small quad without filtering:
for( int p=0; p != 10000; ++p ){   // 100x100 pixels covered by the small quad
  t0 = floor(tc*tsize);            // nearest texel, no weighting
  out = tex[t0];                   // a single fetch per pixel
}
If this is running in a simulator, without any specialised hardware assistance, then the top loop is obviously far slower than the bottom one: it runs nearly 80 times as many iterations, and each iteration does several times the work.
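To make that comparison concrete, here is a minimal, self-contained C sketch along the same lines as the pseudo-code above (my own illustration, not the poster's code; the texture size and sampling pattern are made-up assumptions chosen only to mirror the 786432- and 10000-pixel loop counts):

#include <stdio.h>
#include <time.h>

#define TEX_SIZE 256
static float tex[TEX_SIZE][TEX_SIZE];   /* dummy texture, contents irrelevant here */

/* One fetch per pixel. */
static float sample_nearest(float u, float v) {
    int x = (int)(u * (TEX_SIZE - 1));
    int y = (int)(v * (TEX_SIZE - 1));
    return tex[y][x];
}

/* Four fetches plus weighting per pixel. */
static float sample_bilinear(float u, float v) {
    float fx = u * (TEX_SIZE - 2), fy = v * (TEX_SIZE - 2);
    int x0 = (int)fx, y0 = (int)fy;
    float wx = fx - x0, wy = fy - y0;
    return tex[y0][x0]         * (1 - wx) * (1 - wy)
         + tex[y0][x0 + 1]     * wx       * (1 - wy)
         + tex[y0 + 1][x0]     * (1 - wx) * wy
         + tex[y0 + 1][x0 + 1] * wx       * wy;
}

int main(void) {
    const int large = 786432;   /* ~1024x768 pixels, the "large quad" above */
    const int small = 10000;    /* 100x100 pixels, the "small quad" above */
    volatile float sink = 0;    /* keeps the compiler from discarding the loops */

    clock_t t0 = clock();
    for (int p = 0; p < large; ++p)
        sink += sample_bilinear((p % 1024) / 1024.0f, (p / 1024) / 768.0f);
    clock_t t1 = clock();
    for (int p = 0; p < small; ++p)
        sink += sample_nearest((p % 100) / 100.0f, (p / 100) / 100.0f);
    clock_t t2 = clock();

    printf("bilinear, large quad: %.3f ms\n", 1000.0 * (t1 - t0) / CLOCKS_PER_SEC);
    printf("nearest,  small quad: %.3f ms\n", 1000.0 * (t2 - t1) / CLOCKS_PER_SEC);
    return 0;
}

On a typical CPU the first loop should come out far slower, for the same two reasons given above: many more iterations, and more work per iteration.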
You're using OpenGL ES 1.1 and OpenGL ES 2.0. This is NOT the same as OpenGL 1.1 and OpenGL 2.0, so be careful with that difference.
It might not be important, but Wikipedia mentions a difference in the way triangles are rendered, compared to what you might expect:

http://en.wikipedia.org/wiki/PowerVR#Technology

In short: they don't rasterize the triangles as they come, but store them until all triangles are committed, sort them into cells, and render each cell independently of the others, in what seems to be pretty close to raytracing. They call this tile-based deferred rendering.
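For anyone unfamiliar with the idea, here is a rough conceptual sketch of the binning step in C (my own illustration, not PowerVR's actual implementation; the screen size, tile size and limits are made up): triangles are collected first, bucketed into the screen tiles their bounding boxes overlap, and each tile is later shaded using only its own list.

#include <stdio.h>

#define SCREEN_W 320
#define SCREEN_H 480
#define TILE     32
#define TILES_X  (SCREEN_W / TILE)
#define TILES_Y  (SCREEN_H / TILE)
#define MAX_PER_TILE 64

typedef struct { float min_x, min_y, max_x, max_y; } Tri;  /* bounding box only */

static int tile_count[TILES_Y][TILES_X];
static int tile_list [TILES_Y][TILES_X][MAX_PER_TILE];

/* Pass 1: bin every submitted triangle into the tiles its bounding box overlaps. */
static void bin_triangle(int id, const Tri *t) {
    int tx0 = (int)(t->min_x / TILE), tx1 = (int)(t->max_x / TILE);
    int ty0 = (int)(t->min_y / TILE), ty1 = (int)(t->max_y / TILE);
    for (int ty = ty0; ty <= ty1 && ty < TILES_Y; ++ty)
        for (int tx = tx0; tx <= tx1 && tx < TILES_X; ++tx)
            if (tile_count[ty][tx] < MAX_PER_TILE)
                tile_list[ty][tx][tile_count[ty][tx]++] = id;
}

int main(void) {
    Tri tris[] = { { 0, 0, 100, 100 }, { 200, 300, 319, 479 } };
    for (int i = 0; i < 2; ++i)
        bin_triangle(i, &tris[i]);
    /* Pass 2 (not shown): each tile is rasterised and shaded on-chip using only
       its own list, so pixel work is deferred until all geometry is known. */
    printf("tile (0,0) holds %d triangle(s)\n", tile_count[0][0]);
    return 0;
}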

What's funny is that this should show the exact opposite behavior of what you described...
Thanks for all the replies!

Yes, it seems right now that it indeed is an O(n^2) operation.

With my previous app, rendering on the iPhone did yield similar results to my computer, but my computer at that time was using an NVIDIA 8600GT and my unibody now uses an NVIDIA 9600GT, both of which are much more powerful than the iPhone 3G's video processing and even the A4 chip in the upcoming devices.

So is it OK to assume that all OpenGL ES 1.1 stuff is software-based rendering, since the 3G obviously doesn't have a discrete video processor like the 3GS or the A4 chip in the 4G?

