[iPhone] Optimizing a long loop

11 comments, last by wodinoneeye 14 years, 1 month ago
Quote:Original post by Headkaze
Another optimization I've been considering is changing from GL_TRIANGLES to GL_TRIANGLE_STRIP. Is it relatively easy to convert the data over to this format?
Art tools can do it, but it is probably not recommended.

First off, you would be much better off with indexed triangles rather than drawing individual triangles. That way you only transmit the points once and reuse them. If you are transmitting raw triangles every frame you are probably sending 3X the necessary data.
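To make the indexed idea concrete, here is a minimal sketch in plain C of the same quad stored both ways. The `Vec3` layout and all the names are assumptions for illustration, not anything from the original project. In OpenGL ES 1.1 the indexed arrays would normally be drawn with `glDrawElements` (with `GL_UNSIGNED_SHORT` indices) instead of `glDrawArrays`.

```c
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;   /* hypothetical vertex layout */

/* A quad as raw triangles: 2 triangles * 3 vertices, shared corners repeated. */
const Vec3 quad_raw[6] = {
    {0, 0, 0}, {1, 0, 0}, {1, 1, 0},
    {0, 0, 0}, {1, 1, 0}, {0, 1, 0}
};

/* The same quad indexed: each corner stored once, triangles refer to it by index. */
const Vec3 quad_vertices[4] = { {0, 0, 0}, {1, 0, 0}, {1, 1, 0}, {0, 1, 0} };
const unsigned short quad_indices[6] = { 0, 1, 2,  0, 2, 3 };

/* Bytes of vertex data for a raw triangle list. */
size_t raw_bytes(size_t triangle_count)
{
    return triangle_count * 3 * sizeof(Vec3);
}

/* Bytes of vertex data plus 16-bit indices for an indexed triangle list. */
size_t indexed_bytes(size_t vertex_count, size_t triangle_count)
{
    return vertex_count * sizeof(Vec3)
         + triangle_count * 3 * sizeof(unsigned short);
}

/* Expand an indexed list back into raw triangles, to show both describe the same mesh. */
void expand_indexed(const Vec3 *vertices, const unsigned short *indices,
                    size_t index_count, Vec3 *out)
{
    for (size_t i = 0; i < index_count; ++i)
        out[i] = vertices[indices[i]];
}
```

For an animated mesh the index array never changes, so only the unique vertices need to be interpolated and re-sent each frame. The saving on a single quad is small; it grows on real meshes where each vertex is shared by several triangles.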

I'm not sure about the iPhone's various GPU editions, but most modern cards perform better with "triangle soup" rather than triangle strips. The parallel processing in the card means that each sub-processor can be more efficient if you let it do its own optimization rather than telling it a specific order to draw the strips.

Quote:I'm not sure moving it will help either, that part of the forum seems quite dead.
It is very active but low volume.

Many people watch it. Few professionals post questions to it because of the console makers' NDA terms. They post to the official private groups instead.

When questions are asked, however, there are relatively more industry professionals lurking on the forum. The answers are generally more useful and targeted directly to the hardware's needs.
I know you've "solved it" but you could optimise this simple loop by removing all the aliasing. I would code the loop something like this:
while(vertexCount--)
{
    fp32 fCx = pCurrent->x;
    fp32 fCy = pCurrent->y;
    fp32 fCz = pCurrent->z;
    fp32 fNx = pNext->x;
    fp32 fNy = pNext->y;
    fp32 fNz = pNext->z;
    pDest->x = fCx + (fNx - fCx) * fAlpha;
    pDest->y = fCy + (fNy - fCy) * fAlpha;
    pDest->z = fCz + (fNz - fCz) * fAlpha;
    ++pCurrent;
    ++pNext;
    ++pDest;
}
I hope this helps?
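For anyone wondering what "removing the aliasing" buys: the compiler cannot normally prove that pDest points to different memory than pCurrent or pNext, so after each store it may have to assume the inputs changed. Loading everything into locals first is one way around that. Another, assuming a C99 compiler, is the `restrict` qualifier. The following is only a sketch of that alternative; the `Vec3` struct, the `lerp_vertices` name and the `fp32` typedef are my assumptions, not the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef float fp32;                      /* assumed: fp32 is a 32-bit float */
typedef struct { fp32 x, y, z; } Vec3;   /* hypothetical vertex layout */

/* Linear interpolation of whole vertex arrays: dest = current + (next - current) * alpha.
   'restrict' promises the compiler that the three arrays do not overlap,
   so it is free to keep loaded values in registers across the stores. */
void lerp_vertices(Vec3 *restrict dest,
                   const Vec3 *restrict current,
                   const Vec3 *restrict next,
                   size_t count, fp32 alpha)
{
    assert(count == 0 || (dest && current && next));

    for (size_t i = 0; i < count; ++i) {
        const fp32 cx = current[i].x;
        const fp32 cy = current[i].y;
        const fp32 cz = current[i].z;

        dest[i].x = cx + (next[i].x - cx) * alpha;
        dest[i].y = cy + (next[i].y - cy) * alpha;
        dest[i].z = cz + (next[i].z - cz) * alpha;
    }
}
```

Note that the `restrict` promise is binding: calling this with dest pointing into the same array as current or next (interpolating in place) would be undefined behaviour, so the explicit-locals version is the safer choice if the buffers can ever overlap.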
Quote:Original post by Rompa
I know you've "solved it" but you could optimise this simple loop by removing all the aliasing. I would code the loop something like this:
while(vertexCount--)
{
    fp32 fCx = pCurrent->x;
    fp32 fCy = pCurrent->y;
    fp32 fCz = pCurrent->z;
    fp32 fNx = pNext->x;
    fp32 fNy = pNext->y;
    fp32 fNz = pNext->z;
    pDest->x = fCx + (fNx - fCx) * fAlpha;
    pDest->y = fCy + (fNy - fCy) * fAlpha;
    pDest->z = fCz + (fNz - fCz) * fAlpha;
    ++pCurrent;
    ++pNext;
    ++pDest;
}
I hope this helps?





If the compiler isn't up to snuff, then ordering the operations better for register use might be a tiny bit faster (and set the compiler options for speed over code size). A SIMD solution along the lines of SSE would probably be a lot better, but I'm not sure if the ARM used has anything like that.



fp32 fCx = pCurrent->x;
pDest->x = fCx + (pNext->x - fCx) * fAlpha;

fp32 fCy = pCurrent->y;
pDest->y = fCy + (pNext->y - fCy) * fAlpha;

fp32 fCz = pCurrent->z;
pDest->z = fCz + (pNext->z - fCz) * fAlpha;



Another possibility, if the animations are repetitive (as in a walk cycle), is to cache each interpolated frame of the repeated sequence so that it only has to be calculated once.
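A rough sketch of that caching idea, assuming the blend factor can be quantized to a fixed number of steps between two keyframes. Everything here (`PoseCache`, `pose_cache_get`, the sizes) is made up for illustration, and it covers only a single keyframe pair; a real walk cycle would need one cache per pair.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { float x, y, z; } Vec3;   /* hypothetical vertex layout */

enum { MAX_VERTS = 64, STEPS = 8 };       /* hypothetical limits */

typedef struct {
    Vec3 poses[STEPS][MAX_VERTS];   /* one interpolated pose per quantized step */
    unsigned char valid[STEPS];     /* 1 once poses[step] has been filled in */
    size_t vertex_count;
    unsigned computed;              /* number of poses actually interpolated */
} PoseCache;

void pose_cache_init(PoseCache *cache, size_t vertex_count)
{
    assert(vertex_count <= MAX_VERTS);
    memset(cache->valid, 0, sizeof cache->valid);
    cache->vertex_count = vertex_count;
    cache->computed = 0;
}

/* Returns the pose for step/STEPS of the way from 'current' to 'next',
   interpolating it only the first time that step is requested. */
const Vec3 *pose_cache_get(PoseCache *cache, const Vec3 *current,
                           const Vec3 *next, unsigned step)
{
    assert(step < STEPS);

    if (!cache->valid[step]) {
        const float alpha = (float)step / (float)STEPS;
        Vec3 *dest = cache->poses[step];
        for (size_t i = 0; i < cache->vertex_count; ++i) {
            dest[i].x = current[i].x + (next[i].x - current[i].x) * alpha;
            dest[i].y = current[i].y + (next[i].y - current[i].y) * alpha;
            dest[i].z = current[i].z + (next[i].z - current[i].z) * alpha;
        }
        cache->valid[step] = 1;
        ++cache->computed;
    }
    return cache->poses[step];
}
```

The trade-off is memory for time: STEPS copies of the mesh are kept around, and the animation is snapped to STEPS positions between keyframes rather than being perfectly smooth.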


Another solution (if the objects are often at a distance) is to use a simpler model (LOD, level of detail) when they are far enough away that the missing detail won't matter.
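Picking the LOD can be as simple as comparing the distance against a few thresholds. A minimal sketch, with a hypothetical `select_lod` helper and threshold values that would have to be tuned per model:

```c
#include <stddef.h>

/* Returns the LOD level for an object at 'distance': 0 is the most detailed
   model, and each threshold crossed moves one level coarser.
   'thresholds' must be sorted in ascending order. */
size_t select_lod(float distance, const float *thresholds, size_t threshold_count)
{
    size_t level = 0;
    while (level < threshold_count && distance >= thresholds[level])
        ++level;
    return level;
}
```

With thresholds of, say, 10 and 50 units, anything closer than 10 uses the full model, 10 up to 50 uses the middle one, and everything beyond uses the coarsest. Fewer vertices in the distant models means fewer iterations of the interpolation loop as well as less data sent to the GPU.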

--------------------------------------------
Ratings are Opinion, not Fact

This topic is closed to new replies.
