iPhone Game Optimizations. Ultimate Guide
#1 Members - Reputation: 167
Posted 09 October 2012 - 05:07 AM
You can find it here.
Feel free to post your proposals, bug reports and feedback ;)
P.S. English isn't my native language, so I will be thankful for any comments about my writing style.
#3 Crossbones+ - Reputation: 5187
Posted 09 October 2012 - 11:04 AM
L. Spiro
I spent most of my life learning the courage it takes to go out and get what I want. Now that I have it, I am not sure exactly what it is that I want. - L. Spiro 2013
L. Spiro Engine: http://lspiroengine.com
L. Spiro Engine Forums: http://lspiroengine.com/forums
#7 Staff - Reputation: 8927
Posted 09 October 2012 - 08:33 PM
- Jason Astle-Adams.
#9 Members - Reputation: 348
Posted 14 October 2012 - 09:01 AM
One tip that sounds a bit funny:
"Driver will pad your NPOT textures to next biggest POT value"
I have never heard of anything like that. Just wondering if that's really true.
I did hear of a different bug regarding NPOT: if you have an NPOT width that isn't a multiple of 4, it will allocate too much memory.
#11 Members - Reputation: 348
Posted 14 October 2012 - 02:40 PM
That's crazy. I thought the whole purpose of NPOT textures was to save memory.
I meant that your NPOT texture of 480x480x32, which will eat 700kb of your client memory, will be padded to 512x512x32 by the driver and will eat 800kb of "video" memory...
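To put numbers on the padding claim, here is a minimal sketch of the arithmetic, assuming the driver really does round each dimension up to the next power of two as described in this thread, and assuming an uncompressed texture with no mipmaps. The helper names (next_pot, texture_bytes, padded_texture_bytes) are made up for illustration and are not part of any iOS or OpenGL ES API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round v up to the next power of two (returns v if it already is one).
   Hypothetical helper; valid for 0 < v <= 2^31. */
static uint32_t next_pot(uint32_t v) {
    assert(v > 0 && v <= 0x80000000u);
    uint32_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Size in bytes of an uncompressed texture without mipmaps.
   Assumes bits_per_pixel is a multiple of 8. */
static size_t texture_bytes(uint32_t width, uint32_t height,
                            uint32_t bits_per_pixel) {
    return (size_t)width * height * (bits_per_pixel / 8);
}

/* Size in bytes if both dimensions are padded to the next power of two
   (the behaviour claimed in this thread, not a documented guarantee). */
static size_t padded_texture_bytes(uint32_t width, uint32_t height,
                                   uint32_t bits_per_pixel) {
    return texture_bytes(next_pot(width), next_pot(height), bits_per_pixel);
}
```

With these helpers, a 480x480 texture at 32 bits per pixel comes out at 921,600 bytes (900 KiB) unpadded and 1,048,576 bytes (1 MiB) when padded to 512x512. At 24 bits per pixel the same texture is 691,200 bytes versus 786,432 bytes, which is closer to the 700kb and 800kb figures quoted above. Either way the padding costs roughly 14% extra.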
#13 Crossbones+ - Reputation: 5187
Posted 23 November 2012 - 02:22 AM
StiX already answered but I wanted to give some more detail and an idea of how critical this is.
A few slides were a bit ambiguous. Do you want to align to 4 bytes manually, or let the GPU do it for you?
When you call glDrawElements() or glDrawArrays() there are a few things that can cause it to take a slower path which causes it to copy your entire vertex buffer to a new location in “GPU” RAM (of course there is no such thing in a unified memory model but it is easier to think of memory managed by the driver as GPU RAM).
One way is to simply not use VBOs. Another way is to pass misaligned data (attributes not aligned to 4 bytes).
These copies obviously involve a lot of extra cycles, even though they use an optimized memcpy() when possible (they can’t when realignment is necessary). To give you an idea of just how much overhead that is, in an average game it means the difference between 20 and 45 FPS.
Going into extreme detail, if you benchmark with Time Profiler and you see a function called glDraw[Arrays|Elements]_ACC_ES2Exec() taking a large amount of time, check your vertex alignments.
If you see glDraw[Arrays|Elements]_IMM_ES2Exec() taking a lot of time then your problem is likely the lack of a VBO.
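As a rough illustration of what "aligned to 4 bytes" means for an interleaved vertex, here is a small sketch in plain C. The struct layouts and the attribute_is_aligned() helper are hypothetical examples, not taken from the slides. The check also requires the stride to be a multiple of 4, which is my own assumption: otherwise an attribute that is aligned in the first vertex would drift off alignment in the following ones. The offsets assume a typical ABI where float is 4 bytes with 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout where every attribute starts on a 4-byte boundary. */
typedef struct {
    float   position[3]; /* offset 0,  12 bytes */
    uint8_t color[4];    /* offset 12, RGBA: 4 bytes instead of 3 */
    int16_t uv[2];       /* offset 16, 4 bytes */
} AlignedVertex;         /* stride: 20 bytes */

/* Hypothetical layout with a problem: normal starts at offset 15. */
typedef struct {
    float   position[3]; /* offset 0  */
    uint8_t color[3];    /* offset 12, only 3 bytes */
    int8_t  normal[3];   /* offset 15, not a multiple of 4 */
} MisalignedVertex;

/* Returns nonzero if an attribute at this offset, in a buffer with this
   stride, starts on a 4-byte boundary in every vertex. */
static int attribute_is_aligned(size_t offset, size_t stride) {
    return (offset % 4 == 0) && (stride % 4 == 0);
}

/* Sanity check to run once at startup in debug builds. */
static void check_aligned_vertex_layout(void) {
    const size_t stride = sizeof(AlignedVertex);
    assert(attribute_is_aligned(offsetof(AlignedVertex, position), stride));
    assert(attribute_is_aligned(offsetof(AlignedVertex, color), stride));
    assert(attribute_is_aligned(offsetof(AlignedVertex, uv), stride));
}
```

These are the same offset and stride values you would hand to glVertexAttribPointer() for each attribute, so a check like this can catch a bad layout before it ever reaches the driver. Padding color out to 4 bytes costs nothing here, because the compiler rounds the struct size up to 20 bytes in both layouts anyway.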
L. Spiro
Edited by L. Spiro, 23 November 2012 - 02:37 AM.