Losing 15 fps doesn't tell us much. Dropping from 1000 fps to 985 fps is a huge meh, and 15 fps dropping to 0 fps is the other extreme. I know you're somewhere in the middle, but no clue where.
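To put rough numbers on why a raw fps delta says so little, here's a small sketch that converts a drop in frame rate into extra milliseconds per frame. The function names are made up for illustration, and the 60 to 45 fps case is just one example of a mid-range drop.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def added_cost_ms(fps_before, fps_after):
    """Extra time per frame implied by a drop from fps_before to fps_after."""
    return frame_time_ms(fps_after) - frame_time_ms(fps_before)


# The same "15 fps lost" means very different per-frame costs.
print(round(added_cost_ms(1000, 985), 3))  # 0.015 (ms per frame)
print(round(added_cost_ms(60, 45), 2))     # 5.56 (ms per frame)
```

Frame time is usually the more useful unit when comparing optimizations, because costs in milliseconds add up linearly while fps does not.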
If your texture is actually 1024x768 instead of 1024x1024, make it 1024x1024 (feel free to atlas with something else if you can) and compress it as much as you're willing for some speedup. If you are using a "bad" texture format, change to a proper texture format (probably PVRTC). If you're using OpenGL ES 1.x instead of 2, use OES_draw_texture. Use the lowest color precision you find acceptable. If you can, do multitexture instead of multipass. Basically, just follow the best practices Apple tells you in their docs. I'm sure I left stuff out.
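For a feel of what the format choice means in memory terms, here's a rough sketch that computes the size of a single 1024x1024 texture in a few common formats. The format labels are just dictionary keys chosen for this example, not GL enums, and it only counts the base level (no mipmaps). It estimates storage only; how much that helps the frame rate depends on where the bottleneck actually is.

```python
# Bits per pixel for a few common OpenGL ES texture formats.
# The keys are labels for this example, not real GL constants.
BITS_PER_PIXEL = {
    "RGBA8888": 32,
    "RGBA4444": 16,
    "RGB565": 16,
    "LUMINANCE_ALPHA": 16,
    "LUMINANCE": 8,
    "PVRTC_4BPP": 4,
    "PVRTC_2BPP": 2,
}


def texture_bytes(width, height, fmt):
    """Size in bytes of the base mip level only."""
    return width * height * BITS_PER_PIXEL[fmt] // 8


for fmt in BITS_PER_PIXEL:
    print(fmt, texture_bytes(1024, 1024, fmt) // 1024, "KiB")
```

For a 1024x1024 texture this works out to 4096 KiB for RGBA8888 against 512 KiB for 4 bpp PVRTC.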
I don't think 32x32 textures are faster than 1024x1024. If you really can't atlas with something else to fill out the 1024x1024, maybe 3 512x512 would be faster, but I'm not sure since you're trading one performance area for another and it will probably be dependent on what you're doing.
And that recommendation to avoid as much of the transparency as possible is great advice. Try varying numbers of polygons to see where the sweet spot is, but I wouldn't be surprised if you're better off with 100+ polygons to minimize the amount of transparent overlay pixels to render rather than a single fullscreen quad. Still use as few textures as possible though, even if you have a lot of polygons.
1. My game has to run at a solid 60fps to remain playable. Without the overlay image, I get that. 45fps is not tolerable for my game due to the internal mechanics I've chosen to use. Sorry, should have said that earlier.
2. As I stated earlier, I tried POT-ing the texture, and it had zero impact. It's a fill-rate issue. I'm going to try compressing it with PVRTC soon, because I'm using RGBA8, which I know isn't optimal, especially since the image is black and white. I'm using OpenGL ES 2.0 right now, and now that I think about it, I did forget to switch the precision back to low, thanks. I read the Apple docs, learned quite a bit from them, and implemented the relevant techniques to speed up my code, but none of them address my particular issue.
3. No, that's not what I was saying. What I was saying was similar to your 4th paragraph: use rows of 32x32 quads and don't render the fully transparent area.
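The idea of drawing only the tiles that contain something visible can be sketched roughly like this. It assumes the overlay's alpha channel is available on the CPU as a list of rows; `visible_tiles` and the test mask are hypothetical names and data made up for the example. The resulting rectangles would then be turned into quads, ideally all in one draw call from one texture.

```python
def visible_tiles(alpha, tile=32):
    """Return (x, y, w, h) rectangles for tiles containing any non-zero alpha.

    alpha is a list of rows, each a list of 0-255 alpha values.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0
    rects = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            # Clamp tiles on the right and bottom edges to the image size.
            h = min(tile, height - ty)
            w = min(tile, width - tx)
            if any(alpha[y][x]
                   for y in range(ty, ty + h)
                   for x in range(tx, tx + w)):
                rects.append((tx, ty, w, h))
    return rects


# Hypothetical 128x64 overlay: opaque 32-pixel strip on the left, rest transparent.
mask = [[255 if x < 32 else 0 for x in range(128)] for y in range(64)]
rects = visible_tiles(mask)
covered = sum(w * h for _, _, w, h in rects)
print(len(rects), "tiles,", covered, "of", 128 * 64, "pixels drawn")
```

With this mask only 2 of the 8 tiles survive, so a quarter of the pixels get blended compared to a single full-size quad. Smaller tiles cull more transparent area at the cost of more vertices, which is the sweet spot worth experimenting with.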
The tile-based architectures of most mobile GPUs have a different set of optimal use cases compared to more traditional GPUs. Check the advice in Performance Tuning for Tile-Based Architectures from the OpenGL Insights book to make sure you're not doing anything that triggers poor performance.
I saw that article once before, and forgot about it. This time, I'll try reading it more thoroughly.
but another reason I'm asking this is because I'd also like to release another game that uses post-processing effects on mobile devices.
You should aim for devices later than the iPad 1, then. It's almost 4 years old now.
It all depends on what else you do of course, but I wouldn't plan on anything too fancy in the way of full-screen post effects on anything more than 1-2 years old.
With the ES3 devices coming out now, performance is starting to reach levels where you can do some pretty fancy stuff, but they are still very far from desktop.
That's not that strange considering their difference in size and power consumption though!
Yes, well said. I do plan on aiming for only iOS7 compatible devices in the future (iPad1 is all I have right now, hence the reason why I'm using it).
I do own an OpenGL ES 3.0 compatible device now, a 2nd Gen Nexus 7 which does nicely. Haven't ported my game to Android yet.
How often is the texture being updated?
-If every frame, are you uploading the entire texture every frame even when parts of the texture are not modified?
What format is the texture data in?
-If it's a fat format like RGBA, do you actually need 4 channels? Would a 2-channel or 1-channel texture suffice?
There are a multitude of reasons why your texture upload could be so slow, but the solution probably boils down to taking a step back and looking at the requirements. I've seen that you have tried a few optimizations that haven't given you the desired result. Without more information, like texture format and update frequency, it's kinda difficult to give a concrete answer.
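If the texture did turn out to be dynamic, the two questions above (how much gets uploaded, and how many channels) can be sketched as a quick calculation. The idea is to merge the changed regions into one bounding rectangle and upload only that, for example with glTexSubImage2D, instead of the whole image. The helper names and the dirty regions below are invented for illustration.

```python
def union_rect(rects):
    """Smallest (x, y, w, h) rectangle covering every rectangle in rects."""
    x0 = min(x for x, y, w, h in rects)
    y0 = min(y for x, y, w, h in rects)
    x1 = max(x + w for x, y, w, h in rects)
    y1 = max(y + h for x, y, w, h in rects)
    return (x0, y0, x1 - x0, y1 - y0)


def upload_bytes(rect, bytes_per_pixel):
    """Bytes transferred when uploading rect at the given pixel size."""
    x, y, w, h = rect
    return w * h * bytes_per_pixel


# Hypothetical regions of a 1024x1024 texture that changed this frame.
dirty = [(100, 200, 64, 64), (140, 220, 32, 80)]
region = union_rect(dirty)

print("region:", region)
print("full RGBA upload:   ", upload_bytes((0, 0, 1024, 1024), 4), "bytes")
print("partial RGBA upload:", upload_bytes(region, 4), "bytes")
print("partial 1-channel:  ", upload_bytes(region, 1), "bytes")
```

A single bounding rectangle is the simplest option; if the dirty regions are far apart, uploading them separately may move less data.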
1. The texture is not dynamic, so it's not updated every frame.
2. It's RGBA, and compression wouldn't hurt, of course. I've read some posts by others who have the same issue, and they said that compression only got them a few extra fps (like 4). Still worth implementing either way.
Shogun.