All the graphics seem to be working fine, but the main performance problem appears to be fillrate / texture samples / shader complexity.
So far I've identified these as the biggest offenders:
- dynamic water shader
- particles
- pcf shadows
[attachment=35790:dynamic.jpg]
I'm aiming for 60fps even on low end phones, if possible. It seems to me that I should offer graphics options so the user can pick the best trade-off between graphics and performance for their device.
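To make the options idea concrete, here is a rough sketch in Python (just as neutral pseudocode for the app side, not my actual code). The preset names, the tap counts and the `pick_preset` helper are all made up for illustration; the only hard number is the 60fps frame budget of 1000 / 60 ms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityPreset:
    # All fields and values here are hypothetical examples.
    name: str
    dynamic_water: bool        # full water shader vs. pre-rendered static water
    particle_depth_test: bool  # hide particles behind trees / vegetation
    animal_shadow_taps: int    # 0 means shadows off for animals
    background_shadow_taps: int


# Ordered from cheapest to most expensive.
PRESETS = [
    QualityPreset("low", False, False, 0, 4),
    QualityPreset("medium", False, True, 0, 4),
    QualityPreset("high", True, True, 4, 9),
]

FRAME_BUDGET_MS = 1000.0 / 60.0  # about 16.7 ms per frame at 60fps


def pick_preset(measured_ms_by_preset):
    """Return the richest preset whose measured frame time fits the budget."""
    for preset in reversed(PRESETS):
        ms = measured_ms_by_preset.get(preset.name)
        if ms is not None and ms <= FRAME_BUDGET_MS:
            return preset
    return PRESETS[0]  # nothing fits: fall back to the cheapest preset
```

The frame times would come from a short benchmark on first run, e.g. `pick_preset({"low": 9.0, "medium": 15.0, "high": 22.0})` picks the medium preset, and the user could still override it in a menu.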
Some of the issues are a consequence of using a scrolling pre-rendered background with colour and a custom depth texture (depth textures are not supported on some devices). When rendering the background as the viewer moves around, I currently use 2 passes: one for the colour and one to write the depth into an RGBA texture. Then in realtime I render dynamic objects on top (e.g. the animals), reading from the depth texture, decoding it and comparing it to the fragment z value.
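For anyone unfamiliar with the trick, the encode / decode / compare step looks roughly like this. It is a CPU-side Python illustration using integer fixed-point packing, not my actual shader (which has to do the equivalent with float maths), and the function names and the bias value are made up for the example.

```python
from typing import Tuple


def encode_depth_rgba8(depth: float) -> Tuple[int, int, int, int]:
    """Pack a depth in [0, 1] into four 8-bit channels, most significant first."""
    clamped = min(max(depth, 0.0), 1.0)
    fixed = min(int(clamped * 2**32), 2**32 - 1)  # 32-bit fixed point
    return ((fixed >> 24) & 0xFF, (fixed >> 16) & 0xFF,
            (fixed >> 8) & 0xFF, fixed & 0xFF)


def decode_depth_rgba8(rgba: Tuple[int, int, int, int]) -> float:
    """Rebuild the depth value from the four channels."""
    r, g, b, a = rgba
    fixed = (r << 24) | (g << 16) | (b << 8) | a
    return fixed / 2**32


def is_visible(fragment_z: float, background_rgba, bias: float = 1e-6) -> bool:
    """True if a dynamic fragment is in front of the pre-rendered background."""
    return fragment_z <= decode_depth_rgba8(background_rgba) + bias
```

Every dynamic fragment pays for one extra texture read plus the decode and compare, which is why dropping the test for shaders that do not need it is tempting.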
One obvious speedup is to remove the depth comparison with the background for shaders that do not require it. For the particles, they look much nicer when they are hidden by trees / vegetation, but still look acceptable without it.
I always suspected the PCF shadows were going to be a problem. I was using PCF shadows for the pre-rendered scrolling background (which only needs refreshing every few frames) and PCF shadows on the animals, as they get shaded by trees etc. Taking this down to a single sample greatly sped the shader up, so it is obviously a bottleneck. The single sample shadows look very bad however, so I think the options should be:
- turning them off for animals
- perhaps simplifying them for the background, or using some kind of pre-calculation
- using randomized jitter / a rotating sample window to get a softer shadow with less of a performance hit
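To spell out the last option: instead of a full PCF grid, take a handful of taps and rotate the tap pattern by a per-pixel pseudo-random angle, so the banding turns into noise. Below is a rough Python sketch of the logic on the CPU side. The four offsets, the hash constants and the bias are illustrative values I picked for the example, not tuned ones, and a real version would live in the fragment shader.

```python
import math

# Four fixed tap offsets in texel units (hypothetical values, roughly a disc).
BASE_OFFSETS = [(-0.94, -0.40), (0.95, -0.77), (-0.09, 0.93), (0.34, 0.29)]


def pixel_angle(px, py):
    """Cheap per-pixel pseudo-random angle in [0, 2*pi)."""
    h = math.sin(px * 12.9898 + py * 78.233) * 43758.5453
    return (h - math.floor(h)) * 2.0 * math.pi


def rotated_offsets(px, py):
    """Rotate the base tap pattern by this pixel's angle."""
    a = pixel_angle(px, py)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in BASE_OFFSETS]


def shadow_factor(shadow_map, u, v, receiver_depth, px, py, bias=0.002):
    """Fraction of taps that are lit: 0.0 fully shadowed, 1.0 fully lit.

    shadow_map is a 2D list of depths; u, v are in texel coordinates.
    """
    height, width = len(shadow_map), len(shadow_map[0])
    lit = 0
    for dx, dy in rotated_offsets(px, py):
        x = min(max(int(round(u + dx)), 0), width - 1)   # clamp to the map
        y = min(max(int(round(v + dy)), 0), height - 1)
        if receiver_depth <= shadow_map[y][x] + bias:
            lit += 1
    return lit / len(BASE_OFFSETS)
```

That is 4 shadow map reads per fragment instead of, say, 9 or 16 for a grid, at the cost of some visible grain along the shadow edges.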
The biggest question I am still facing is how to do the water. Is it actually *feasible* to run a complex water shader covering the whole screen on these devices (worst case for sea parts), or do they lack the horsepower? I am actually considering (!!) pre-rendering static water as part of the background, then bodging in some kind of depth-based blue colour on each frame for the parts of animals that are below the surface. It won't look amazing but should be super fast. I could even add some dynamic particles or something on the water surface to make it look at least a little dynamic.
This is what static water might look like:
[attachment=35791:simpleshader.jpg]
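For the static water idea, the blue tint on submerged parts of the animals could be as simple as a blend driven by how far the fragment is below the water level. A rough Python sketch of the maths follows; the colour, the fade depth and the maximum tint are made-up illustrative values, and `height` / `water_level` are assumed to be in the same world units.

```python
def mix(a, b, t):
    """Linear blend between two colours, like GLSL mix()."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def underwater_tint(colour, height, water_level,
                    fade_depth=1.0, deep_colour=(0.05, 0.2, 0.4), max_tint=0.8):
    """Blend a fragment colour towards blue the further it is below the surface."""
    if fade_depth <= 0.0:
        raise ValueError("fade_depth must be positive")
    depth_below = water_level - height           # negative when above the water
    t = min(max(depth_below / fade_depth, 0.0), 1.0) * max_tint
    return mix(colour, deep_colour, t)
```

Anything above the surface comes back unchanged, and the cost per fragment is one subtract, one clamp and one blend, with no extra texture reads.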
I am currently just rendering a giant quad for the water, then using depth testing against the custom depth texture to handle visibility. But this is a bottleneck, as are the calculations of the water colour. I have already considered drawing polys for the rough area where water will be (around the shores etc) rather than the whole screen, however this will only help in best case scenarios, not in worst cases. Maybe there is a cheaper way of deciding where it can draw the water? I would use the standard z buffer, but that option does not appear to be open, given that I am using a custom encoded depth texture and the shaders cannot write to the standard z buffer without an OpenGL extension (which may or may not be present).
I could maybe wangle another background luminance layer or something for where to draw realtime water, but this seems a lot of effort for not much reward (it would only be saving on decoding the depth texture and doing a comparison).
Another question that occurs to me is whether all of these are simple throughput bottlenecks, or whether I am stalling the pipeline somewhere with a dependency, in which case I could double / triple buffer the resources to alleviate the problem.
Anyway, sorry for this long rambling post, but I would welcome any thoughts / ideas, probably along the lines of whether these should actually be causing such problems, and any ways around them, particularly the water. In fact any suggestions for super fast simple water shaders would be useful. I suspect just adding 2 scrolled tiled textures might produce something usable enough, if the texture reads are faster than calculations within the shader.
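To be clear about what I mean by 2 scrolled tiled textures, here is the idea as a rough Python sketch (standing in for the shader). The scroll speeds, the second layer's scale and the function names are made-up values for illustration, and the texture is just a 2D list of brightness values sampled with wrap-around and no filtering.

```python
import math


def wrap(t):
    """Fractional part, so texture coordinates tile (like GL_REPEAT)."""
    return t - math.floor(t)


def sample_tiled(texture, u, v):
    """Nearest-neighbour lookup into a tiling 2D list of brightness values."""
    height, width = len(texture), len(texture[0])
    x = int(wrap(u) * width) % width    # % guards the wrap(u) == 1.0 edge case
    y = int(wrap(v) * height) % height
    return texture[y][x]


def water_brightness(texture, u, v, time,
                     scroll_a=(0.03, 0.01), scroll_b=(-0.02, 0.025), scale_b=1.7):
    """Average two copies of one texture scrolling in different directions."""
    a = sample_tiled(texture, u + scroll_a[0] * time, v + scroll_a[1] * time)
    b = sample_tiled(texture,
                     u * scale_b + scroll_b[0] * time,
                     v * scale_b + scroll_b[1] * time)
    return 0.5 * (a + b)
```

That would be 2 texture reads and a handful of multiply-adds per fragment. Using a different scale and direction for the second layer is what stops the result looking like a single sliding texture.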