@L. Spiro - I would expect PC GPUs to still have a "fast clear" optimisation in hardware. This was present in PC hardware 5 years ago, so it should still be around. Whether it's supported will likely depend on the texture format (more likely supported on depth and 8-bit channel formats).
Please don't teach people that; it's simply not true. A clear has exactly the same cost as drawing a full-screen triangle on iOS.
The biggest difference that pops out to me is in regards to clearing.
Clearing buffers on iOS devices just sets a flag. It doesn’t actually copy memory over the whole buffer etc.
It is instantaneous as long as it does not cause a resolve. You can avoid resolves by calling glDiscardFramebufferEXT() before glClear().
This means that what makes for a long operation on PC is virtually free on iOS.
On PC it depends on which GPU you have. If you use one that has HiZ, a clear will just invalidate some areas.
@Krypt0n - HiZ can't possibly help when clearing a non-depth texture, and not every depth texture will necessarily be assigned a corresponding hierarchical representation (which yes, should support fast clear) -- there might only be a single HiZ "buffer" which the current depth target can make use of (but isn't permanently assigned to that target).
On several widely popular (read: older) GPUs, a change to shader constants has the same performance impact as changing the shader program itself (which may or may not be a bottleneck for your scene; only sensible profiling would tell). So sorting by shader program isn't going to do anything on these GPUs if you're also changing any shader constants between draw calls, as these cause internal program switches anyway (unless grouping by shader helps you to reduce changes to shader constants).
Actually, on all platforms you should first sort by shader (at least if overdraw is not an issue, e.g. due to a z-pass). It's not related to drivers; it's how the hardware works.
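To make the sorting discussion concrete, here is a minimal sketch of ordering draw calls by shader first and shader constants second, and counting how many state changes result. The `DrawCall` struct, its fields, and the function names are made up for illustration; a real renderer would have its own representation, and whether the sort pays off still needs profiling on the target GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw-call record; the field names are illustrative only.
struct DrawCall {
    std::uint32_t shaderId;    // which shader program the call binds
    std::uint32_t constantsId; // which set of shader constants it uses
    std::uint32_t meshId;      // what is being drawn
};

// Sort by shader first, then by constants, so identical state ends up adjacent.
// stable_sort keeps the submission order of calls whose state is identical.
inline void sortByShader(std::vector<DrawCall>& calls) {
    std::stable_sort(calls.begin(), calls.end(),
        [](const DrawCall& a, const DrawCall& b) {
            return std::tie(a.shaderId, a.constantsId) <
                   std::tie(b.shaderId, b.constantsId);
        });
}

// Count shader program changes between consecutive calls
// (the very first bind counts as one change).
inline std::size_t countShaderSwitches(const std::vector<DrawCall>& calls) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (i == 0 || calls[i].shaderId != calls[i - 1].shaderId) ++switches;
    }
    return switches;
}

// Count changes of shader OR constants, which is the relevant number on GPUs
// where a constant change costs as much as a program change.
inline std::size_t countStateSwitches(const std::vector<DrawCall>& calls) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (i == 0 ||
            calls[i].shaderId != calls[i - 1].shaderId ||
            calls[i].constantsId != calls[i - 1].constantsId) {
            ++switches;
        }
    }
    return switches;
}
```

Comparing `countShaderSwitches` with `countStateSwitches` on the same sorted list shows the point made above: grouping by shader reduces program binds, but if every call still changes constants, the second number stays high.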
All regular PC GPUs do that (buffer your data/commands for at least one frame) -- this has nothing to do with mobile/deferred -- deferred/PowerVR style buffering is a completely different concept implemented between the primitive rasteriser and the pixel shader.
You should never update buffers once you start drawing. The PowerVR GPU is deferred, which means it keeps all buffers until the end of the frame, so when you lock a buffer, the driver has to allocate new memory, etc.
It doesn't matter how many times you buffer: if you try to modify a buffer that was already used for drawing in this frame, the driver will have to allocate a new temporary buffer, etc.
This performance behavior is true for nearly all mobile GPUs: PowerVR, Adreno, Mali... To my knowledge only Tegra has no deferred rendering, so it could be fine with updates in the middle of the frame, but even there you could stall until the hardware is done with the last draw call that used this particular buffer.
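The claim above can be illustrated with a toy model. This is not real driver code and does not use any graphics API; `DeferredFrameModel` and its methods are invented for illustration. It only counts the cases where a buffer is modified after it was already used for drawing in the same frame, which is the situation described as forcing a hidden allocation.

```cpp
#include <cstddef>
#include <unordered_set>

// Toy model (an assumption for illustration, not actual driver behaviour)
// of a GPU that keeps every buffer used in a frame until the frame ends.
class DeferredFrameModel {
public:
    // Record that a draw call reads from the buffer with this id.
    void draw(int bufferId) { usedThisFrame_.insert(bufferId); }

    // Record a CPU-side update. If the buffer was already drawn from in this
    // frame, the old contents must stay alive, so one hidden allocation is charged.
    void update(int bufferId) {
        if (usedThisFrame_.count(bufferId) != 0) ++hiddenAllocations_;
    }

    // End of frame: the deferred work is done, so no buffer is pending any more.
    void endFrame() { usedThisFrame_.clear(); }

    std::size_t hiddenAllocations() const { return hiddenAllocations_; }

private:
    std::unordered_set<int> usedThisFrame_;
    std::size_t hiddenAllocations_ = 0;
};
```

In this model, the sequence update, draw, update, draw on one buffer within a frame is charged one hidden allocation, while doing all updates before the first draw (using separate buffers where the data differs) is charged none.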