Large textures are really slow...

Started by
38 comments, last by L. Spiro 10 years, 1 month ago


Which keep the question open - does it degrade performance if no depth testing is used?

Degrade performance as compared to what, exactly? I see a few alternatives to discard: (a) not drawing the pixels in the first place via geometry changes, (b) blending the pixels with zero alpha, (c) rejecting the pixels with the stencil buffer, or (d) rejecting the pixels with the depth test.

Option (a) is pretty much always going to win on mobile hardware, followed closely by (d) and then (c). Option (b) is likely to be worse than discard, but I wouldn't expect discard to gain much over it.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]


Reread OP:

"One of my main game features (visually) is the overlay texture I'm using. This overlay texture is full screen, and covers the entire screen which gives it a nice transparent pattern."

And the post following it:

"http://shogun3d.net/images2/looptil/04.jpg"

Depth test is absolutely useless there - as I mentioned in my original post. I take it you did not read any of it.


Alpha testing, the ancient silly way ("glEnable(GL_ALPHA_TEST)"), is deprecated (i.e. not to be confused with not supported) - it should be done in the shader directly (i.e. "discard").

It cuts down bandwidth from areas that are hard or impossible to reach efficiently with the geometry approximations. In short, it might be worth experimenting with (although "discard" might be slightly costly [with no depth test] on amazingly-badly-engineered-garbage-hardware ... to be fair, I do not know whether mobile devices qualify for that title or not. So it might be of no use to you, but I thought I would clarify the alpha test anyway).

discard is actually destructive on nearly all current mobile chips and should be avoided at all costs, especially if you are targeting mobile devices that are a few years old. There is always a better way (performance-wise), even if some are slight asset hacks. Also, null blends are vastly cheaper than using discard, at least on PVR chips.


Depth test is absolutely useless there - as I mentioned in my original post. I take it you did not read any of it.

You are missing the point of my post: there are many methods to reject pixels, and there is no magic bullet.

Asking whether discard degrades performance is a subjective question. What are we comparing it to? The performance of not discarding pixels? The performance of using another pixel rejection path?

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

On PowerVR chips, simply using discard even once in a single shader will disable a lot of optimizations, hence the larger than normal performance hit. For devices using PVR you want to optimize your geometry as much as possible to avoid overdraw, and at the same time remember that alpha blending is nearly free due to the way the TBDR renders, whereas rejection using discard or depth testing is costly.

Gamedev forum software broke again - cannot use any formatting.

--------------------------

joew: "discard is actually destructive on nearly all current mobile chips and should be avoided at all costs, especially if you are targeting mobile devices that are a few years old. There is always a better way (performance-wise), even if some are slight asset hacks. Also, null blends are vastly cheaper than using discard, at least on PVR chips."

Thanks for actually addressing the question. Just to clarify, did you mean that "null blends are vastly cheaper than using discard when depth (and stencil) testing is disabled"? Those would obviously be hurt quite badly by discard.

swiftcoder: "You are missing the point of my post: there are many methods to reject pixels, and there is no magic bullet.

Asking whether discard degrades performance is a subjective question. What are we comparing it to? The performance of not discarding pixels? The performance of using another pixel rejection path?"

If you had read my relevant post, you would have known exactly what the question was meant to compare - and hence known that you do not HAVE a point. _Not relevant to the question anyway_.

To recap for your benefit (post #26 onwards): "It might be worth using discard after doing everything else (geometry approximation in particular) given that depth test is unusable. Hm, not sure how much of an impact discard has under those circumstances - reading around about mobile devices did not clear up the question, does anyone know?"

Sorry, what I meant was that it is better to actually alpha blend using 1.0 rather than using discard, but I worded it improperly by calling it a null blend. Basically, do everything humanly possible before resorting to discard / depth testing on any PVR chipset, as well as most other mobile GPUs (I don't have experience with the nVidia chips, so I have no idea about those). As soon as the driver cannot rely on every pixel being drawn, it is forced to take a more complex and therefore slower render path, disabling many key TBDR optimizations.

EDIT: also, looking at the problem set in this thread, I would definitely recommend using tessellated geometry to reduce the amount of overdraw as much as possible. I had to do this on a project that supported iPad1, as fill rate was not exactly ideal with that hardware.
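
To make the geometry-trimming idea concrete, here is a minimal CPU-side sketch in C++. It is only one possible approach, not what anyone in this thread actually shipped: it splits the overlay's alpha channel into square cells and keeps a quad only for cells that contain at least one non-transparent pixel. The `Quad` type, the `buildTrimmedQuads` name, and the cell size are assumptions made up for the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Axis-aligned quad in pixel coordinates (hypothetical type for this sketch).
// x1/y1 are exclusive.
struct Quad {
    int x0, y0, x1, y1;
};

// Split a width x height alpha mask (row-major, one byte per pixel) into
// tile x tile cells and return one quad per cell that contains at least one
// pixel with alpha > 0. Fully transparent cells produce no geometry.
std::vector<Quad> buildTrimmedQuads(const std::vector<std::uint8_t>& alpha,
                                    int width, int height, int tile) {
    std::vector<Quad> quads;
    if (width <= 0 || height <= 0 || tile <= 0 ||
        alpha.size() < static_cast<std::size_t>(width) *
                           static_cast<std::size_t>(height)) {
        return quads;  // invalid input: emit nothing
    }
    for (int ty = 0; ty < height; ty += tile) {
        for (int tx = 0; tx < width; tx += tile) {
            const int x1 = std::min(tx + tile, width);
            const int y1 = std::min(ty + tile, height);
            bool visible = false;
            for (int y = ty; y < y1 && !visible; ++y) {
                for (int x = tx; x < x1; ++x) {
                    if (alpha[static_cast<std::size_t>(y) * width + x] != 0) {
                        visible = true;
                        break;
                    }
                }
            }
            if (visible) {
                quads.push_back({tx, ty, x1, y1});
            }
        }
    }
    return quads;
}
```

Smaller cells cut more transparent area but add vertices, so the cell size is something to tune per device rather than a fixed answer.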

I think that tanzanite7 is getting frustrated because he's essentially asking the question:

"In a situation where HSR and early-z are not an option, is discard still bad?"

And he's getting a lot of answers about HSR and early-z.

TBH, I'd have to measure, but one possible reason that discard might be worse than the alpha blend in this scenario is that discard adds a dynamic branch to the shader (by dynamic, I mean that some fragments within a 2x2 quad might take the discard path and others might not). That said, I'd imagine that the relative costs are very hardware and shader dependent; I wouldn't be surprised if you could create a shader where discard clearly improves performance (e.g. if the shader was doing complex per-pixel deferred lighting operations). Where you're just blending a simple texture onto the screen, though, I think the cost of the discard probably outweighs the cost of the alpha blend.
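
As a rough way to see how often that divergence would occur for a given overlay, the following C++ sketch counts 2x2 pixel quads in an alpha mask that are entirely kept, entirely discarded, or mixed under an alpha cutoff. It is a CPU-side proxy only and says nothing about actual GPU cost; the `QuadStats` and `classifyQuads` names, the threshold parameter, and the assumption that quads are aligned to even pixel coordinates are all made up for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tally of how 2x2 pixel quads behave under an alpha cutoff.
struct QuadStats {
    int allKept = 0;       // no pixel in the quad would be discarded
    int allDiscarded = 0;  // every pixel in the quad would be discarded
    int mixed = 0;         // the quad diverges: some kept, some discarded
};

// Walk a row-major alpha mask in 2x2 blocks aligned to even coordinates and
// classify each block. A pixel counts as discarded when alpha < threshold.
// A trailing odd row or column is ignored in this sketch.
QuadStats classifyQuads(const std::vector<std::uint8_t>& alpha,
                        int width, int height, std::uint8_t threshold) {
    QuadStats stats;
    if (width <= 0 || height <= 0 ||
        alpha.size() < static_cast<std::size_t>(width) *
                           static_cast<std::size_t>(height)) {
        return stats;  // invalid input: nothing to classify
    }
    for (int y = 0; y + 1 < height; y += 2) {
        for (int x = 0; x + 1 < width; x += 2) {
            int discarded = 0;
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t i =
                        static_cast<std::size_t>(y + dy) * width + (x + dx);
                    if (alpha[i] < threshold) {
                        ++discarded;
                    }
                }
            }
            if (discarded == 0) {
                ++stats.allKept;
            } else if (discarded == 4) {
                ++stats.allDiscarded;
            } else {
                ++stats.mixed;
            }
        }
    }
    return stats;
}
```

A large `mixed` count relative to `allDiscarded` would suggest that discard saves little whole-quad work on that texture, which is a reason to measure on the target device before committing to it.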

Since you have a texture laid over the screen to blend onto the target, if I'm not mistaken, I came up with this idea: you could pre-render it to the target's stencil buffer and blend subject to the stencil test, which would allow early rejection of pixels. Since the stencil buffer is essentially yes/no, you would not get partial transparency from it, but it might still be a benefit, since you would reject the totally transparent parts and alpha blend only the partially transparent ones. Maybe a crazy idea, I know, but you could try it and see, since the texture's transparency needs to be rendered to the stencil only once, provided you do not clear the stencil or use it for something else during the frame.
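
The yes/no split described above can be sketched on the CPU in C++ as follows. This only shows the classification, under the assumption that a stencil value of 1 means "blend this pixel" and 0 means "reject"; on the GPU the mask would be written in a one-time stencil pass and then tested with something like `glStencilFunc(GL_EQUAL, 1, 0xFF)`. The `buildStencilMask` and `rejectedFraction` names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a binary mask from an alpha channel: 0 for fully transparent pixels
// (to be rejected by the stencil test), 1 for everything else (to be blended).
std::vector<std::uint8_t> buildStencilMask(
    const std::vector<std::uint8_t>& alpha) {
    std::vector<std::uint8_t> mask(alpha.size());
    for (std::size_t i = 0; i < alpha.size(); ++i) {
        mask[i] = (alpha[i] != 0) ? 1 : 0;
    }
    return mask;
}

// Fraction of pixels the stencil test would reject (mask value 0).
// Returns 0.0 for an empty mask.
double rejectedFraction(const std::vector<std::uint8_t>& mask) {
    if (mask.empty()) {
        return 0.0;
    }
    std::size_t rejected = 0;
    for (std::uint8_t v : mask) {
        if (v == 0) {
            ++rejected;
        }
    }
    return static_cast<double>(rejected) / static_cast<double>(mask.size());
}
```

If `rejectedFraction` comes out small for the overlay, the extra stencil pass is unlikely to pay for itself; whether the stencil test is actually cheap on a given mobile GPU still has to be measured.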

iOS devices use a TBDR (tile-based deferred rendering) architecture, which fully eliminates the need for sorting to reduce overdraw, but as has been mentioned, the use of discard anywhere in a shader, blending, alpha-testing, and writing to gl_FragDepth all force this to be disabled, degrading performance significantly.
With blending, early-Z can still be used by the hardware (while not taking full advantage of TBDR), so it will still be your best first choice if available. Keep in mind that it modifies the depth buffer (should be no problem here).

Otherwise, alpha-testing and discard are almost identical, except that discard has the advantage of appearing earlier in the rendering path and potentially allowing a large portion of your shader not to be executed.
If you do use discard, remember to put it as close to the start of your shader as possible in order to discard as early as possible.

If you must use any of discard/alpha-testing, blending, or modified gl_FragDepth, trim the polygons to remove as many “whitespace pixels” from the rasterizing step as possible (this was already suggested and implemented). Another possibility is Z-prepass, in which the Z-buffer is prepared early via very simple no-lighting shaders with a very early discard; once this is done later passes can at least use early-Z.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

