
Large textures are really slow...


Recommended Posts

Alpha testing on mobile devices is generally slow and should be avoided.  This is what I've read many times while googling iOS optimizations.  Second, I'm using OpenGL ES 2.0, and it doesn't appear that GL_ALPHA_TEST is supported.  It gives me an error when I try to enable it because the constant isn't defined.

Alpha test interferes quite badly with early depth testing on typical hardware - however, you neither need nor benefit from early depth testing anyway (as far as I can see from the picture posted).

Alpha testing, the ancient silly way ("enable(GL_ALPHA_TEST)"), is deprecated (i.e. not to be confused with not supported) - it should be done directly in the shader (i.e. with "discard").

It cuts down bandwidth in areas that are hard or impossible to reach efficiently with geometry approximations - in short, it might be worth experimenting with (although "discard" might be slightly costly [with no depth test] on amazingly badly engineered garbage hardware ... to be fair, I do not know whether mobile devices qualify for that title or not. So it might be of no use to you, but I thought I'd clarify the alpha test anyway).
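(For reference, a minimal sketch of the "do it in the shader" approach under OpenGL ES 2.0. The names u_texture and v_texCoord and the 0.5 cutoff are illustrative assumptions, not anything from this thread.)

/* Minimal sketch: replacing the fixed-function alpha test with a discard in
   an OpenGL ES 2.0 fragment shader. */
static const char *alpha_tested_fs =
    "precision mediump float;\n"
    "uniform sampler2D u_texture;\n"
    "varying vec2 v_texCoord;\n"
    "void main() {\n"
    "    vec4 texel = texture2D(u_texture, v_texCoord);\n"
    "    if (texel.a < 0.5) discard; /* was glEnable(GL_ALPHA_TEST) + glAlphaFunc */\n"
    "    gl_FragColor = texel;\n"
    "}\n";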



It cuts down bandwidth in areas that are hard or impossible to reach efficiently with geometry approximations - in short, it might be worth experimenting with (although "discard" might be slightly costly [with no depth test] on amazingly badly engineered garbage hardware ... to be fair, I do not know whether mobile devices qualify for that title or not. So it might be of no use to you, but I thought I'd clarify the alpha test anyway).

 

Discard is indeed a bit expensive on a lot of mobile hardware, but to be fair, you have to cut some corners when you have just about a square inch and a few watts to work with!


This thread is kind of long for a simple problem. I will leave it with this, though:

 

 

First you would also need to modify your blending function, and make sure your framebuffer is RGBA.

No.......   If you are blending any two images - as you pointed out, source and inverse of source:
Green texture in the framebuffer; I apply a red sprite on top with 0.2 alpha:        (0,1,0)*(1.0-0.2)    + (1,0,0)*(0.2) = (0.2, 0.8, 0)

OK, now throw the red sprite in the framebuffer and blend the green on top, like the OP is doing:
Red sprite in the framebuffer; I apply the green overlay with 0.8 alpha over it:     (1,0,0)*(1.0-0.8)  +  (0,1,0)*(0.8) = (0.2, 0.8, 0)

Definitely worth a try. When several different sprites are drawn at the same location you may get unintended results, depending on what they look like and on how many sprites you actually have layered.
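(For clarity, a sketch of the blend state both of those calculations assume - the usual source-alpha "over" blend; the comment just restates the two cases above.)

/* Standard "over" blending, as assumed in both orderings above. */
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
/* Green in the framebuffer, red sprite (alpha 0.2) drawn on top:
     (0,1,0)*(1.0-0.2) + (1,0,0)*0.2 = (0.2, 0.8, 0)
   Red sprite in the framebuffer, green overlay (alpha 0.8) drawn on top:
     (1,0,0)*(1.0-0.8) + (0,1,0)*0.8 = (0.2, 0.8, 0) */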


Sure, but how would the sprite know what alpha to use?

 

That level is defined in the alpha channel of the overlay. If the sprite moves (which sprites usually do), it would need to use another alpha if you want to draw it on top and get the same result as if you drew it below.

 

So you need to use destination alpha instead of source, and can't have an alpha channel in the sprite too.

Edited by Olof Hedman
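(A rough sketch of what that destination-alpha setup could look like, assuming the overlay is drawn first into an RGBA framebuffer so its coverage sits in destination alpha, and the sprites carry no alpha of their own - an illustration, not code from the thread.)

/* The overlay has already been drawn; its alpha now lives in the framebuffer.
   Draw the sprites weighted by that stored (destination) alpha instead of
   their own:  result = sprite*(1 - dst_a) + overlay*dst_a  */
glEnable(GL_BLEND);
glBlendFuncSeparate(GL_ONE_MINUS_DST_ALPHA, GL_DST_ALPHA,  /* colour */
                    GL_ZERO, GL_ONE);                      /* keep dst alpha intact */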


Discard is indeed a bit expensive on a lot of mobile hardware, but to be fair, you have to cut some corners when you have just about a square inch and a few watts to work with!

 

 

I did a bit of reading about how mobile devices do things, but did not get any wiser in regards to their "discard" - every time a source said that discard can be detrimental to performance, it did so in regard to hidden-surface removal / early depth testing (which is the obvious disaster case regardless of hardware - and likely to hit tiny-die devices especially hard).

 

Which keeps the question open - does it degrade performance if no depth testing is used? I cannot think of any reason why it should/could.



Which keeps the question open - does it degrade performance if no depth testing is used?

Degrade performance as compared to what, exactly? I see a few alternatives to discard: (a) not drawing the pixels in the first place via geometry changes, (b) blending the pixels with zero alpha, (c) rejecting the pixels with the stencil buffer, or (d) rejecting the pixels with the depth test.

 

Option (a) is pretty much always going to win on mobile hardware, followed closely by (d) and then (c). Option (b) is likely to be worse than discard, but I wouldn't expect significant gains.


Reread the OP:

 

"One of my main game features (visually) is the overlay texture I'm using.  This overlay texture is full screen, and covers the entire screen which gives it a nice transparent pattern."

 

And the post following it:
 

"http://shogun3d.net/images2/looptil/04.jpg"

 

Depth test is absolutely useless there - as I mentioned in my original post. I take it you did not read any of it.


Alpha testing, the ancient silly way ("enable(GL_ALPHA_TEST)"), is deprecated (i.e. not to be confused with not supported) - it should be done directly in the shader (i.e. with "discard").

It cuts down bandwidth in areas that are hard or impossible to reach efficiently with geometry approximations - in short, it might be worth experimenting with (although "discard" might be slightly costly [with no depth test] on amazingly badly engineered garbage hardware ... to be fair, I do not know whether mobile devices qualify for that title or not. So it might be of no use to you, but I thought I'd clarify the alpha test anyway).

discard is actually destructive on nearly all current mobile chips and should be avoided at all costs, especially if you are targeting mobile devices that are a few years old. There is always a better way (performance-wise), even if some are slight asset hacks. Also, null blends are vastly cheaper than using discard, at least on PVR chips.

Edited by joew



Depth test is absolutely useless there - as I mentioned in my original post. I take it you did not read any of it.

You are missing the point of my post: there are many methods to reject pixels, and there is no magic bullet.

 

Asking whether discard degrades performance is a subjective question. What are we comparing it to? The performance of not discarding pixels? The performance of using another pixel rejection path?

On PowerVR chips, simply using discard even once in a single shader will disable a lot of optimizations, hence the larger-than-normal performance hit. For devices using PVR you want to optimize your geometry as much as possible to avoid overdraw, and at the same time remember that alpha blending is nearly free due to the way the TBDR renders, whereas rejection using discard or depth testing is costly.


Gamedev forum software broke again - cannot use any formatting.

 

--------------------------

joew: "discard is actually destructive on nearly all current mobile chips and should be avoided at all costs, especially if you are targeting mobile devices that are a few years old. There is always a better way (performance-wise), even if some are slight asset hacks. Also, null blends are vastly cheaper than using discard, at least on PVR chips."

 

Thanks for actually addressing the question. Just to clarify, did you mean that "null blends are vastly cheaper than using discard when depth (and stencil) testing is disabled"? As those would obviously be hurt quite badly by discard.

 

swiftcoder: "You are missing the point of my post: there are many methods to reject pixels, and there is no magic bullet.

Asking whether discard degrades performance is a subjective question. What are we comparing it to? The performance of not discarding pixels? The performance of using another pixel rejection path?"

 

If you had read my relevant post then you would have known exactly what the question was meant to compare - and hence known that you do not HAVE a point. _Not relevant to the question anyway_.

 

To recap for your benefit (post #26 onwards): "It might be worth using discard after doing everything else (geometry approximation in particular), given that the depth test is unusable. Hm, not sure how much of an impact discard has under those circumstances - reading around about mobile devices did not clear the question up. Does anyone know?"


Sorry, what I meant was that it is better to actually alpha blend using 1.0 rather than use discard, but I worded it improperly by calling it a null blend. Basically, do everything humanly possible before resorting to discard / depth testing on any PVR chipset, as well as most other mobile GPUs (I don't have experience with the nVidia chips, so I have no idea about those). As soon as the driver cannot rely on every pixel being drawn, it is forced to take a more complex and therefore slower render path, disabling many key TBDR optimizations.

 

EDIT: also, looking at the problem set out in this thread, I would definitely recommend using tessellated geometry to reduce the amount of overdraw as much as possible. I had to do this on a project that supported the iPad 1, as fill rate was not exactly ideal on that hardware.

Edited by joew


I think that tanzanite7 is getting frustrated because he's essentially asking the question:

 

"In a situation where HSR and early-z are not an option, is discard still bad?"

 

And he's getting a lot of answers about HSR and early-z.

 

TBH, I'd have to measure, but one possible reason that discard might be worse than the alpha blend in this scenario is that discard adds a dynamic branch to the shader (by dynamic, I mean that some fragments within a 2x2 quad might take the discard path and others may not). However, I'd imagine that the relative costs are very hardware- and shader-dependent; I wouldn't be surprised if you could create a shader where discard clearly improves performance (e.g. if the shader were doing complex per-pixel deferred lighting operations). That said, where you're just blending a simple texture onto the screen, the cost of the discard probably outweighs the cost of the alpha blend.


Since you have a texture laid over the screen to blend onto the target, if I'm not mistaken, you could pre-render it to the target's stencil buffer and blend it with a stencil test, which would allow early rejection of pixels. Since the stencil buffer is rather yes/no, you would not achieve partial transparency, but it may still be a benefit, since you would reject the totally transparent parts and alpha-blend only the partial ones. Maybe a crazy idea, I know, but you could try it and see; it only requires the texture's transparency to be rendered to the stencil once, provided you do not clear the stencil or use it for something else during the frame.
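(A hedged sketch of that stencil idea in ES 2.0 terms - drawOverlay() is a hypothetical draw call, and the overlay shader would have to discard its fully transparent texels during the first pass for the mask to form.)

/* Pass 1: mark covered pixels in the stencil buffer, with colour writes off.
   The overlay shader must discard texels with alpha == 0 here. */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glEnable(GL_STENCIL_TEST);
glStencilFunc(GL_ALWAYS, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
drawOverlay();  /* hypothetical draw call */

/* Pass 2: blend the overlay only where the stencil was set; fully transparent
   areas are rejected before any blending happens. */
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glStencilFunc(GL_EQUAL, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
drawOverlay();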

iOS devices use a TBDR (tile-based deferred rendering) architecture, which fully eliminates the need for sorting to reduce overdraw, but as has been mentioned, using discard anywhere in a shader, blending, alpha testing, or writing to gl_FragDepth forces this to be disabled, degrading performance significantly.
With blending, early-Z can still be used by the hardware (while not taking full advantage of TBDR), so it will still be your best first choice if available. Keep in mind that it modifies the depth buffer (should be no problem here).

Otherwise, alpha-testing and discard are almost identical, except that discard has the advantage of appearing earlier in the rendering path and potentially allowing a large portion of your shader not to be executed.
If you do use discard, remember to put it as close to the start of your shader as possible in order to discard as early as possible.

If you must use any of discard/alpha-testing, blending, or modified gl_FragDepth, trim the polygons to remove as many “whitespace pixels” from the rasterizing step as possible (this was already suggested and implemented). Another possibility is a Z-prepass, in which the Z-buffer is prepared early via very simple no-lighting shaders with a very early discard; once this is done, later passes can at least use early-Z.


L. Spiro Edited by L. Spiro
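(A small sketch of the Z-prepass idea mentioned above; drawSceneDepthOnly() and drawSceneShaded() are hypothetical stand-ins for the actual draw code.)

/* Pass 1: depth only - cheap shader (discard as early as possible), no colour
   writes, depth writes enabled. */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_TRUE);
glDepthFunc(GL_LESS);
drawSceneDepthOnly();   /* hypothetical */

/* Pass 2: full shading - the depth buffer is already populated, so expensive
   fragments behind something can be rejected by early-Z. */
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_FALSE);  /* depth was already written in pass 1 */
glDepthFunc(GL_LEQUAL); /* accept fragments at the pre-written depth */
drawSceneShaded();      /* hypothetical */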
