GL1zdA

FBO - feedback loop / NPOT textures

I have two questions about FBOs.

First: I'm doing multipass rendering. On my 8600GT it works fine if there's a feedback loop in the shader (I use the same texture for sampling and as the render target). But the FBO specification says such a loop can result in undefined behaviour. I tried using 2 FBOs and switching between them (ping-pong), but it has a severe impact on performance (about 40% less FPS). My question is: is there some extension that allows feedback loops, which I should check for before deciding to use one, or is it just luck that it works?

Second: Is there any way to convince the ATI Radeon 9800 Pro to use NPOT textures as color attachments for the FBO? Everything works fine as long as I use 512x512 or 1024x1024 textures, but with 640x480 etc. I get FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT.

Thanks in advance for all answers.

It's just luck that it works. The GPU processes fragments in parallel, so you have no way to know whether you're reading from the original texture or from an already processed texel.

For the second question, the Radeon 9800 doesn't support NPOT textures, so you simply can't make an NPOT-sized FBO. Get a modern card, such as an HD4xxx or a GeForce 8xxx, and you'll have no trouble.
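If a card really can't take NPOT attachments, one possible fallback (not something proposed in this thread, so treat it as an assumption) is to pad the attachment up to the next power of two and render into only part of it. A minimal sketch of the size calculation follows; `is_pot` and `next_pot` are made-up helper names, not GL functions.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if n is a power of two (1, 2, 4, ...), else 0. */
int is_pot(unsigned int n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: smallest power of two that is >= n, for n in 1..2^31. */
unsigned int next_pot(unsigned int n)
{
    unsigned int p = 1;
    assert(n >= 1 && n <= 0x80000000u);
    while (p < n)
        p <<= 1;
    return p;
}
```

For a 640x480 target this gives a 1024x512 texture. You would then restrict rendering with `glViewport(0, 0, 640, 480)` and scale texture coordinates by 640/1024 and 480/512 when sampling the result.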

[quote name='GL1zdA']
Second: Is there any way to convince the ATI Radeon 9800 Pro to use NPOT textures as color attachments for the FBO? Everything works fine as long as I use 512x512 or 1024x1024 textures, but with 640x480 etc. I get FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT.
[/quote]


Try not using mipmaps: set the min and mag filters to GL_LINEAR or GL_NEAREST.

I'm surprised that performance would be impacted so much just by adding another FBO. Try using a single FBO and attaching textures that all have the same dimensions.

[quote name='HuntsMan' timestamp='1232337772' post='4384598']
It's just luck that it works. The GPU processes fragments in parallel, so you have no way to know whether you're reading from the original texture or from an already processed texel.
[/quote]

Would it be OK to render to the same texture I'm sampling from if I only sample from exactly the same texel I render to? In that case I know the texel hasn't been processed yet, since no calculation other than the current one alters its state.

With this extension you can, but I think you can't render to the same location
http://www.opengl.org/registry/specs/NV/texture_barrier.txt

[quote]This extension relaxes the restrictions on rendering to a currently bound texture and provides a mechanism to avoid read-after-write hazards.[/quote]

[quote name='V-man' timestamp='1322740638' post='4889378']
With this extension you can, but I think you can't render to the same location
[url="http://www.opengl.org/registry/specs/NV/texture_barrier.txt"]http://www.opengl.or...ure_barrier.txt[/url]
[/quote]

Thanks for the tip!

Why do you think you can't render to the same location? As far as I can see, there's actually a requirement you do exactly that: [i]There is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel (e.g. using "texelFetch2D(sampler, ivec2(gl_FragCoord.xy), 0);").[/i]

[quote name='jeinor' timestamp='1322751242' post='4889425']
[quote name='V-man' timestamp='1322740638' post='4889378']
With this extension you can, but I think you can't render to the same location
[url="http://www.opengl.org/registry/specs/NV/texture_barrier.txt"]http://www.opengl.or...ure_barrier.txt[/url]
[/quote]

Thanks for the tip!

Why do you think you can't render to the same location?[/quote]

You can (with and without that extension). You are allowed to read each texel exactly once before writing to the same one. This always works, also in absence of that extension (although strictly according to the spec, it is undefined).

What does [b]not [/b]work without this extension is doing the same thing [b]several times [/b]in a row, because there is a chance you read data in the second or third pass that has not been written out from cache yet. This is what the extension provides. Any shader running after glTextureBarrierNV will see anything a shader before that call has written.

What also [b]does not [/b]work is reading several texels when writing to the same texture, since you do not know which ones are processed and written to at what time, and there is no synchronisation.
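The difference between the two cases can be illustrated with a CPU analogy. This is only a sketch: the loop direction stands in for the unspecified order in which a GPU runs fragments, caches are not modelled at all, and the function names are invented for the example.

```c
#include <stddef.h>

/* Case 1: each element is read once and written once by the same
   "invocation". The result does not depend on processing order. */
void double_in_place(float *buf, size_t n, int reverse)
{
    for (size_t k = 0; k < n; ++k) {
        size_t i = reverse ? n - 1 - k : k;
        buf[i] = buf[i] * 2.0f;
    }
}

/* Case 2: each element also reads its left neighbour from the buffer
   that is being written. The result depends on processing order. */
void add_left_in_place(float *buf, size_t n, int reverse)
{
    for (size_t k = 0; k < n; ++k) {
        size_t i = reverse ? n - 1 - k : k;
        float left = (i > 0) ? buf[i - 1] : 0.0f;
        buf[i] = buf[i] + left;
    }
}
```

Running `double_in_place` forwards or backwards over the same input gives identical output, while `add_left_in_place` gives different output for the two orders, because a neighbour may or may not have been overwritten by the time it is read.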

[quote name='samoth' timestamp='1322755539' post='4889435']
You can (with and without that extension). You are allowed to read each texel exactly once before writing to the same one. This always works, also in absence of that extension (although strictly according to the spec, it is undefined).

What does [b]not [/b]work without this extension is doing the same thing [b]several times [/b]in a row, because there is a chance you read data in the second or third pass that has not been written out from cache yet. This is what the extension provides. Any shader running after glTextureBarrierNV will see anything a shader before that call has written.

What also [b]does not [/b]work is reading several texels when writing to the same texture, since you do not know which ones are processed and written to at what time, and there is no synchronisation.
[/quote]

Thanks for the clarification. I actually started a new topic over at opengl.org for my particular case before this thread came alive again, and got some answers there as well ([url="http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=306941"]http://www.opengl.or...t&Number=306941[/url]). As I understood them, my problem can't be solved without extensions unless I rethink my approach.

However, if I can read the texel exactly once in the same fragment shader as you say, I think that would solve my problem. I googled around some more, and found this page: [url="http://www.opengl.org/wiki/GLSL_:_common_mistakes#Sampling_and_Rendering_to_the_Same_Texture"]http://www.opengl.or...he_Same_Texture[/url]. From that page: "[i]You still don't get to read and write to the same location in a texture at the same time unless there is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel.[/i]" - that's exactly what I need to do (if I understand them correctly).

How I want it to work (drawing [b]2D only[/b]):
[list=1]
[*]Send down some triangles using [b]glDrawArrays()[/b], or [b]glBegin()[/b] and [b]glVertex3f()[/b] calls followed by [b]glEnd()[/b].
[*]Have the fragment shader execute, reading the current value of the texel it is about to write [b]once[/b], then writing that texel back [b]once[/b].
[*]Send down some more triangles using [b]glDrawArrays()[/b], or [b]glBegin()[/b] and [b]glVertex3f()[/b] calls again followed by [b]glEnd()[/b].
[*]Have the fragment shader execute again, possibly reading the same texel and writing it again (but separated by [b]glEnd()[/b]).
[*](and so on)
[/list]
When you say exactly once, what do you mean? Would the above steps have written one or more times to the same texel (considering the [b]glEnd()[/b] in between)?

Come to think of it, what I really want to do is implement a fully custom [b]glBlendFunc[/b]. If the above technique is allowed, I could implement the same functionality that [b]glBlendFunc[/b] provides. Maybe that makes it clearer what I want to achieve.
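To make the "custom blend" idea concrete, here is a CPU model of the per-texel logic, using an overlay blend as an example of something the fixed blend factors can't express as far as I know. Everything here is an assumption for illustration: the function names are invented, a float array stands in for the render target, and the sketch says nothing about whether a GL implementation permits this against the bound attachment.

```c
#include <stddef.h>

/* Overlay blend of a single channel; src and dst are in [0, 1]. */
float blend_overlay(float src, float dst)
{
    return (dst < 0.5f) ? 2.0f * src * dst
                        : 1.0f - 2.0f * (1.0f - src) * (1.0f - dst);
}

/* One "draw call": for every covered texel, read the current value once,
   blend the incoming value with it, and write the result back once. */
void draw_overlay(float *target, const float *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float dst = target[i];               /* the single read  */
        target[i] = blend_overlay(src[i], dst); /* the single write */
    }
}
```

Calling `draw_overlay` twice corresponds to the two separate batches of triangles in the steps above: the second call reads what the first one wrote.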

Thanks :)

[quote name='GL1zdA' timestamp='1232333431' post='521407']I tried using 2 FBOs and switching between them (ping-pong), but it has severe impact on performance (about 40% less FPS).[/quote]
That's bogus. [url="http://www.mvps.org/directx/articles/fps_versus_frame_time.htm"]Don't measure in FPS, measure in milliseconds[/url]. If, say, you were running at 2000 FPS and are now running at 1000 FPS, that's 50% less, but it's actually an extremely cheap operation (0.5 ms per frame).
If you were running at 10 FPS and are now at 5 FPS, that's 50% too, yet it's a hell of an expensive operation (100 ms per frame).
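The conversion from frame rate to frame time is a one-liner, so it's easy to check any FPS figure. A minimal sketch, with invented function names:

```c
/* Frame time in milliseconds for a given frame rate. */
double frame_ms(double fps)
{
    return 1000.0 / fps;
}

/* Extra milliseconds per frame when the rate drops
   from fps_before to fps_after. */
double added_cost_ms(double fps_before, double fps_after)
{
    return frame_ms(fps_after) - frame_ms(fps_before);
}
```

For example, a 40% drop from 60 to 36 FPS costs about 11.1 ms per frame, while the same 40% drop from 600 to 360 FPS costs only about 1.1 ms.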

[quote]You can (with and without that extension). You are allowed to read each texel exactly once before writing to the same one. This always works, also in absence of that extension (although strictly according to the spec, it is undefined).[/quote]
It may work on all hardware known to date (does it?), but I want to emphasize that, as you've just said, it's strictly speaking undefined behavior. That means nothing prevents a future driver from simply refusing to execute your GL calls when it detects that you're attempting a feedback loop.
It's undefined, which means a vendor is allowed to reject the operation or start producing garbage from then on, even if its hardware could draw it perfectly fine.

Cheers
Dark Sylinc

[quote name='Matias Goldberg' timestamp='1322761430' post='4889477']
It may work on all hardware known to date (does it?), but I want to emphasize that, as you've just said, it's strictly speaking undefined behavior. That means nothing prevents a future driver from simply refusing to execute your GL calls when it detects that you're attempting a feedback loop.
It's undefined, which means a vendor is allowed to reject the operation or start producing garbage from then on, even if its hardware could draw it perfectly fine.
[/quote]

Thanks for making that clear. I'm convinced.

I've read that some people use a ping-pong technique for rendering to texture, meaning they render to one texture and sample from another, and then swap the two for the next render pass. The reason that works is that the GPU finalizes all pixels when switching textures, right? Otherwise the target texture could not be used as the sampling texture in the next pass (the same problem as with sampling from and rendering to the same texture: we cannot be certain all pixels have been updated before beginning the new pass).
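The structure of ping-pong rendering can be sketched on the CPU with two arrays standing in for the two textures. This is only a model under stated assumptions: the names and the 3-tap filter are invented for the example, and it does not show how a driver synchronizes anything. The point is that each pass reads only from one buffer and writes only to the other, so no read can observe a partially written source.

```c
#include <stddef.h>

#define N 8

/* One pass: read only from src, write only to dst
   (3-tap box filter with clamped edges). */
void blur_pass(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float l = src[i > 0 ? i - 1 : i];
        float r = src[i + 1 < n ? i + 1 : i];
        dst[i] = (l + src[i] + r) / 3.0f;
    }
}

/* Runs `passes` passes, swapping the read and write roles after each one.
   Returns the index (0 or 1) of the buffer holding the final result. */
int ping_pong(float buf[2][N], int passes)
{
    int read = 0;
    for (int p = 0; p < passes; ++p) {
        blur_pass(buf[read], buf[1 - read], N);
        read = 1 - read; /* last pass's target becomes the next source */
    }
    return read;
}
```

Unlike the in-place case, the neighbouring reads here are safe in any processing order, because the source buffer is never modified during a pass.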
