KaiserJohan

OpenGL Multisampled textures and FBO

This topic is 2148 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I am using a deferred renderer and am exploring the use of multisampled textures (OpenGL 3.3) and I have a few questions:

 

1. Given that the color attachments of my FBO are multisampled (created with glTexImage2DMultisample), do I have to make my diffuse textures multisampled as well? The diffuse textures on models are read in the shader during the geometry pass and then written to the diffuse color attachment.

2. Does it make sense to use multisampling for the depth buffer (renderbuffer)? Why?


This belongs in the OpenGL subforum

 

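1. No. Only the textures you render into (the ones that later get "resolved") need to be multisampled. Textures you merely sample from, like your diffuse maps, stay regular textures.

2. Yes. The depth buffer's sample count has to match the color attachments'; otherwise the framebuffer is incomplete and there is nothing to resolve.

If you have a depth texture instead, the same applies.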

 

glRenderbufferStorageMultisample for a renderbuffer (no texture)

glTexImage2DMultisample for an MSAA texture

 

Ultimately, in way too many cases, using multisampling is just too much hassle

And the worst part is that it doesn't do anything about subpixel misalignments

If only the hardware had automatic 2x supersampling :P

Edited by Kaptein


As for 1), I'm not sure I follow. Can you elaborate a bit more on what you mean by "resolved"?


Resolving a multisampled framebuffer typically just means blitting from it to another, regular framebuffer, such as the main framebuffer (0).

You can do the resolve manually with some complex shader-fu, but I honestly think that with these kinds of things it's best to rely on the hardware.

 

What happens is that the extra data stored in the multisampled framebuffer (which is what sets it apart from a regular framebuffer) is collected and written to the color attachments of the destination FBO. Think of it as "Flatten Image" in any modern photo editor, except that it merges all the extra samples into pixels. The same concept applies to multisampled textures, which are a newer (3.x) and better mechanism.
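To make the "merge the samples into pixels" idea concrete, here is a rough CPU-side sketch in C++. It assumes a plain box filter (every sample weighted equally); the filter the hardware actually applies during a blit is up to the implementation. The `Color` and `MultisampleImage` types and the `resolveBox` function are made up for illustration and are not part of OpenGL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One RGBA color; a multisampled pixel stores several of these.
struct Color { float r, g, b, a; };

// Hypothetical CPU model of a multisampled color buffer. The samples of
// pixel p are stored at [p * samplesPerPixel, (p + 1) * samplesPerPixel).
struct MultisampleImage {
    int width;
    int height;
    int samplesPerPixel;
    std::vector<Color> samples;
};

// Box-filter resolve (an assumption, see above): each output pixel is the
// plain average of all samples stored for that pixel.
std::vector<Color> resolveBox(const MultisampleImage& src) {
    const std::size_t pixelCount =
        static_cast<std::size_t>(src.width) * static_cast<std::size_t>(src.height);
    const std::size_t n = static_cast<std::size_t>(src.samplesPerPixel);
    const float inv = 1.0f / static_cast<float>(src.samplesPerPixel);

    std::vector<Color> out(pixelCount);
    for (std::size_t p = 0; p < pixelCount; ++p) {
        Color sum{0.0f, 0.0f, 0.0f, 0.0f};
        for (std::size_t s = 0; s < n; ++s) {
            const Color& c = src.samples[p * n + s];
            sum.r += c.r;
            sum.g += c.g;
            sum.b += c.b;
            sum.a += c.a;
        }
        out[p] = Color{sum.r * inv, sum.g * inv, sum.b * inv, sum.a * inv};
    }
    return out;
}
```

A pixel on a triangle edge with two white and two black samples comes out mid-gray, which is the anti-aliased edge you see after the blit.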

 

Also keep in mind that it's slower to keep adding and removing attachments from FBOs in real time than to just litter your program with every FBO you need. FBOs without renderbuffers don't take much extra space, so be liberal: create every FBO you need from the get-go and then just switch between them. All that matters in the end is their output color/depth/stencil attachments.


Hardware resolving of multisampled textures isn't compatible with deferred shading. You can't average together normal samples and expect the light computation to become the average of the light computations using each original normal sample. The same goes for diffuse color and all other material parameters stored in the G-buffer. You need to compute the lighting per sample, at least along visible triangle edges.
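A small numeric sketch of why that is, in C++. It assumes a clamped Lambert term as the "lighting" and a pixel with just two samples on an edge; the helper names (`lambert`, `shadeThenAverage`, `averageThenShade`) are invented for this example. The mismatch comes from the non-linear steps (renormalizing the averaged normal, the clamp), and specular terms would only make it worse.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Assumes v is not the zero vector.
Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(dot(v, v));
    return Vec3{v.x / len, v.y / len, v.z / len};
}

// Clamped Lambert diffuse term: max(0, N . L).
float lambert(Vec3 n, Vec3 l) { return std::max(0.0f, dot(n, l)); }

// Per-sample shading: light each sample, then average the lit results.
float shadeThenAverage(Vec3 n0, Vec3 n1, Vec3 l) {
    return 0.5f * (lambert(n0, l) + lambert(n1, l));
}

// What resolving the normal buffer first amounts to: average the normals,
// renormalize, then light once.
float averageThenShade(Vec3 n0, Vec3 n1, Vec3 l) {
    const Vec3 avg{0.5f * (n0.x + n1.x), 0.5f * (n0.y + n1.y), 0.5f * (n0.z + n1.z)};
    return lambert(normalize(avg), l);
}
```

With one sample facing the light, n0 = (0, 0, 1), one sample perpendicular to it, n1 = (1, 0, 0), and the light direction (0, 0, 1), per-sample shading gives 0.5 while shading the averaged normal gives about 0.707, so the edge pixel ends up visibly too bright.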

 

I've found that binding FBOs is quite a bit more expensive than simply changing an attachment on an already bound FBO if you only switch a single color attachment. EDIT: This only improves CPU performance, but it does so by a very noticeable amount.

Edited by theagentd


In practice, on Windows I've not seen big differences between using one FBO for everything and swapping all attachments on it, and creating multiple FBOs.

However, on a Linux / NVIDIA combo, using only one FBO and swapping attachments on it caused quite a large performance drop (roughly a halving of the framerate, if I remember right).

 

My solution to eliminate this performance drop was to create a set of FBOs each dedicated to a certain width, height and format of the first color attachment. Based on that the correct FBO would be selected and the attachments swapped to it as necessary.
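A minimal sketch of that kind of lookup in C++, not the actual engine code: `FboKey`, `FboCache` and the `create` callback are hypothetical names, and the callback stands in for whatever wraps glGenFramebuffers in a real renderer, so the sketch runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

// Describes the first color attachment: width, height and internal format.
// The format is kept as a plain integer here (it would be a GLenum in GL code).
struct FboKey {
    int width;
    int height;
    std::uint32_t format;

    bool operator<(const FboKey& o) const {
        return std::tie(width, height, format) < std::tie(o.width, o.height, o.format);
    }
};

// Hypothetical cache: one FBO handle per distinct key, created on first use.
class FboCache {
public:
    explicit FboCache(std::function<std::uint32_t()> create)
        : create_(std::move(create)) {}

    // Returns the FBO dedicated to this size/format, creating it if needed.
    // The caller then binds it and swaps attachments on it as necessary.
    std::uint32_t get(const FboKey& key) {
        auto it = fbos_.find(key);
        if (it == fbos_.end()) {
            it = fbos_.emplace(key, create_()).first;
        }
        return it->second;
    }

    std::size_t size() const { return fbos_.size(); }

private:
    std::function<std::uint32_t()> create_;
    std::map<FboKey, std::uint32_t> fbos_;
};
```

The point of the design is that attachments swapped onto any one FBO always share a size and format, so a shadow-map pass and the scene pass never reuse the same FBO object.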


Really? Because I've been on Windows and Nvidia hardware for a long time... Is the performance hit on the CPU or the GPU? Exactly how severe was this performance drop? Like I wrote above, I got much better CPU performance when simply swapping attachments.


Sorry, I work mostly on Linux. I probably should have mentioned that.

 

Currently the drivers are more developed on Windows, and there may not be as much to keep in mind when designing the rendering process.


 

The drop was proportional to the number of render target changes, which in my engine at the time mostly meant switching between rendering a shadow map and rendering the light contribution to the scene. To get exact figures I would have to dig the offending code out of the repository, which I don't think I'll do, but for a few render target changes per frame it could have been roughly doubling the frame time (or halving the FPS). I didn't run a GPU profiler back then, but on the CPU it looked similar to a pipeline stall: suddenly a draw call or presenting the scene would take a disproportionately large amount of time.

 

My very uneducated guess (could be totally wrong):

- The Direct3D render target switching API looks like a single FBO where you swap attachments (SetRenderTarget / SetDepthStencilSurface)

- But on NVIDIA hardware it's suboptimal to switch attachments of different sizes/formats on a single FBO; it can cause stalls

- So the Windows driver implements FBO switching/virtualization behind the scenes, *and* the OpenGL portion also does this for you automatically, as it's nicer to have better performance

- In the Linux driver no such virtualization exists, as it isn't needed by Direct3D, so you have to do this manually or suffer poor performance

Edited by AgentC
