
[OpenGL] Yet another Deferred Shading / Anti-aliasing discussion...


CDProp

Hi. My apologies if this discussion has been played out already. This topic seems to come up a lot, but I did a quick search and did not quite find the information I was looking for. I'm interested in knowing what is considered the best practice these days, with respect to deferred rendering and anti-aliasing. These are the options, as I understand them:

 

1. Use some post-processed blur like FXAA.

I've tried enabling NVidia's built-in FXAA support, but the results were nowhere near acceptable. Maybe there is another technique that can do a better job?

 

2. Use a multi-sampled MRT, and then handle your own MSAA resolve.

I've never done this before, and I'm anxious to try it for the sake of learning how, but it is difficult for me to understand how this is much better than super-sampling. If I understand MSAA correctly, the memory requirements are the same as for super-sampling. The only difference is that your shader is called fewer times. However, with deferred shading, this really only seems to help save a few material shader fragments, which don't seem very expensive in the first place. Unless I'm missing something, you still have to do your lighting calculations once per sample, even if all of the samples have the same exact data in them. Are the material shader savings (meager, I'm guessing) really worth all of the hassle?
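
Just to check my understanding of what a custom resolve would involve, here's roughly what I picture the lighting/resolve pass looking like -- an untested sketch, where the sampler names and the directional-light stand-in are my own inventions:

#version 330

// multisampled G-Buffer attachments (placeholder names)
uniform sampler2DMS gAlbedo;
uniform sampler2DMS gNormal;
uniform int sampleCount;

out vec4 fragColor;

// stand-in for the real lighting; one directional light for illustration
vec3 shade(vec3 albedo, vec3 n)
{
    vec3 l = normalize(vec3(0.5, 1.0, 0.25));
    return albedo * max(dot(n, l), 0.0);
}

void main()
{
    ivec2 coord = ivec2(gl_FragCoord.xy);
    vec3 total = vec3(0.0);
    for (int i = 0; i < sampleCount; ++i)
    {
        vec3 albedo = texelFetch(gAlbedo, coord, i).rgb;
        vec3 normal = normalize(texelFetch(gNormal, coord, i).xyz);
        total += shade(albedo, normal); // lighting runs once per sample
    }
    // averaging the lit samples is the "resolve"
    fragColor = vec4(total / float(sampleCount), 1.0);
}

If that's right, then those per-sample shade() calls are exactly the cost I'm worried about.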

 

3. Use Deferred Lighting instead of Deferred Shading.

You'll still have aliased lighting, though, and it comes at the expense of an extra pass (albeit depth-only, if I understand the technique correctly). Is anybody taking this option these days?

 

4. Use TXAA.

NVidia is touting some TXAA technique on their website, although details seem slim. It seems to combine 2X MSAA with some sort of post-process technique. Judging from their videos, the results look quite acceptable, unlike FXAA. I'm guessing that the 2X MSAA would be handled using your own custom MSAA resolve, as described above, but I don't know what processing happens after that.

 

These all seem like valid options to try, although none of them seem to be from the proverbial Book. It seems to me, though, that forward rendering is a thing of the past, and I would love to be able to fill my scene with lights. I could try implementing all of these techniques as an experiment, but since they each come with a major time investment and learning curve, I was hoping that someone could help point a lost soul in the right direction.

 

Bonus questions: Is there a generally agreed-upon way to lay out your G-Buffer? I'd like to use this in conjunction with HDR, and so memory/bandwidth could start to become a problem, I would imagine. Is it still best practice to try to reconstruct position from depth? Are half-float textures typically used? Are any of the material properties packed or compressed in a particular way? Are normals stored as x/y coordinates, with the z-coord calculated in the shader?
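
For concreteness, here's the sort of layout I've been imagining -- pure guesswork on my part:

RT0:   RGBA8  -- albedo.rgb, specular intensity
RT1:   RG16F  -- view-space normal.xy (z reconstructed in the shader)
RT2:   RGBA8  -- specular power, misc. material parameters
Depth: D24S8  -- position reconstructed from depth

And the z-reconstruction I mentioned would be something like this, though I gather it breaks for view-space normals at grazing angles, where z can go negative (sphere-map or similar encodings avoid that assumption):

vec3 decodeNormal(vec2 n)
{
    // assumes a view-space normal with non-negative z
    return vec3(n, sqrt(max(1.0 - dot(n, n), 0.0)));
}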

 

I'm using OpenGL, if it matters.

ic0de

I use FXAA, but I don't use what's built into the graphics driver. What you can do for better results is download the FXAA 3.9 shader (it used to be on Timothy Lottes' blog, but I can't find it anymore); it has some conditional compilation set up in it which you can use to tweak the settings. This method is far better than using the graphics driver, because you can apply it at a more optimal spot in your rendering pipeline (preventing some unwanted blur). Amazingly, the same shader works for both HLSL and GLSL, and it will work on Intel and AMD GPUs as well as consoles. It is important to note that you must have some method to generate luminance before running the FXAA shader (this is pretty trivial).
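
For example, a tiny pass (or the tail end of your tonemap/post pass) that writes luma into the alpha channel, which is where the FXAA shader looks for it by default -- something along these lines:

#version 330

uniform sampler2D sceneColor;
in vec2 uv;
out vec4 fragColor;

void main()
{
    vec3 c = texture(sceneColor, uv).rgb;
    // Rec. 601 luma weights; the exact weights matter less than consistency
    fragColor = vec4(c, dot(c, vec3(0.299, 0.587, 0.114)));
}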


CDProp

Thanks so much for your replies.

 

If I download the FXAA 3.9 shader and integrate it in my pipeline, will it help beyond allowing me to avoid blurring HUD/text elements? The reason I ask is that I have a lot of objects in my scene with long, straight edges -- particularly buildings and chain link fences, but also some vehicles -- with which FXAA seems to work particularly poorly. At a distance, these objects create wild, shimmering jaggies that are very distracting. Will downloading the shader actually improve this? Here are a couple of examples:

 

[Screenshots: distant chain-link fences and building edges breaking up into dashed, shimmering lines.]

 

As you move the viewpoint around, those broken lines crawl and shimmer.

 

I've spent the last couple hours reading through the links that you provided, MJP, and it has given me a lot of food for thought. 

 

I'm particularly intrigued by the Forward+ idea, because the idea of using an MRT with HDR and MSAA is starting to sound prohibitive. Let's say I use the G-Buffer layout that you mentioned in your March 2012 blog post on Light-Indexed Deferred rendering, except the albedo buffers need to be bumped up to 64bpp to accommodate HDR rendering (right?). Then, multiply the whole thing by 4 for 4x MSAA, and I have a seriously fat buffer. And what do I do about reflection textures? If I want to do planar reflections or refractions, for example. That seems like it'd be another big fat g-buffer. Am I thinking about this correctly? Plus, you have the lack of flexibility with material parameters that comes with deferred rendering. 
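
Just to put a number on it (back-of-the-envelope, using my bumped-up layout): say that comes to four 64-bit targets plus a 32-bit depth buffer, i.e. 36 bytes per pixel. At 1920x1080 that's already about 71 MB, and 4x MSAA multiplies it to roughly 285 MB, before counting any reflection or post-processing targets.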

 

Edit: On the other hand, isn't it somewhat expensive to loop through a runtime-determined number of lights inside a fragment shader? If it isn't, then why did the old forward renderers bother compiling different shaders for different numbers of lights? Why did they not, instead, just allow 8 lights per shader (say), and use a uniform (numLights) to determine how many to actually loop through? Sure, you only get per-object light lists that way, which is imprecise, but is it really slower than having a separate compute shader step that determines the light list on a per-pixel basis?
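
To be concrete, this is the sort of thing I mean -- a sketch with made-up uniform names:

#version 330
#define MAX_LIGHTS 8

struct Light { vec3 position; vec3 color; };
uniform Light lights[MAX_LIGHTS];
uniform int numLights; // set per-object on the CPU

in vec3 worldPos;
in vec3 worldNormal;
out vec4 fragColor;

void main()
{
    vec3 n = normalize(worldNormal);
    vec3 total = vec3(0.0);
    for (int i = 0; i < numLights; ++i) // runtime-determined trip count
    {
        vec3 toLight = lights[i].position - worldPos;
        float atten = 1.0 / (1.0 + dot(toLight, toLight)); // crude falloff
        total += lights[i].color * atten * max(dot(n, normalize(toLight)), 0.0);
    }
    fragColor = vec4(total, 1.0);
}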

 

But if I do end up going the deferred-with-MSAA route (which I'm kind of leaning toward), using edge detection to find the pixels that actually need per-sample lighting, and doing per-pixel lighting everywhere else, sounds like it will be a huge time-saver, even if I have to eat a huge amount of memory.
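
I.e., something like this for the classification step (a sketch; the thresholds are numbers I pulled out of the air):

#version 330

uniform sampler2DMS gNormal;
uniform sampler2DMS gDepth;
uniform int sampleCount;
out vec4 fragColor;

// a pixel is an "edge" if its G-Buffer samples disagree; per-sample
// lighting would then only run where this returns true
bool isEdgePixel(ivec2 coord, int samples)
{
    vec3  n0 = texelFetch(gNormal, coord, 0).xyz;
    float d0 = texelFetch(gDepth,  coord, 0).x;
    for (int i = 1; i < samples; ++i)
    {
        vec3  n = texelFetch(gNormal, coord, i).xyz;
        float d = texelFetch(gDepth,  coord, i).x;
        if (dot(n, n0) < 0.99 || abs(d - d0) > 0.001)
            return true;
    }
    return false;
}

void main()
{
    ivec2 coord = ivec2(gl_FragCoord.xy);
    // visualize the classification; the real pass would stencil or branch here
    fragColor = vec4(isEdgePixel(coord, sampleCount) ? 1.0 : 0.0);
}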

 

And in any case, the information you provided helped me discover all sorts of inefficiencies with the way we're doing deferred shading, so it looks like I can probably gain some speed just with a few optimizations.


Hodgman

If I download the FXAA 3.9 shader and integrate it in my pipeline, will it help beyond allowing me to avoid blurring HUD/text elements?

There are a lot of options in the FXAA shader that you can tweak yourself.
Also, if you don't actually integrate it into your pipeline, then it's not actually in your game. It's pretty unkind to your users to say "to make my game look best, go into your nVidia driver panel (sorry ATI/intel users) and enable these hacks".

I have a lot of objects in my scene with long, straight edges -- particularly buildings and chain link fences, but also some vehicles -- with which FXAA seems to work particularly poorly. At a distance, these objects create wild, shimmering jaggies that are very distracting. Will downloading the shader actually improve this? Here are a couple of examples:

In those example pictures, there's a lot of "information" missing -- e.g. lines that have pixels missing from them, so they've become dashed lines rather than solid lines.
Post-processing AA techniques (like FXAA) will not be able to repair these lines.
MSAA / super-sampling techniques will mitigate this issue, lowering the width at which a solid line starts breaking up into a dashed one.
Another solution is to fix the data going into the renderer -- if you've got tiny bits of geometry that are going to end up thinner than a pixel, they should be replaced with some kind of low-LOD version of themselves. e.g. if those fences were drawn as a textured plane with alpha blending, you'd get soft anti-aliased lines by default, even with no anti-aliasing technique used.

Edit: On the other hand, isn't it somewhat expensive to loop through a runtime-determined number of lights inside a fragment shader?

On D3D9-era GPUs, yes, the branch instructions are quite costly. They'll also compile the loop to something like:
for( int i = 0; i != 255; ++i ) { if( i >= g_lightCount ) break; /* ... */ }
On more modern cards, branching is less costly. The biggest worry is when nearby pixels take different branches (e.g. one pixel wants to loop through 8 lights, but its neighbour wants to loop through 32 lights) -- but most of these Forward+-ish techniques mitigate this issue by clustering/tiling pixels together, so that neighbouring pixels are likely to use the same loop counters and data.

CDProp


Also, if you don't actually integrate it into your pipeline, then it's not actually in your game. It's pretty unkind to your users to say "to make my game look best, go into your nVidia driver panel (sorry ATI/intel users) and enable these hacks".

 

Hah, good point. This isn't a product that I plan on releasing into the wild, so that's not even something that I considered.

 


On more modern cards, branching is less costly. The biggest worry is when nearby pixels take different branches (e.g. one pixel wants to loop through 8 lights, but its neighbour wants to loop through 32 lights) -- but most of these Forward+-ish techniques mitigate this issue by clustering/tiling pixels together, so that neighbouring pixels are likely to use the same loop counters and data.

 

Could I trouble you for more information about this? Why is it that nearby pixels like to loop over the same number of lights, and what is meant by "nearby"? Would an 8x8 tile suffice? (That seems to be the common size, from what I've been reading.)

cowsarenotevil
Secondly, a different, but related concept is that pixel shaders are generally executed on a "quad" of 2x2 pixels; this is so that mipmapping/gradient instructions (e.g. ddx/ddy) can be implemented. If a triangle only covers a single pixel, the shader will likely still be computed for a whole 2x2 quad, with 3 of the pixels being thrown away.

 

Not relevant to OP, but that just answered a question I've had for a while. I assumed that was how derivative functions worked, but for some reason I'd always wondered how mipmapping worked in shaders. It never occurred to me that that would be all it took.
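
For anyone else connecting the same dots, my understanding is that the quad's finite differences turn into a mip level roughly like this (the textbook formula, not any particular GPU's exact one):

#version 330

uniform sampler2D tex;
in vec2 uv;
out vec4 fragColor;

void main()
{
    vec2 texSize = vec2(textureSize(tex, 0));
    // the 2x2 quad is what makes these finite differences possible
    vec2 dx = dFdx(uv) * texSize;
    vec2 dy = dFdy(uv) * texSize;
    // one pixel's footprint in texel space picks the mip level
    float lod = 0.5 * log2(max(dot(dx, dx), dot(dy, dy)));
    fragColor = textureLod(tex, uv, lod);
}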

marcClintDion

There is a document that deals with this issue of jagged lines for fences, power cables, and such.

 

Anti-aliasing alone can't help you, since the lines off in the distance end up being much thinner than one pixel. Look for the section titled "Phone-wire Anti-Aliasing".

 

http://www.humus.name/Articles/Persson_GraphicsGemsForGames.pdf

CDProp

The power cables in our scene are basically just rectangular strips, onto which a texture of a catenary-shaped wire is mapped. This was done by a modeler long before my time, so I don't know why he decided to do it that way, but one fortunate side-effect is that the texture filtering takes care of everything for me. The fence poles and light posts are a different story, but because they never end up being much smaller than a pixel, the 4x MSAA seems to do an okay job on them. Thanks for the PDF, though, it has a lot of useful information.

 

Hodgman, thanks again for your help. I'm much obliged to everyone here. Although I must say, you guys are only encouraging me to ask more questions. =)

MJP

If I download the FXAA 3.9 shader and integrate it in my pipeline, will it help beyond allowing me to avoid blurring HUD/text elements? The reason I ask is that I have a lot of objects in my scene with long, straight edges -- particularly buildings and chain link fences, but also some vehicles -- with which FXAA seems to work particularly poorly. At a distance, these objects create wild, shimmering jaggies that are very distracting. Will downloading the shader actually improve this? Here are a couple of examples:

 

This is one of the cases that post-processing AA solutions like FXAA have a lot of difficulty with. You really need to rasterize at a higher resolution to make high-frequency geometry look better, and that's exactly what MSAA does. Something like FXAA is fundamentally limited in terms of the information it has available to it, which makes it unable to fix these sorts of situations. Some sort of temporal solution that looks at data from the previous frame can help, but is still usually less effective than MSAA.

 

I'm particularly intrigued by the Forward+ idea, because the idea of using an MRT with HDR and MSAA is starting to sound prohibitive. Let's say I use the G-Buffer layout that you mentioned in your March 2012 blog post on Light-Indexed Deferred rendering, except the albedo buffers need to be bumped up to 64bpp to accommodate HDR rendering (right?). Then, multiply the whole thing by 4 for 4x MSAA, and I have a seriously fat buffer. And what do I do about reflection textures? If I want to do planar reflections or refractions, for example. That seems like it'd be another big fat g-buffer. Am I thinking about this correctly? Plus, you have the lack of flexibility with material parameters that comes with deferred rendering. 

 

Albedo values should always be [0, 1], since they're essentially the ratio of light reflecting off a surface. With HDR the input lighting values are often > 1, and the same goes for the output lighting value, but albedo is always [0, 1]. Even so, it's true that a G-Buffer with 4xMSAA enabled can use up quite a bit of memory, which is definitely a disadvantage. Material parameters can also potentially be an issue. If you require a lot of input parameters to your lighting, then you need a lot of G-Buffer textures, which increases memory usage and bandwidth. With forward rendering you don't need to think about which parameters have to be packed into your G-Buffer, which can make it easier to experiment with new lighting models.

 

Edit: On the other hand, isn't it somewhat expensive to loop through a runtime-determined number of lights inside a fragment shader? If it isn't, then why did the old forward renderers bother compiling different shaders for different numbers of lights? Why did they not, instead, just allow 8 lights per shader (say), and use a uniform (numLights) to determine how many to actually loop through? Sure, you only get per-object light lists that way, which is imprecise, but is it really slower than having a separate compute shader step that determines the light list on a per-pixel basis?

 

Sure, it can be expensive to loop through lights in a shader, but this is essentially what you do in any deferred renderer if you have multiple lights overlapping any given pixel. However, with traditional deferred rendering you end up sampling your G-Buffer and blending the fragment shader output for each light, which can consume quite a bit of bandwidth. With forward rendering or tiled deferred rendering you only need to sample your material parameters once, and the summing of light contributions happens in registers, which avoids excessive bandwidth usage. The main problem with older forward renderers is that older GPUs and shading languages lacked the flexibility needed to build per-tile lists and dynamically loop over them in a fragment shader. Shaders did not have support for reading from generic buffers, and fragment shaders couldn't dynamically read indexed data from shader constants. You also didn't have compute shaders with shared memory, which is currently the best way to build per-tile lists of lights. But it's true that determining a set of lights per-object is fundamentally the same thing; the main difference is the level of granularity. Also, you typically do per-object association on the CPU, while with tiled forward or deferred you do the association on the GPU, using the depth buffer to determine whether a light affects a given tile.
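
A minimal sketch of the shared-memory idea, in case it helps -- the names and the (deliberately stubbed-out) intersection test are mine, not from any particular engine:

#version 430

layout(local_size_x = 16, local_size_y = 16) in;

struct PointLight { vec4 positionRadius; vec4 color; };
layout(std430, binding = 0) readonly buffer Lights { PointLight lights[]; };

uniform int lightCount;

// per-tile list built cooperatively in shared memory
shared uint tileLightCount;
shared uint tileLightIndices[256];

// Placeholder: a real version tests the light's bounding sphere against
// the tile's frustum planes and the min/max depth from the depth buffer.
bool lightIntersectsTile(PointLight l)
{
    return true;
}

void main()
{
    if (gl_LocalInvocationIndex == 0u)
        tileLightCount = 0u;
    barrier();

    // the 256 threads of this tile each test a strided subset of the lights
    for (uint i = gl_LocalInvocationIndex; i < uint(lightCount); i += 256u)
    {
        if (lightIntersectsTile(lights[i]))
        {
            uint slot = atomicAdd(tileLightCount, 1u);
            if (slot < 256u)
                tileLightIndices[slot] = i;
        }
    }
    barrier();

    // ...then either shade this thread's pixel right here (tiled deferred)
    // or write tileLightIndices out for a forward pass to read (Forward+)
}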


