mrheisenberg

OpenGL Is Clustered Forward Shading worth implementing?


I'm referring to this: http://www.cse.chalmers.se/~uffe/clustered_shading_preprint.pdf (there is also a video available: http://www.youtube.com/watch?v=6DyTk7917ZI ). The performance of this technique seems to scale very well to huge numbers of lights, but with fewer lights it performs a little worse than the less advanced tiled culling method. The thing is: has there ever been a case where you need 30 thousand lights in a scene? Plus, won't it get bottlenecked by generating shadow maps for all the lights? (In the YouTube video the lights just pass through the bridge and under it.) Unfortunately I couldn't test its performance, because for some reason the provided demo won't start up (even though I support OpenGL 3 and higher), and I've never done GLSL, so it might take time to get it to work.


One thing I like about clustered shading is that it becomes "cheaper" to handle transparent objects. With tiled deferred you have to build two light lists, one that uses the depth buffer for light culling and one that doesn't, so you can have a massive overhead on the transparent pass. With clustered, only one culling pass is necessary.
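As a rough illustration of the two-list point, here is a minimal Python sketch of per-tile culling along the depth axis only. Everything in it is an assumption for illustration: `cull_lights` and `tile_light_lists` are hypothetical names, lights are reduced to a view-space depth and a radius, and the transparent list simply ignores the depth buffer and spans the whole camera range.

```python
def cull_lights(lights, z_near, z_far):
    """Return indices of lights whose depth extent overlaps [z_near, z_far].

    lights is a list of (z_center, radius) pairs in view space; the x/y
    overlap with the tile is assumed to have been tested already.
    """
    return [i for i, (z, r) in enumerate(lights)
            if z + r >= z_near and z - r <= z_far]


def tile_light_lists(lights, tile_min_depth, tile_max_depth,
                     camera_near, camera_far):
    # Opaque geometry: bounds come from the depth buffer for this tile.
    opaque = cull_lights(lights, tile_min_depth, tile_max_depth)
    # Transparent geometry: no usable depth bounds, so cull against
    # the whole camera depth range (assumption for this sketch).
    transparent = cull_lights(lights, camera_near, camera_far)
    return opaque, transparent


lights = [(5.0, 1.0), (20.0, 2.0), (50.0, 1.0)]
opaque, transparent = tile_light_lists(lights, 18.0, 25.0, 0.1, 100.0)
```

For a tile whose opaque depth range is 18 to 25, only the middle light makes the opaque list, while the transparent list keeps all three, which is where the extra cost in the transparent pass comes from.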

 

But again, that depends on the light count and the scene (also, clustered is heavier in terms of memory, if I'm correct).

 

Also, at SIGGRAPH Asia there was a presentation about a 2.5D culling technique, which you can find here: https://sites.google.com/site/takahiroharada/

You can also try sorting front to back instead; if you are vertex bound, that might give you better results. Another approach is to use occluder objects: you can get 90% of the culling you would get with a z-prepass, without the cost.

But tessellated geometry has another problem: with AA enabled you cover a lot of pixels only partially, which increases the cost in the pixel shader a lot. Something like POM might scale way better.

Deferred shading is really unhandy when it comes to anti-aliasing, and lighting transparent objects is not solved in this approach.

Forward shading is the way to go; I expect the next-generation consoles to go back to it. I use a similar approach in my phone engines: I have a view-space-aligned 3D grid (a texture) that stores a 'count' and an 'offset' value per voxel, which I use to index into a texture containing the light sources that affect that voxel. The grid is rebuilt every frame on the CPU. I don't have 30k lights, but I run with anti-aliasing and I use the same shader for solid and transparent objects. It's very convenient to use; I can even bind this texture in the vertex shader to light particles cheaply.
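To make the count/offset layout concrete, here is a minimal CPU-side sketch in Python. It is not the poster's engine code: `Light`, `build_light_grid` and `lights_in_cell` are hypothetical names, the grid is a plain axis-aligned box in view space, and each light is binned conservatively by its bounding box rather than its exact sphere.

```python
from dataclasses import dataclass


@dataclass
class Light:
    position: tuple  # view-space (x, y, z)
    radius: float


def build_light_grid(lights, grid_min, grid_max, dims):
    """Return (cells, indices); cells[i] is (offset, count) into indices."""
    nx, ny, _ = dims
    size = [(grid_max[a] - grid_min[a]) / dims[a] for a in range(3)]
    per_cell = [[] for _ in range(dims[0] * dims[1] * dims[2])]
    for light_index, light in enumerate(lights):
        lo, hi = [], []
        for a in range(3):
            # Bounding-box extent of the light in cell coordinates, clamped.
            lo_a = int((light.position[a] - light.radius - grid_min[a]) // size[a])
            hi_a = int((light.position[a] + light.radius - grid_min[a]) // size[a])
            lo.append(max(lo_a, 0))
            hi.append(min(hi_a, dims[a] - 1))
        for z in range(lo[2], hi[2] + 1):
            for y in range(lo[1], hi[1] + 1):
                for x in range(lo[0], hi[0] + 1):
                    per_cell[(z * ny + y) * nx + x].append(light_index)
    # Flatten the per-cell lists into one index array plus (offset, count).
    cells, indices = [], []
    for cell_lights in per_cell:
        cells.append((len(indices), len(cell_lights)))
        indices.extend(cell_lights)
    return cells, indices


def lights_in_cell(point, cells, indices, grid_min, grid_max, dims):
    """Look up the light indices for the cell containing a view-space point."""
    nx, ny, _ = dims
    coord = []
    for a in range(3):
        size = (grid_max[a] - grid_min[a]) / dims[a]
        c = int((point[a] - grid_min[a]) // size)
        coord.append(min(max(c, 0), dims[a] - 1))
    x, y, z = coord
    offset, count = cells[(z * ny + y) * nx + x]
    return indices[offset:offset + count]


grid_min, grid_max, dims = (0.0, 0.0, 0.0), (4.0, 4.0, 4.0), (4, 4, 4)
lights = [Light((0.5, 0.5, 0.5), 0.25), Light((2.0, 0.5, 0.5), 0.6)]
cells, indices = build_light_grid(lights, grid_min, grid_max, dims)
```

On the GPU, `cells` and `indices` would correspond to the grid texture and the light-index texture; the shader does the same lookup as `lights_in_cell` and then loops over `count` entries starting at `offset`.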

 

One problem you still have is applying shadows/projectors. It's solvable by having an atlas and storing more data per light source (projection matrix, offsets, extents, etc.), but it adds quite a lot of overhead.

 

Many have solved transparency with deferred, Epic and Avalanche among them. Anti-aliasing is also doable. Multiple BRDFs are handled straightforwardly in deferred. You also have direct access to all those buffers should you need anything, and you don't have to worry about processing pixels you can't see. And most modern hardware, including the 4th-gen iPad and Tegra 4 from what I've heard, has enough bandwidth and memory to get some sort of deferred done, though if you're doing thousands and thousands of lights, mobile probably isn't your target platform anyway.

 

I'd rather make sure there's not any unnecessary shading going on. Of course you can't do 8xMSAA with deferred, at least not cheaply, but you can do something like SMAA, which looks just as good and is cheaper in any case. I suppose it all depends on what you'd like to be doing. If you've got the time for it and are on the right platform (new consoles, high-end PC), then I don't see any reason not to go deferred. If you don't have the time to solve all those problems, or some others I'm probably not even thinking of, then forward might be your solution. But calling out all the old problems with deferred isn't relevant, as they've been solved for the most part.

Many have solved transparency with deferred, Epic and Avalanche among them. Anti-aliasing is also doable. Multiple BRDFs are handled straightforwardly in deferred. You also have direct access to all those buffers should you need anything, and you don't have to worry about processing pixels you can't see. And most modern hardware, including the 4th-gen iPad and Tegra 4 from what I've heard, has enough bandwidth and memory to get some sort of deferred done, though if you're doing thousands and thousands of lights, mobile probably isn't your target platform anyway.

I don't remember Avalanche using deferred shading in its titles. Which titles use it?

 

Handling transparency... nice way of saying "solved". Switching to forward is not a "solution", and neither is using light-accumulation approaches. It's a workaround. Anti-aliasing is doable, but at a gigantic cost. I'm talking about MSAA and CSAA (SSAA is always expensive), not about FXAA & co., which are a cheap trick.

As for multiple BRDFs, it's not straightforward in deferred. It needs an extra cost in the MRT to store a material ID, and you either use branching in your code and pray for high branch coherency (a low-frequency image) to get the best BRDFs (Cook-Torrance, Oren-Nayar, Phong, Blinn-Phong, Strauss, etc.) at decent speed, or resort to texture array approaches (which produce very interesting/creative results that I love, but aren't optimal for those seeking photorealism).

 

So, no, I wouldn't call the old deferred problems "solved".

I don't see any reason not to go deferred

Forward vs Deferred arguments are silly and useless out of context, because different games are better suited to different pipelines. There is no one-pipeline-to-rule-them-all, and as a side-rant: any engine that lists "deferred shading" on its feature list is missing the point (an engine should give you the tools to build different pipelines, and a deferred rendering pipe should be in the engine samples/examples, not the core).

 

There are still many games shipping today that use "traditional forward" rendering, and almost every game is a hybrid, where some calculations are deferred and others aren't.
Choosing where to put calculations in your graphics pipeline is an optimization problem, which means it's unsolvable except in the context of your particular data.

 

e.g. on my last game, we calculated shadow data in screen-space for some objects (Deferred Shadow Maps), and also used deferred decals, then forward rendered everything, then calculated shadow data in screen-space for some other objects, then applied these 2nd shadow results to the forward-rendered lighting data to get the final lighting buffer.

That's not traditional forward or deferred rendering. Vanilla doesn't work for most games.

 

Note that Forward+ (aka Clustered Forward, Light Indexed Deferred) is a very new topic and there's a lot of research coming up this year.

The original version (light-indexed deferred) has actually been around for 5 years or so, and is even very easy to implement on DX9! However, DX11 has made these kinds of forward renderers easier and more efficient to implement, with fewer restrictions too, so the idea is making a big comeback ;)

Edited by Hodgman


The reason a lot of games went deferred is that it's not possible to go forward on current consoles. Dynamic branching etc. would just kill you, and you don't really benefit from it, as most games are not rendering at insane AA resolutions. That might change on the next generation; those consoles will probably be very much like PCs, where you don't worry about branching but do want to support high AA resolutions without paying the cost of shading every subsample.

So the question of whether you go deferred or forward also depends very much on what your hardware has to offer (besides the question of what you're trying to achieve).


The reason a lot of games went deferred is that it's not possible to go forward on current consoles.

Many current-gen console games are forward, and forward has stuck around because it's very hard to go deferred on current-gen consoles... The amount of bandwidth required kills you. Even 16-bit HDR (64bpp) is a huge burden on these consoles.

The more advanced games are, the more likely they are to be deferred. The reason is that it's not possible to get that number of light-surface interactions quickly with forward rendering. As you said, deferred would seem more demanding, yet it's the only way to go if you want flexibility.

Not really; deferred might have solved some problems with regards to lights but it brought with it a whole host of others with regards to memory bandwidth, AA issues, problems integrating different BRDFs, transparency and other issues which required various hoops to be jumped through.

Going forward, hybrid solutions are likely to become the norm, such as AMD's Leo demo, which mixes deferred aspects with a forward rendering pass to do the real geometry rendering, and which can get around pretty much all of those problems (but brings its own compromises).

The point is: all rendering has trade-offs, and you'll find plenty of "advanced" engines which use various rendering methods - hell, the last game I worked on was all forward lit using baked lighting and SH light probes because it was the only way we were going to hit 60fps on the consoles.

Edit: also, a good and advanced engine WON'T force you to take one rendering path; it will let the game code decide (the engine powering the aforementioned game can support deferred as well as forward at least...)

Edited by phantom

The more advanced games are, the more likely they are to be deferred. The reason is that it's not possible to get that number of light-surface interactions quickly with forward rendering. As you said, deferred would seem more demanding, yet it's the only way to go if you want flexibility.

 
What does 'advanced' mean? Huge numbers of dynamic lights? You can do just as many lights with forward as long as you've got a decent way of solving the classic issue of determining which objects are affected by which lights. Actually, the whole point of tiled-deferred was that it was trying to reduce lighting bandwidth back down to what we had with forward rendering, while keeping the "which light for which object" calculations in screen-space on the GPU.
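For reference, the classic per-object version of the "which light for which object" test can be sketched in a few lines of Python. This is only an illustration under assumptions: `lights_affecting` is a hypothetical helper, objects and lights are approximated by bounding spheres, and the cap of 8 lights is an arbitrary stand-in for whatever limit the forward shader supports.

```python
import math


def lights_affecting(obj_center, obj_radius, lights, max_lights=8):
    """Return indices of lights whose sphere overlaps the object's sphere.

    lights is a list of (center, radius) pairs. Results are sorted nearest
    first and truncated to max_lights, mimicking a fixed-size forward shader.
    """
    hits = []
    for index, (center, radius) in enumerate(lights):
        distance = math.dist(obj_center, center)
        if distance <= obj_radius + radius:
            hits.append((distance, index))
    hits.sort()
    return [index for _, index in hits[:max_lights]]


lights = [((0, 0, 0), 1.0), ((10, 0, 0), 1.0), ((2, 0, 0), 1.5)]
affecting = lights_affecting((1, 0, 0), 0.5, lights)
```

This runs once per object on the CPU, which is why it gets expensive with many large objects; the tiled variants move essentially the same overlap test to per-tile granularity in screen space.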
 
If your environment is static, then you can bake all the lighting (and probes) and it'll be a ton faster than any other approach! ;)
Most console games are still using static, baked lighting for most of the scene, which reduces the need for huge dynamic light counts.
 
Another issue with deferred is that it's very hard to do at full 720p on the 360. The 360 only has 10MiB of EDRAM, where your frame-buffers have to live. Let's say you optimize your G-buffer layout so you've got hardware depth/stencil, and two 8888 targets -- that's 3 targets * 4 bytes per pixel * 1280*720, or ~10.5MiB -- that's over the limit and won't fit.

n.b. these numbers are the same as depth/stencil + FP16_16_16_16, which also makes forward rendering or deferred light accumulation difficult in HDR...
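The arithmetic is easy to check. A small Python sketch, assuming 4 bytes per pixel for depth/stencil and for each 8888 target, 8 bytes per pixel for an FP16 RGBA target, and no MSAA or tiling overhead (`framebuffer_bytes` is just a helper name for this example):

```python
WIDTH, HEIGHT = 1280, 720
EDRAM_BYTES = 10 * 1024 * 1024  # 10 MiB


def framebuffer_bytes(bytes_per_pixel_per_target):
    """Total size of a set of render targets at WIDTH x HEIGHT."""
    return WIDTH * HEIGHT * sum(bytes_per_pixel_per_target)


# Depth/stencil + two 8888 colour targets.
gbuffer = framebuffer_bytes([4, 4, 4])
# Depth/stencil + one FP16_16_16_16 colour target.
hdr = framebuffer_bytes([4, 8])

gbuffer_mib = gbuffer / 2**20
over_budget = gbuffer - EDRAM_BYTES
```

Both layouts come to 11,059,200 bytes (about 10.55 MiB), which is 573,440 bytes more than the 10,485,760 bytes of EDRAM.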

Sure, Crysis, Battlefield 3 and Killzone are deferred, but there's probably many more games that use forward rendering, even "AAA" games, like Gears of War (and most other Unreal games), L4D2 (and other Source games), God of War, etc... Then there's the games that have gone deferred-lighting (LPP) as a half-way choice, such as GTA4 (and many other Rockstar games), Space Marine, etc...
 
Regarding materials, forward is unarguably more flexible -- each object can have unique BRDFs, unique lighting models, and any number of lights. It's just inefficient if you've got lots of small objects (due to shader swapping overhead and bad quad efficiency), or lots of big objects (due to the "which light for which object" calculations being done per-object).
Actually, you mentioned dynamic branches before, but forward rendering doesn't need any; all branches should be able to be determined at compile time. On the other hand, implementing multiple BRDFs in a deferred renderer requires some form of branching (or look-up-tables, which are just as bad).
 
Also, tiled-deferred and tiled-forward are implementable on current-gen hardware (even DX9 PC if you're careful), so there's no reason we won't see it soon ;)

As usual, there's no single objectively better pipeline; different games have different requirements, which are more efficiently met with one pipeline or another...

Edited by Hodgman

A little off topic but still on topic: does anyone have any links to good tutorials on deferred vs forward rendering? I've read a fair bit about the details of deferred, but would rather get a good grounding in it before looking into it further. I couldn't find any decent sites covering 'why deferred' other than 'you can have more lights'.

Apologies for borrowing this thread quickly...

Not really; deferred might have solved some problems with regards to lights but it brought with it a whole host of others with regards to memory bandwidth, AA issues, problems integrating different BRDFs, transparency and other issues which required various hoops to be jumped through.

Exactly. One would think: with no MSAA (for shading), no solution for alpha blending, problems getting different BRDFs running, and high memory storage and bandwidth costs, why on earth would anyone do that?

Simply because current-gen console hardware does not offer another way to create the worlds that players, designers and artists expect, where you have tons of dynamic lights and even particles light the nearby geometry.

The more advanced games are, the more likely they are to be deferred. The reason is that it's not possible to get that number of light-surface interactions quickly with forward rendering. As you said, deferred would seem more demanding, yet it's the only way to go if you want flexibility.

 
What does 'advanced' mean? Huge numbers of dynamic lights? You can do just as many lights with forward as long as you've got a decent way of solving the classic issue of determining which objects are affected by which lights. Actually, the whole point of tiled-deferred was that it was trying to reduce lighting bandwidth back down to what we had with forward rendering, while keeping the "which light for which object" calculations in screen-space on the GPU.

'Advanced' means there are no tech-imposed limits on light-surface interactions. Deferred shading has a lot of 'points', not just this one.

- You had to reduce shader combination counts. Imagine that, even if your forward solution were fast enough, you could have 0 to 100 lights affecting a surface; that means you need 100 times the permutations of a shader library that isn't small already. (And no, sadly dynamic branching is not a solution on current-gen HW, and even static branching is not a solution, as your shader will grow by some % and your register usage will increase as well, and we graphics coders don't want to pay those ms that we could spend elsewhere. Yes, it's a performance reason.)

- Complexity of light resources: there are simple lights, area lights, projector lights, shadow-mapped lights, a sun, and light streaks (e.g. particles, laser beams). If you wanted to go forward, you'd need to index into all the needed resources, like textures and constants, and current-gen HW does not really support that. Creating atlases is also not very feasible; you'd need to spend a lot of time moving memory to rearrange data for each object you draw (and you'd still face tight limits on current gen).

 

You can find some more reasons people went deferred in:

http://www.crytek.com/download/A_bit_more_deferred_-_CryEngine3.ppt

If your environment is static, then you can bake all the lighting (and probes) and it'll be a ton faster than any other approach! ;)
Most console games are still using static, baked lighting for most of the scene, which reduces the need for huge dynamic light counts.

And even those engines that eliminate a vast number of lights this way, like UE3 using Lightmass, have problems applying those lights to dynamic objects. In UE3 they use spherical harmonics to combine them, just like KZ2 does for baked lights. Lightmaps are really just orthogonal to forward/deferred.

http://www.unrealengine.com/files/downloads/GDC09_Smedberg_RenderingTechniques.pdf

AFAIK the real-time shadows in UE3 are claimed to be deferred, as that's the only reason why UE3 does not cope well with MSAA.

Another issue with deferred is that it's very hard to do at full 720p on the 360. The 360 only has 10MiB of EDRAM, where your frame-buffers have to live. Let's say you optimize your G-buffer layout so you've got hardware depth/stencil, and two 8888 targets -- that's 3 targets * 4 bytes per pixel * 1280*720, or ~10.5MiB -- that's over the limit and won't fit.

n.b. these numbers are the same as depth/stencil + FP16_16_16_16, which also makes forward rendering or deferred light accumulation difficult in HDR...

Exactly, yet another reason why it is a very unfavorable idea to go deferred on 360. Why would anyone do that? Because the alternative just does not work (for the reasons given above). Sure, if you make a racing game like Gran Turismo, with just one light source and maybe some spherical harmonics evaluation in the VS for nicer ambient/radiosity, there's no reason to go deferred. Even an outdoor shooter like Just Cause can live with forward, I guess. But as soon as you want more advanced lighting, like Gears of War, GTA, Crysis, Stalker, ... you can't go forward on current gen. On next gen, I imagine something like what AMD did in Leo is very doable.

Sure, Crysis, Battlefield 3 and Killzone are deferred, but there's probably many more games that use forward rendering, even "AAA" games, like Gears of War (and most other Unreal games), L4D2 (and other Source games), God of War, etc... Then there's the games that have gone deferred-lighting (LPP) as a half-way choice, such as GTA4 (and many other Rockstar games), Space Marine, etc...

Crysis is forward shaded with up to 16 lights per object (check the insane amount of shader space they use ;) ). Crysis 2 uses deferred lighting like GTA. UE3 games are neither what we would call deferred nor forward; they're spherical-harmonic based like KZ2. Battlefield 3 goes for the (deferred) light indexing/tiling approach. As that apparently isn't doable on the RSX, they spend their SPUs on it instead, yet it's the first step towards light indexing, IMO.

Regarding materials, forward is unarguably more flexible -- each object can have unique BRDFs, unique lighting models, and any number of lights. It's just inefficient if you've got lots of small objects (due to shader swapping overhead and bad quad efficiency), or lots of big objects (due to the "which light for which object" calculations being done per-object).

That's the vanilla version, and that's where clustered/tiled forward shading comes in ;)

 

Actually, you mentioned dynamic branches before, but forward rendering doesn't need any; all branches should be able to be determined at compile time. On the other hand, implementing multiple BRDFs in a deferred renderer requires some form of branching (or look-up-tables, which are just as bad).

That would explain why most deferred games on console have just one lighting term; even the nanosuit in Crysis 2 looks like it's missing the anisotropic metal shading of Crysis 1.

Dynamic branching is needed in the first place to skip unneeded light calculations: if the surface is backfacing, in shadow, or out of range, move on to the next light. Even on my mobile phones this gives a boost if I use a fixed set of lights per drawn object. On DX9 hardware it did skip pixels, but the general overhead of the branching cancelled out the gain (it was around 10 cycles more per shader: 6 due to branching and some more because the loop had the overhead of storing/restoring registers; validated with FX Composer back then).
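The early-out pattern described here looks roughly like the following, written as plain Python rather than shader code. It is a sketch under assumptions: `shade_pixel` is a hypothetical function, lights are dictionaries with `position`, `range` and `intensity`, the shadow test is a caller-supplied callback, and the linear falloff is an arbitrary choice for the example.

```python
def shade_pixel(position, normal, lights, in_shadow=lambda light: False):
    """Accumulate simple diffuse lighting, skipping lights that cannot contribute.

    Returns (diffuse_sum, lights_evaluated).
    """
    total = 0.0
    evaluated = 0
    for light in lights:
        to_light = tuple(l - p for l, p in zip(light["position"], position))
        dist_sq = sum(c * c for c in to_light)
        if dist_sq >= light["range"] ** 2:
            continue  # out of range -> next light
        n_dot_l = sum(n * c for n, c in zip(normal, to_light))
        if n_dot_l <= 0.0:
            continue  # backfacing -> next light
        if in_shadow(light):
            continue  # shadowed -> next light
        dist = dist_sq ** 0.5
        falloff = 1.0 - dist / light["range"]  # assumed linear falloff
        total += light["intensity"] * (n_dot_l / dist) * falloff
        evaluated += 1
    return total, evaluated


lights = [
    {"position": (0.0, 2.0, 0.0), "range": 4.0, "intensity": 1.0},   # contributes
    {"position": (0.0, -2.0, 0.0), "range": 4.0, "intensity": 1.0},  # backfacing
    {"position": (0.0, 10.0, 0.0), "range": 4.0, "intensity": 1.0},  # out of range
]
result = shade_pixel((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), lights)
```

Only the first of the three lights is actually evaluated. Whether skipping the other two is a net win depends on how cheap the branch is on the target hardware, which is the point made above about DX9-era GPUs.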

 

 

Also, tiled-deferred and tiled-forward are implementable on current-gen hardware (even DX9 PC if you're careful), so there's no reason we won't see it soon ;)

As usual, there's no single objectively better pipeline; different games have different requirements, which are more efficiently met with one pipeline or another...

I'm just saying that going for top-notch lighting/shading (i.e. not just radiosity baked into lightmaps, and not just one light source in the world plus cubemaps/spherical harmonics for dynamic objects) made all engines go deferred on this generation of consoles. I can't think of any forward-rendered game with lighting competitive with Dead Space, Crysis, or GTA, besides maybe God of War, and there you could clearly identify artifacts from lights being merged per vertex once you exceeded some count (I'd guess 3 dynamic lights).

A little off topic but still on topic: does anyone have any links to good tutorials on deferred vs forward rendering? I've read a fair bit about the details of deferred, but would rather get a good grounding in it before looking into it further. I couldn't find any decent sites covering 'why deferred' other than 'you can have more lights'.

Apologies for borrowing this thread quickly...

I think that's a good start:

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09.html

A little off topic but still on topic: does anyone have any links to good tutorials on deferred vs forward rendering? I've read a fair bit about the details of deferred, but would rather get a good grounding in it before looking into it further. I couldn't find any decent sites covering 'why deferred' other than 'you can have more lights'.

Apologies for borrowing this thread quickly...

I think that's a good start:

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09.html

 

That link just reinforces his belief that 'why deferred' is just 'you can have more lights'.

Effectively, that's the main reason it appeared, and that's the main reason it's still strong.

 

There are other side effects that are good:

  1. The G-buffer data can be very useful for screen-space effects (e.g. normals can be used for AO, refraction mapping, and local reflections; depth can be used for god rays, fog, and DOF). Even if you do forward rendering, you'll probably end up outputting a sort of G-buffer for those effects. Of course, in that case you don't have to do magic to compress a lot of parameters into the MRT that you won't need in the post-processing passes (like the specular colour term).
  2. Shading complexity becomes screen-dependent. This benefit/disadvantage (depending on the application) is shared with Forward+. Assuming just one directional light is used, every pixel is shaded once. In a forward renderer, if you render everything back to front, every pixel covered by a triangle will be shaded multiple times. Hence a deferred shader's time is fixed and depends on screen resolution (so a lower screen resolution is an instant win for low-end users). A deferred shader/Forward+ cannot shade more than (num_lights * width * height) pixels even if there is an infinite number of triangles, whereas the forward renderer may shade the same pixel an infinite number of times for an infinite number of triangles, overwriting its previous value. Of course, if you're very good at sorting your triangles (chances are the game cannot be that good), a forward renderer may perform faster; but with a deferred shader you're on more stable ground.
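The overdraw argument in point 2 can be illustrated with a toy count of lighting evaluations. This Python sketch is an assumption-laden model, not a renderer: each pixel is just a list of fragment depths, the forward path has a depth test but no z-prepass, and the function names are made up for the example. Note that a deferred renderer still writes the G-buffer for overdrawn fragments; only the lighting cost is bounded.

```python
def forward_shading_count(pixel_depths, back_to_front):
    """Count fragments that pass the depth test and therefore get shaded."""
    shaded = 0
    for depths in pixel_depths:
        nearest = float("inf")
        for depth in sorted(depths, reverse=back_to_front):
            if depth < nearest:  # depth test passes, fragment is lit
                nearest = depth
                shaded += 1
    return shaded


def deferred_shading_count(pixel_depths, num_lights=1):
    """One lighting evaluation per covered pixel per light, whatever the overdraw."""
    return num_lights * sum(1 for depths in pixel_depths if depths)


# Four pixels: three, one, zero and two overlapping fragments respectively.
pixel_depths = [[0.2, 0.5, 0.9], [0.4], [], [0.1, 0.3]]
worst_forward = forward_shading_count(pixel_depths, back_to_front=True)
best_forward = forward_shading_count(pixel_depths, back_to_front=False)
deferred = deferred_shading_count(pixel_depths)
```

With these inputs, back-to-front forward rendering lights 6 fragments, perfectly sorted front-to-back forward lights 3, and deferred lights 3 regardless of draw order.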

Edit: As for the "more lights" argument, keep in mind that a deferred shader can easily take 5000 lights (as long as they're small) while a forward renderer can max out at 8-16 lights per object.

Edited by Matias Goldberg

Very insightful guys, thanks. My renderer is nicely abstracted so I might give it a go. My game only requires one directional light at the moment but I still see the plus with effects like AO, etc

Anyone know which method the Call of Duty engines use?


    • By mellinoe
      Hi all,
      First time poster here, although I've been reading posts here for quite a while. This place has been invaluable for learning graphics programming -- thanks for a great resource!
      Right now, I'm working on a graphics abstraction layer for .NET which supports D3D11, Vulkan, and OpenGL at the moment. I have implemented most of my planned features already, and things are working well. Some remaining features that I am planning are Compute Shaders, and some flavor of read-write shader resources. At the moment, my shaders can just get simple read-only access to a uniform (or constant) buffer, a texture, or a sampler. Unfortunately, I'm having a tough time grasping the distinctions between all of the different kinds of read-write resources that are available. In D3D alone, there seem to be 5 or 6 different kinds of resources with similar but different characteristics. On top of that, I get the impression that some of them are more or less "obsoleted" by the newer kinds, and don't have much of a place in modern code. There seem to be a few pivots:
      The data source/destination (buffer or texture) Read-write or read-only Structured or unstructured (?) Ordered vs unordered (?) These are just my observations based on a lot of MSDN and OpenGL doc reading. For my library, I'm not interested in exposing every possibility to the user -- just trying to find a good "middle-ground" that can be represented cleanly across API's which is good enough for common scenarios.
      Can anyone give a sort of "overview" of the different options, and perhaps compare/contrast the concepts between Direct3D, OpenGL, and Vulkan? I'd also be very interested in hearing how other folks have abstracted these concepts in their libraries.
    • By aejt
      I recently started getting into graphics programming (2nd try, first try was many years ago) and I'm working on a 3d rendering engine which I hope to be able to make a 3D game with sooner or later. I have plenty of C++ experience, but not a lot when it comes to graphics, and while it's definitely going much better this time, I'm having trouble figuring out how assets are usually handled by engines.
      I'm not having trouble with handling the GPU resources, but more so with how the resources should be defined and used in the system (materials, models, etc).
      This is my plan now, I've implemented most of it except for the XML parts and factories and those are the ones I'm not sure of at all:
      I have these classes:
      For GPU resources:
      Geometry: holds and manages everything needed to render a geometry: VAO, VBO, EBO. Texture: holds and manages a texture which is loaded into the GPU. Shader: holds and manages a shader which is loaded into the GPU. For assets relying on GPU resources:
      Material: holds a shader resource, multiple texture resources, as well as uniform settings. Mesh: holds a geometry and a material. Model: holds multiple meshes, possibly in a tree structure to more easily support skinning later on? For handling GPU resources:
      ResourceCache<T>: T can be any resource loaded into the GPU. It owns these resources and only hands out handles to them on request (currently string identifiers are used when requesting handles, but all resources are stored in a vector and each handle only contains resource's index in that vector) Resource<T>: The handles given out from ResourceCache. The handles are reference counted and to get the underlying resource you simply deference like with pointers (*handle).  
      And my plan is to define everything into these XML documents to abstract away files:
      Resources.xml for ref-counted GPU resources (geometry, shaders, textures) Resources are assigned names/ids and resource files, and possibly some attributes (what vertex attributes does this geometry have? what vertex attributes does this shader expect? what uniforms does this shader use? and so on) Are reference counted using ResourceCache<T> Assets.xml for assets using the GPU resources (materials, meshes, models) Assets are not reference counted, but they hold handles to ref-counted resources. References the resources defined in Resources.xml by names/ids. The XMLs are loaded into some structure in memory which is then used for loading the resources/assets using factory classes:
      Factory classes for resources:
      For example, a texture factory could contain the texture definitions from the XML containing data about textures in the game, as well as a cache containing all loaded textures. This means it has mappings from each name/id to a file and when asked to load a texture with a name/id, it can look up its path and use a "BinaryLoader" to either load the file and create the resource directly, or asynchronously load the file's data into a queue which then can be read from later to create the resources synchronously in the GL context. These factories only return handles.
      Factory classes for assets:
      Much like for resources, these classes contain the definitions for the assets they can load. For example, with the definition the MaterialFactory will know which shader, textures and possibly uniform a certain material has, and with the help of TextureFactory and ShaderFactory, it can retrieve handles to the resources it needs (Shader + Textures), setup itself from XML data (uniform values), and return a created instance of requested material. These factories return actual instances, not handles (but the instances contain handles).
       
       
      Is this a good or commonly used approach? Is this going to bite me in the ass later on? Are there other more preferable approaches? Is this outside of the scope of a 3d renderer and should be on the engine side? I'd love to receive and kind of advice or suggestions!
      Thanks!
    • By nedondev
      I 'm learning how to create game by using opengl with c/c++ coding, so here is my fist game. In video description also have game contain in Dropbox. May be I will make it better in future.
      Thanks.
    • By Abecederia
      So I've recently started learning some GLSL and now I'm toying with a POM shader. I'm trying to optimize it and notice that it starts having issues at high texture sizes, especially with self-shadowing.
      Now I know POM is expensive either way, but would pulling the heightmap out of the normalmap alpha channel and in it's own 8bit texture make doing all those dozens of texture fetches more cheap? Or is everything in the cache aligned to 32bit anyway? I haven't implemented texture compression yet, I think that would help? But regardless, should there be a performance boost from decoupling the heightmap? I could also keep it in a lower resolution than the normalmap if that would improve performance.
      Any help is much appreciated, please keep in mind I'm somewhat of a newbie. Thanks!
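      As a rough aid to the question above, the uncompressed storage sizes can at least be compared. This is only a sketch: `textureBytes` is a hypothetical helper, the 2048x2048 size is an assumed example, and it counts a single mip level. It says nothing about how a particular GPU's texture cache behaves, which is what actually decides whether the fetches get cheaper.

```cpp
#include <cstdint>

// Uncompressed size in bytes of one mip level (hypothetical helper).
constexpr std::uint64_t textureBytes(std::uint64_t width,
                                     std::uint64_t height,
                                     std::uint64_t bytesPerTexel) {
    return width * height * bytesPerTexel;
}

// Assumed example: a 2048x2048 RGBA8 normalmap with height in alpha is 16 MiB.
static_assert(textureBytes(2048, 2048, 4) == 16u * 1024u * 1024u,
              "RGBA8 uses 4 bytes per texel");

// The same heightmap in its own single-channel 8-bit texture is 4 MiB.
static_assert(textureBytes(2048, 2048, 1) == 4u * 1024u * 1024u,
              "8-bit single channel uses 1 byte per texel");

// Stored at half resolution it drops to 1 MiB.
static_assert(textureBytes(1024, 1024, 1) == 1u * 1024u * 1024u,
              "half resolution quarters the texel count");
```

      So the height data touched per fetch is a quarter of the combined texel, and a lower-resolution heightmap shrinks it further; profiling on the target hardware is the only way to know how much of that shows up as a speedup.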
    • By test opty
      Hi,
      I'm trying to learn OpenGL through a website and have worked through it up to this page. The output is a simple triangle. The problem is the complexity.
      I have read that page several times and tried to analyse the code but I haven't understood the code properly and completely yet. This is the code:
       
      #include <glad/glad.h>
      #include <GLFW/glfw3.h>
      #include <C:\Users\Abbasi\Desktop\std_lib_facilities_4.h>
      using namespace std;

      //******************************************************************************

      void framebuffer_size_callback(GLFWwindow* window, int width, int height);
      void processInput(GLFWwindow *window);

      // settings
      const unsigned int SCR_WIDTH = 800;
      const unsigned int SCR_HEIGHT = 600;

      const char *vertexShaderSource = "#version 330 core\n"
          "layout (location = 0) in vec3 aPos;\n"
          "void main()\n"
          "{\n"
          " gl_Position = vec4(aPos.x, aPos.y, aPos.z, 1.0);\n"
          "}\0";

      const char *fragmentShaderSource = "#version 330 core\n"
          "out vec4 FragColor;\n"
          "void main()\n"
          "{\n"
          " FragColor = vec4(1.0f, 0.5f, 0.2f, 1.0f);\n"
          "}\n\0";

      //*******************************

      int main()
      {
          // glfw: initialize and configure
          // ------------------------------
          glfwInit();
          glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
          glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
          glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);

          // glfw window creation
          GLFWwindow* window = glfwCreateWindow(SCR_WIDTH, SCR_HEIGHT, "My First Triangle", nullptr, nullptr);
          if (window == nullptr)
          {
              cout << "Failed to create GLFW window" << endl;
              glfwTerminate();
              return -1;
          }
          glfwMakeContextCurrent(window);
          glfwSetFramebufferSizeCallback(window, framebuffer_size_callback);

          // glad: load all OpenGL function pointers
          if (!gladLoadGLLoader((GLADloadproc)glfwGetProcAddress))
          {
              cout << "Failed to initialize GLAD" << endl;
              return -1;
          }

          // build and compile our shader program
          // vertex shader
          int vertexShader = glCreateShader(GL_VERTEX_SHADER);
          glShaderSource(vertexShader, 1, &vertexShaderSource, nullptr);
          glCompileShader(vertexShader);

          // check for shader compile errors
          int success;
          char infoLog[512];
          glGetShaderiv(vertexShader, GL_COMPILE_STATUS, &success);
          if (!success)
          {
              glGetShaderInfoLog(vertexShader, 512, nullptr, infoLog);
              cout << "ERROR::SHADER::VERTEX::COMPILATION_FAILED\n" << infoLog << endl;
          }

          // fragment shader
          int fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);
          glShaderSource(fragmentShader, 1, &fragmentShaderSource, nullptr);
          glCompileShader(fragmentShader);

          // check for shader compile errors
          glGetShaderiv(fragmentShader, GL_COMPILE_STATUS, &success);
          if (!success)
          {
              glGetShaderInfoLog(fragmentShader, 512, nullptr, infoLog);
              cout << "ERROR::SHADER::FRAGMENT::COMPILATION_FAILED\n" << infoLog << endl;
          }

          // link shaders
          int shaderProgram = glCreateProgram();
          glAttachShader(shaderProgram, vertexShader);
          glAttachShader(shaderProgram, fragmentShader);
          glLinkProgram(shaderProgram);

          // check for linking errors
          glGetProgramiv(shaderProgram, GL_LINK_STATUS, &success);
          if (!success)
          {
              glGetProgramInfoLog(shaderProgram, 512, nullptr, infoLog);
              cout << "ERROR::SHADER::PROGRAM::LINKING_FAILED\n" << infoLog << endl;
          }
          glDeleteShader(vertexShader);
          glDeleteShader(fragmentShader);

          // set up vertex data (and buffer(s)) and configure vertex attributes
          float vertices[] = {
              -0.5f, -0.5f, 0.0f, // left
               0.5f, -0.5f, 0.0f, // right
               0.0f,  0.5f, 0.0f  // top
          };

          unsigned int VBO, VAO;
          glGenVertexArrays(1, &VAO);
          glGenBuffers(1, &VBO);

          // bind the Vertex Array Object first, then bind and set vertex buffer(s),
          // and then configure vertex attributes(s).
          glBindVertexArray(VAO);

          glBindBuffer(GL_ARRAY_BUFFER, VBO);
          glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

          glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*)0);
          glEnableVertexAttribArray(0);

          // note that this is allowed, the call to glVertexAttribPointer registered VBO
          // as the vertex attribute's bound vertex buffer object so afterwards we can safely unbind
          glBindBuffer(GL_ARRAY_BUFFER, 0);

          // You can unbind the VAO afterwards so other VAO calls won't accidentally
          // modify this VAO, but this rarely happens. Modifying other
          // VAOs requires a call to glBindVertexArray anyways so we generally don't unbind
          // VAOs (nor VBOs) when it's not directly necessary.
          glBindVertexArray(0);

          // uncomment this call to draw in wireframe polygons.
          //glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

          // render loop
          while (!glfwWindowShouldClose(window))
          {
              // input
              // -----
              processInput(window);

              // render
              // ------
              glClearColor(0.2f, 0.3f, 0.3f, 1.0f);
              glClear(GL_COLOR_BUFFER_BIT);

              // draw our first triangle
              glUseProgram(shaderProgram);
              glBindVertexArray(VAO); // seeing as we only have a single VAO there's no need to
                                      // bind it every time, but we'll do so to keep things a bit more organized
              glDrawArrays(GL_TRIANGLES, 0, 3);
              // glBindVertexArray(0); // no need to unbind it every time

              // glfw: swap buffers and poll IO events (keys pressed/released, mouse moved etc.)
              glfwSwapBuffers(window);
              glfwPollEvents();
          }

          // optional: de-allocate all resources once they've outlived their purpose:
          glDeleteVertexArrays(1, &VAO);
          glDeleteBuffers(1, &VBO);

          // glfw: terminate, clearing all previously allocated GLFW resources.
          glfwTerminate();
          return 0;
      }

      //**************************************************

      // process all input: query GLFW whether relevant keys are pressed/released
      // this frame and react accordingly
      void processInput(GLFWwindow *window)
      {
          if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS)
              glfwSetWindowShouldClose(window, true);
      }

      //********************************************************************

      // glfw: whenever the window size changed (by OS or user resize) this callback function executes
      void framebuffer_size_callback(GLFWwindow* window, int width, int height)
      {
          // make sure the viewport matches the new window dimensions; note that width and
          // height will be significantly larger than specified on retina displays.
          glViewport(0, 0, width, height);
      }

      As you see, about 200 lines of complicated code only for a simple triangle.
      I don't know which parts are necessary for that output, or what the correct order of instructions is for such a program in general. That starting point is too complex for an OpenGL beginner like me, and I don't know how to get past it. What are your ideas, please? How can I work out both the code and the whole program correctly?
      I wish I had a reference that taught OpenGL through a step-by-step method.