

Community Reputation

580 Good

About hannesp

  1. If you don't want cascades at all, is your question only about multiple lights and shadow maps? If I got that right, the answer is simple: render a shadow map for each light. You can use array textures or a dedicated texture per light. In your lighting step, use the corresponding shadow map when iterating over your lights. Where exactly does your problem lie?
  2. Why isn't anybody here talking about using subroutines (OpenGL; I don't know if there's an equivalent in other APIs)? I think this approach could easily be extended to per-object shaders: more flexible than material types, and easier to implement than the Naughty Dog approach. It would be nice if anyone has experience to share.
  3. hannesp

    Rendering Architecture

    The simplest approach in your case would be to add a set of supported "render types" and a visibility check. Create an enum and let each of your renderables return one of its values. Maybe you can tie the way something is rendered to the enum values. You could then group your renderables by render type and render them in batches. This would be easy to integrate, and adding renderable types would be easy too. It would not be suitable if you have a strong need to override the behaviour per renderable. The same goes for visibility determination: before you call "render" on something, check isVisible(Camera), and you have frustum culling implemented. It gets more complicated if you want other kinds of culling, though.

    Besides my advice, here is my experience: nowadays, everything is about batched rendering and/or indirect drawing, to avoid GPU state changes and to not waste CPU time on rendering. If you switch from many buffers (i.e. one per renderable) to a single global vertex/index buffer combination, you can bind resources once and fire render commands that only contain some offset values for buffer access. You don't need to bother much about architecture anymore, because at a fixed step, or as often as possible, you take a list of game objects, split them into opaque and non-opaque, update the GPU buffers and render them. Assuming you are using deferred rendering, as most engines seem to do, you are limited in the kinds of materials you can support, which you can tackle with subroutines (OpenGL), for example. Or with a material type enum, which is less flexible.
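
The grouping by render type and the visibility check described above can be sketched roughly as follows. This is only an illustration on the CPU side: `RenderType`, `Renderable`, `Camera`, `is_visible` and the "draw" strings are hypothetical names, and the one-dimensional distance test merely stands in for a real frustum test.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum, auto


class RenderType(Enum):
    # Hypothetical set of supported render types; definition order is draw order.
    OPAQUE = auto()
    TRANSPARENT = auto()


@dataclass
class Camera:
    # Stand-in for a frustum: only a visible distance range.
    near: float
    far: float


@dataclass
class Renderable:
    name: str
    render_type: RenderType
    distance: float  # distance from the camera along its view direction

    def is_visible(self, camera: Camera) -> bool:
        # Placeholder for a real frustum culling test.
        return camera.near <= self.distance <= camera.far


def draw_opaque(batch):
    return [f"draw opaque {r.name}" for r in batch]


def draw_transparent(batch):
    return [f"draw transparent {r.name}" for r in batch]


# Wires each enum value to the way that type is rendered.
RENDERERS = {
    RenderType.OPAQUE: draw_opaque,
    RenderType.TRANSPARENT: draw_transparent,
}


def render_frame(renderables, camera):
    # Cull first, then group the survivors by render type.
    batches = defaultdict(list)
    for r in renderables:
        if r.is_visible(camera):
            batches[r.render_type].append(r)

    # Render one batch per type, in enum definition order.
    commands = []
    for render_type in RenderType:
        batch = batches.get(render_type, [])
        if batch:
            commands.extend(RENDERERS[render_type](batch))
    return commands
```

Adding a new renderable type then means adding one enum value and one entry in `RENDERERS`; per-renderable overrides are exactly what this layout does not handle well.
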
  4. hannesp

    [SOLVED]Render Thread vs Task System

    Hm... I don't think you have to sacrifice your "sequential order" for any reason. I assume that you are sticking with a "queue" that is used to issue render commands, or actions that require a GPU context (which is not really shared between threads). Keep this queue and let one thread poll it at maximum frequency. In your update thread, you can issue tasks. A task itself has to be able to issue another task when it is finished. Think of it as event-based programming: your command queue gets 5 commands, and when the first is finished, a sixth command is appended to the queue. All commands are still issued in order. No problem at all.

    This approach requires at least two worker pools: one pool for the "GPU-enabled" thread, and a pool for arbitrary tasks. That's what one would usually do, I guess. It works like a charm in my own engine. As for temporary resource creation: don't do it at all. Initialize your component once and keep it until you don't use it any more. Having said that, I'm breaking this rule too, for example for debugging and profiling purposes, and it's no problem at all if you use the described mechanism.
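
A minimal sketch of the queue described above, assuming a single consumer thread stands in for the thread that owns the GPU context. `GpuCommandQueue`, `on_finished` and the command names are made up for illustration, and "executing" a command only records its name.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_STOP = object()  # sentinel that tells the GPU thread to exit


class GpuCommandQueue:
    """One consumer thread that executes commands strictly in FIFO order."""

    def __init__(self):
        self._commands = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.executed = []  # names of executed commands, in execution order

    def start(self):
        self._thread.start()

    def submit(self, name, on_finished=None):
        # Safe to call from any thread, including from inside a command.
        self._commands.put((name, on_finished))

    def _run(self):
        while True:
            item = self._commands.get()
            if item is _STOP:
                self._commands.task_done()
                return
            name, on_finished = item
            self.executed.append(name)  # stand-in for the actual GPU work
            if on_finished is not None:
                on_finished(self)  # a finished command may enqueue a follow-up
            self._commands.task_done()

    def wait_idle(self):
        self._commands.join()

    def shutdown(self):
        self._commands.put(_STOP)
        self._thread.join()


def demo():
    gpu = GpuCommandQueue()
    # Five commands; the first one appends a sixth when it finishes.
    gpu.submit("cmd1", on_finished=lambda q: q.submit("cmd6"))
    for i in range(2, 6):
        gpu.submit(f"cmd{i}")
    gpu.start()
    gpu.wait_idle()

    # Second pool for arbitrary tasks; they hand GPU work over via the queue.
    with ThreadPoolExecutor(max_workers=2) as pool:
        pool.submit(gpu.submit, "upload_mesh").result()
    gpu.wait_idle()
    gpu.shutdown()
    return gpu.executed
```

In `demo()` the sixth command lands behind the five that were already queued, so the order of the original commands is preserved.
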
  5. Sorry for the long wait.

    The implementation is not that difficult. There's the old method: render your geometry with paraboloid projection, which means doing the transformation in your vertex shader. As you might know, (linear) interpolation is done for all pixels covered by a triangle, based on their position on the triangle. That's nice, but it is incorrect if your projection is not linear, and paraboloid projection is not linear. Imagine you have two vertices of your triangle and project them onto a sphere somehow. If you want the attributes for a non-existent vertex in the middle between your two vertices, you can't linearly interpolate between them, because you won't end up with a position on the sphere's surface, but somewhere inside the sphere.

    The problematic step (for simple meshes with insufficient tessellation) is the interpolation of your positions. This step can be deferred and done in the fragment shader. But then you would have to send world space coordinates to your pixel shader, while the transformation happens during your lighting phase. Take a look at and and , to get more details :)
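
To make the interpolation problem concrete, here is a small numeric sketch. It assumes the common paraboloid mapping uv = d.xy / (1 + d.z) for a normalized direction d in the front hemisphere; the function names and the sample points are made up for illustration.

```python
import math


def paraboloid_project(p):
    """Project a view-space point (z > 0) onto the front paraboloid map."""
    x, y, z = p
    length = math.sqrt(x * x + y * y + z * z)
    dx, dy, dz = x / length, y / length, z / length
    return (dx / (1.0 + dz), dy / (1.0 + dz))


def lerp2(a, b, t):
    return tuple(a_i + (b_i - a_i) * t for a_i, b_i in zip(a, b))


# Two vertices of an edge and the true midpoint of that edge.
vertex_a = (0.0, 0.0, 1.0)
vertex_b = (2.0, 0.0, 1.0)
midpoint = (1.0, 0.0, 1.0)

# What the rasterizer does: project the vertices, then interpolate linearly.
interpolated = lerp2(paraboloid_project(vertex_a), paraboloid_project(vertex_b), 0.5)

# What would be correct: project the midpoint itself.
exact = paraboloid_project(midpoint)

error = abs(exact[0] - interpolated[0])
```

With these points the two results differ by roughly 0.1 in u, which is a large error on a map that only spans [-1, 1]. Finer tessellation shortens the edges and shrinks this error, and projecting per fragment avoids it.
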
  6. Unprofessional opinion, based on my experience after implementing exactly THIS: it's not worth it. I wasn't able to get rid of the artifacts, even though some papers claim it is possible. I got the impression that dual paraboloid maps (dpm) depend too much on the tessellation of your geometry. If you have gl_Layer available as a vertex shader output, use layered rendering to a cubemap target, and perhaps use a lower cubemap resolution. Otherwise, do 6-pass rendering, one pass per cubemap face, with per-face culling. Still better results, I bet :)
  7. Hm... and it would be okay to use sparse textures, but not bindless textures? I don't know for sure, but I think the sparse texture extensions came after bindless textures. And if you ask me, it could be problematic to use ARB_sparse_texture without EXT_sparse_texture2, because without the second extension you have no way to figure out in your shader whether a texture region is resident or not. That effectively makes sampling "undefined behaviour", or it returns black or something. Please correct me if I'm wrong. If I didn't overlook something, sparse_texture2 wasn't even available on my GTX 770 desktop, so you would rather end up using array textures only.
  8. You're right, there's one thing missing: you forgot bindless textures. Bindless textures are a different concept from sparse textures. While sparse/virtual textures are potentially large textures that are only partly allocated, bindless textures give you the same advantage as array textures. Basically, you have a texture and generate a global handle for it that you can use everywhere in your shaders without having to bind the texture. Before the handle can be used, the texture has to be made resident. In your case, you could generate a texture handle, save it in your global material buffer and afterwards reference it via your material attributes, or write it to a texture of your G-buffer (not recommended). Instead of indexing into an array texture, where all textures must have the same size and format, you can use the handle directly and sample the texture with the given UVs in a deferred way.

    EDIT: To be clear, you could use sparse textures in combination with this approach. Even if it doesn't make that much sense at first sight, you could make every single texture a sparse one. Virtual textures are nowadays mostly used for very large textures. For example, you could combine all surface textures of a scene into one large texture and make it sparse, loading and unloading small pieces of it at runtime. But this brings other problems you don't want to have. I would recommend that you use bindless textures first. I have used them, they are simple to integrate if your hardware supports them, and they get the job done 100%. The performance overhead shouldn't matter much; in my experience, performance was similar to array textures.
  9. Very impressive result so far, it looks very nice, way better than VCT in my own engine :)

    My first attempt was to simply do forward shading during voxelization. Why do you use RSM? Do you already have them calculated, or do you plan to support more light sources? I think you will have a performance loss when ditching RSM in favor of forward rendering, but the quality/resolution will be the best you can get for your grid resolution.

    I can't help you with your other question, though. Do you have a comparison between anisotropic and isotropic voxels? I would be interested to see the difference it makes in your engine.
  10. Yeah, you're right, accumulation would be faster; it will also be my next approach if I find the time. Do you have any experience with, or information about, the quality of accumulation versus the quality of multisampling-based coverage when sampling higher mipmap levels? Since I'm okay with my current grid resolution (mipmap 0) in terms of popping, I would be interested in how proper coverage calculation adds up after filtering.
  11. Thanks for clearing that up, JoeJ. Are you speaking from experience, did you implement voxelization with multisampling enabled?

    Of course I don't want to give "the only true answer", because there are already so many possible solutions for all the downsides of VCT. From my experience, VCT is so resource-hungry that you want to try the cheapest solutions first. The cheapest solution would probably be to use this approach with the moving average: (take a look at page 310). What JoeJ described in terms of vector packing is given as GLSL code in this article.
  12. "Yes. Just like the traditional SVO approach, I used the diffuse texture's transparency for the voxel opacity, and I also implemented the 6-face anisotropic mipmapping. But the cone tracing result is still not good enough... I already realized that I should use associated (alpha-premultiplied) color for this, but I don't know what exactly 'alpha-blend multiple fragments that fall on one voxel' means. Would you be kind enough to show me more details? :)

    I also noticed that Tomorrow's Children used a 6-face surface voxelization method, which treats the mesh itself as an aniso-voxelized (instead of iso-voxelized) volume. They also pointed out that solid voxelization would be greatly helpful for the cone tracing. Does anybody have related experience with this approach?"

    "Yes, I do realize that there may be some relationship between the MS coverage and the real occupancy of the voxel. But the question is: HOW to implement this? In fact, I have used 8x multisampling in my GPU voxelization implementation. But so far I have not worked out an effective way to derive the correct opacity (density) of a voxel from its 2D sub-pixel coverage.

    In fact, what I really care about is IMPROVING the quality of voxel cone tracing. If traditional GPU voxelization is the best I can do, I will go with it. Maybe I should focus on the cone tracing instead? What do you think? :ph34r:"

    Like with traditional rendering, you can have multiple objects that end up covering the same pixels on the screen. In the case of VCT, you end up with a voxel that is occupied by more than one object, so which object's color do you store in the voxel? The answer is that you store both. You have to take their coverage of the voxel into account, or at least assume a uniform distribution across all objects falling into one voxel. That means if two objects fall into one voxel, the voxel color is objecta.rgb*objecta.a + objectb.rgb*objectb.a, normalized afterwards. Since alpha blending would need information about the current state of your voxel texture in the fragment shader, you would sadly have to sample the texture you are currently rendering to, which is not possible. If you use other instructions, like imageStore, you could do a moving average, for example take a look at this: . Please tell me if you need further advice.
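
The blending of several fragments into one voxel can be sketched on the CPU as a running average. This is only an illustration under two assumptions: colors are accumulated alpha-premultiplied, and "normalizing" means dividing by the accumulated alpha. The names `Voxel`, `accumulate` and `resolve_color` are made up. On the GPU the update step would additionally have to be atomic, because many fragments can hit the same voxel at once.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    # Running average of premultiplied (r*a, g*a, b*a, a) plus a fragment count.
    avg: tuple = (0.0, 0.0, 0.0, 0.0)
    count: int = 0


def accumulate(voxel, rgba):
    """Fold one fragment (straight, non-premultiplied RGBA) into the voxel."""
    r, g, b, a = rgba
    premultiplied = (r * a, g * a, b * a, a)
    new_count = voxel.count + 1
    new_avg = tuple(
        (old * voxel.count + new) / new_count
        for old, new in zip(voxel.avg, premultiplied)
    )
    return Voxel(new_avg, new_count)


def resolve_color(voxel):
    """Normalize the accumulated color by the accumulated alpha."""
    r, g, b, a = voxel.avg
    if a == 0.0:
        return (0.0, 0.0, 0.0)
    return (r / a, g / a, b / a)


# Two objects falling into the same voxel: opaque red and half-transparent blue.
voxel = Voxel()
voxel = accumulate(voxel, (1.0, 0.0, 0.0, 1.0))
voxel = accumulate(voxel, (0.0, 0.0, 1.0, 0.5))
color = resolve_color(voxel)
```

After both fragments the voxel stores an average opacity of 0.75, and the resolved color is two thirds red and one third blue, which matches weighting each object's color by its alpha and normalizing.
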
  13. Isn't the opacity determined by your geometry's opacity? I'm simply using the transparency from my diffuse texture (the alpha component), and it works well for voxelization. Don't forget that you have to take the opacity into account during filtering later on as well.

    EDIT: And don't forget that you also have to alpha-blend multiple fragments that fall on one voxel somehow.
  14. I can (probably?) only answer one question of your post:

    * Yes, having all indices in one buffer is better, if it fits your engine. That means if you don't have any state that you have to set per object, it's best to use one vertex buffer and one index buffer, since you don't have to bind buffers between draw calls. This principle is independent of everything else, since it aims at minimizing driver overhead. If you do have different state to set, you would have to use drawIndirect, but I don't know much about it and have no experience with it. But it should also work better with a single index buffer than with multiple index buffers.

    I would guess that if you use instancing heavily, you are more likely to run out of processing power on the GPU side than to face a problem with index buffer binding. I have only used instancing with multiple index buffers and multiple vertex buffers. Since I got a huge benefit from instancing, my answer would be: take your engine as it is for now, use instancing and decide whether you are satisfied with the result. If not, try to optimize. Using a single buffer for everything can also have downsides, for example when you have to update it with data that no longer fits in its given region.
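
As a rough illustration of the single-buffer idea, the sketch below packs several meshes into one vertex list and one index list and keeps only offsets per mesh. `DrawCommand`, `pack_meshes` and `fetch_vertices` are hypothetical names. The three fields mirror the parameters that base-vertex draw calls such as glDrawElementsBaseVertex take, but nothing here talks to a graphics API.

```python
from dataclasses import dataclass


@dataclass
class DrawCommand:
    index_count: int  # number of indices to draw
    first_index: int  # offset of the mesh's first index in the shared index buffer
    base_vertex: int  # value added to every index before the vertex fetch


def pack_meshes(meshes):
    """meshes: list of (vertices, indices), with indices local to each mesh."""
    vertex_buffer, index_buffer, commands = [], [], []
    for vertices, indices in meshes:
        commands.append(
            DrawCommand(len(indices), len(index_buffer), len(vertex_buffer))
        )
        vertex_buffer.extend(vertices)
        index_buffer.extend(indices)  # indices stay mesh-local
    return vertex_buffer, index_buffer, commands


def fetch_vertices(vertex_buffer, index_buffer, cmd):
    """Emulates the vertex fetch the GPU would do for one draw command."""
    start = cmd.first_index
    stop = start + cmd.index_count
    return [vertex_buffer[cmd.base_vertex + i] for i in index_buffer[start:stop]]


triangle = (["t0", "t1", "t2"], [0, 1, 2])
quad = (["q0", "q1", "q2", "q3"], [0, 1, 2, 2, 3, 0])
vertex_buffer, index_buffer, commands = pack_meshes([triangle, quad])
quad_fetch = fetch_vertices(vertex_buffer, index_buffer, commands[1])
```

Because the indices stay mesh-local and only the offsets differ, both buffers can be bound once and each draw just carries its own `DrawCommand` values. The downside mentioned above shows up as soon as a mesh grows beyond the region it was packed into.
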
  15. hannesp

    Tessellation On The Gpu

    The short answer: yes, it's possible.

    The explanation: you need a way to (comfortably) get the data back from the GPU to the CPU. If you can rely on modern APIs, that usually means a buffer, like a shader storage buffer in OpenGL, which you can write to from all of your shaders (geometry included, I think). Or transform feedback, but nowadays that is kind of outdated if you can use SSBOs.

    The question is: does it make sense? The pipeline needs some sort of synchronization between GPU and CPU. Not to mention that you have to transfer your "modeling actions" to the GPU somehow. And if one action affects multiple vertices at once, things start to get complicated. There's a concept called "pinned memory" (the OpenGL world calls it persistent mapped buffers), where you keep your vertex buffers in persistently mapped memory and do the synchronization yourself. With this concept, you could easily work on your native buffers on the CPU side and synchronize before you draw. That is the easiest and probably fastest way I can imagine.