
Aqua Costa


Community Reputation

3702 Excellent

About Aqua Costa


  1. Aqua Costa

    Compact YCoCg Frame Buffer

    You can download a demo with source code from the paper website, which includes code showing how to unpack the buffer. The simplest method is to sample the current pixel to get the luminance (Y) and one of the chroma values, and then sample one of the neighbouring pixels to get the other chroma value:

        uint2 coord_neighbour = coord;
        coord_neighbour.x += 1;

        float4 sample0 = texture0.Load(uint3(coord, 0));
        float4 sample1 = texture0.Load(uint3(coord_neighbour, 0));

        float y  = sample0.r;
        float co = sample0.g;
        float cg = sample1.g;

        // switch which chroma value is Co/Cg based on pixel position
        if((coord.x & 1) == (coord.y & 1))
        {
            co = sample1.g;
            cg = sample0.g;
        }

        float3 col = YCoCg2RGB(y, co, cg);

    You'll probably want to use one of the more complex reconstruction methods from the paper (check the source code), but this gives you the basic idea.
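The snippet above calls YCoCg2RGB without showing it. As a rough illustration, here is a minimal C++ sketch of the standard YCoCg transform pair. The names `rgb_to_ycocg` and `ycocg_to_rgb` are hypothetical stand-ins, not the demo's code, and the sketch assumes Co and Cg are stored as signed values with no bias applied (the demo may well pack them differently).

```cpp
#include <cassert>
#include <cmath>

struct Rgb   { float r, g, b; };
struct YCoCg { float y, co, cg; };

// Forward transform. For RGB in [0, 1], Y lands in [0, 1] and
// Co/Cg land in [-0.5, 0.5] (assumption: no bias is added).
YCoCg rgb_to_ycocg(Rgb c) {
    return {
         0.25f * c.r + 0.5f * c.g + 0.25f * c.b,
         0.5f  * c.r              - 0.5f  * c.b,
        -0.25f * c.r + 0.5f * c.g - 0.25f * c.b
    };
}

// Inverse transform: the role YCoCg2RGB plays in the shader snippet.
Rgb ycocg_to_rgb(float y, float co, float cg) {
    assert(std::isfinite(y) && std::isfinite(co) && std::isfinite(cg));
    return { y + co - cg, y + cg, y - co - cg };
}
```

Encoding a colour and decoding it again should give back the original RGB values up to floating point error, which is a quick way to sanity check a shader port of the same formulas.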
  2. Aqua Costa

    Geometry Clipmaps - sample heightmap?

    The main goal of Geometry Clipmaps is to let you draw terrains with a higher resolution than fits in GPU memory (while still leaving space for the rest of the game assets).

    Example: The Witcher 3 height map is 23552x23552 = ~1GB at 2 bytes per texel (0.37m resolution, I think there's a typo in the presentation). Clearly too much memory if you want other assets in your game.

    You *probably* don't need full resolution height data to draw distant mountains, so you can use a clipmap.

    (source) The top layer in blue is the full res height map. The blue layers below it are the mips (each half the res of the previous one). But you only load into GPU memory the green areas (centered around the camera).

    Continuing with The Witcher 3 as the example: it uses 5 clipmap layers, each with a resolution of 1024x1024 (eg: a texture array).

      1st layer - Full res - 1024 * 0.37m ~ 379m around the camera (along each axis)
      2nd layer - Half res - 1024 * 0.74m ~ 758m around the camera (0.74m because it's half res, so each pixel corresponds to double the distance)
      ...
      5th layer - 1/16 res - 1024 * 5.92m ~ 6km around the camera

    The full map is 23552 * 0.37m ~ 8.7km, so they're able to draw most of the map using only 1024*1024*5*2 bytes ~ 10 MB of height data.

    Since you have all the data you need in the clipmap, you don't need as many vertices as the heightmap has texels to render the terrain. Just create a 16x16 patch of vertices (15x15 quads), which doesn't need any UVs, and reuse it to draw the terrain. In the vertex shader, calculate the world position of the current vertex and use that to sample the clipmap at the correct position/layer. Since you only have full res height data close to the camera, as soon as you start using a coarser layer of the clipmap, render patches with double the distance between vertices so that each vertex still matches one pixel of terrain data.

    You'll run into problems at the borders where you start to render patches using a different layer, because the height data won't match perfectly. This GPU Gems article explains the types of patches to use and how to hide the seams between patches with different levels of detail.

    When the camera moves (enough) you need to update some layers of the clipmap texture with data from the full map texture you have on disk. Toroidal access lets you update only parts of the texture, instead of having to move the parts that are still relevant over the old parts and then fill the "empty" space with new height data.

    Some useful links: (good solution for seams between layers; I use a modified version of this in my demos and it works very well) (link to full source code at the end of the paper)
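The layer arithmetic and the toroidal addressing described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the constants come from the Witcher 3 numbers quoted in the post, and the function names (`texel_size`, `layer_extent`, `toroidal_index`) are made up for this example rather than taken from any engine.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, taken from the example numbers in the post.
constexpr int   kClipmapSize   = 1024;   // texels per side in every layer
constexpr float kBaseTexelSize = 0.37f;  // metres per texel in layer 0

// World-space size of one texel: each layer halves the resolution,
// so the texel size doubles per layer.
float texel_size(int layer) {
    assert(layer >= 0 && layer < 31);
    return kBaseTexelSize * static_cast<float>(1 << layer);
}

// World-space width covered by one clipmap layer along each axis.
float layer_extent(int layer) {
    return texel_size(layer) * static_cast<float>(kClipmapSize);
}

// Toroidal addressing: a texel of the full map always lands in the same
// clipmap slot, so moving the camera only overwrites rows and columns
// that scrolled out of range instead of shifting the whole texture.
int toroidal_index(std::int64_t map_texel) {
    const std::int64_t m = map_texel % kClipmapSize;
    return static_cast<int>(m < 0 ? m + kClipmapSize : m);
}
```

With these values, layer 0 covers roughly 379m and layer 4 roughly 6km. When the camera moves one texel to the right, only the column at `toroidal_index(new_rightmost_texel)` needs new data.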
  3. Aqua Costa

    D3D12 Best Practices

      You will still need a per frame constant buffer resource. You just won't need a per frame entry in a descriptor table for that CBV.   The only way to avoid needing a constant buffer resource is to store the constants directly inside the root signature, but root signature memory is very limited, so you won't be able to store everything there.
  4. You can use ID3D11ShaderReflection::GetThreadGroupSize to get the thread group size of a compute shader in C++.   I do it in my engine right after creating the compute shader and store the thread group size together with the compute shader pointer in a struct, so I always have access to the info when I want to use the shader.
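To illustrate why keeping the thread group size next to the shader is handy, here is a minimal C++ sketch that derives dispatch dimensions from it. Everything here is hypothetical: `ComputeShaderInfo`, `group_count` and `dispatch_size` are illustrative names, the `void*` member stands in for the `ID3D11ComputeShader*`, and the sizes are assumed to have been filled in once via `GetThreadGroupSize`. No D3D calls are made, so it compiles anywhere.

```cpp
#include <cassert>
#include <cstdint>

struct ThreadGroupSize { std::uint32_t x, y, z; };

// Hypothetical record stored right after creating the shader.
struct ComputeShaderInfo {
    void*           shader;      // stand-in for ID3D11ComputeShader*
    ThreadGroupSize group_size;  // assumed to come from GetThreadGroupSize
};

// Number of groups needed to cover `work_items`, rounding up.
std::uint32_t group_count(std::uint32_t work_items, std::uint32_t threads_per_group) {
    assert(threads_per_group > 0);
    return work_items / threads_per_group + (work_items % threads_per_group != 0 ? 1u : 0u);
}

// Group counts to pass to a Dispatch call for a width x height x depth problem.
ThreadGroupSize dispatch_size(const ComputeShaderInfo& cs,
                              std::uint32_t width, std::uint32_t height, std::uint32_t depth) {
    return { group_count(width,  cs.group_size.x),
             group_count(height, cs.group_size.y),
             group_count(depth,  cs.group_size.z) };
}
```

Because the counts round up, the shader still has to guard against thread IDs that fall outside the actual problem size.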
  5. You haven't described anything new  but if you came up with that on your own, great!   Many engines and games use lightmaps, just google a bit.   You also, basically, described reflective shadow maps (also here).   So yes, it will work but you will have to deal with the limitations. Eg: How will you handle reflections?
  6. Aqua Costa

    Uses for unordered access views?

    There are many possible uses:

    1 - Update particle systems on the GPU. Read from the previous frame state (ConsumeStructuredBuffer), run the simulation for the particle and append to the current frame (AppendStructuredBuffer). Draw using draw indirect. You can also do this in the geometry shader (try both and see which is faster). This also works for other types of geometry like vegetation, water simulation, finding pixels that need bokeh, cloth simulations etc.

    2 - Calculate per tile light lists (forward+ rendering) or tiled deferred lighting (tile culling and lighting run in compute shaders). The results are stored in UAVs that are then read as SRVs.

    3 - Some full screen effects might benefit from compute shader groupshared memory, so you'll use UAVs to write their output.

    4 - and many more, just google a bit.

    Check out this GDC presentation.
  7. This is how I would do it:

    The CollisionComponent should only contain data related to the physics system (is this a rigid body or a trigger region, friction, restitution, etc).

    PhysicSystem: does all the physics related math and outputs a list of the collisions that occurred.

        struct Collision
        {
            handle object_a;
            handle object_b;
            //.... other collision info
        };

    CollisionHandlerSystem (this is a high level, game specific system): takes the list of collisions (from above) as input and does the necessary processing depending on the types of objects involved in the collisions. This is where you would handle the logic you mentioned. This system could generate a list of messages to be consumed by other systems, like "destroy entity X", "damage entity Y", etc (good for multi threading), or talk directly to other systems (careful with race conditions).

    You could also create another component (CollisionCallbackComponent) and system (or the CollisionHandlerSystem could do it) to call entity specific collision callbacks.

    This way you can efficiently handle global game logic and still support custom callbacks.
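A minimal C++ sketch of the message-based variant described above might look like this. It is an illustration under assumptions, not a prescribed design: the object types (`Player`, `Bullet`, `Wall`), the `Message` struct and the `handle_collisions` function are all hypothetical, and handles are assumed to be plain indices into a per-entity type table.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Handle = std::uint32_t;  // assumption: a handle is an index into `types`

enum class ObjectType { Player, Bullet, Wall };

struct Collision { Handle object_a; Handle object_b; };  // output of the physics system

enum class MessageKind { DestroyEntity, DamageEntity };
struct Message { MessageKind kind; Handle target; };

// Game specific handler: turns raw collisions into messages for other systems.
std::vector<Message> handle_collisions(const std::vector<Collision>& collisions,
                                       const std::vector<ObjectType>& types) {
    std::vector<Message> out;
    for (const Collision& c : collisions) {
        assert(c.object_a < types.size() && c.object_b < types.size());
        Handle a = c.object_a, b = c.object_b;
        // Normalise the pair so that a bullet, if present, is always `a`.
        if (types[a] != ObjectType::Bullet) std::swap(a, b);
        if (types[a] != ObjectType::Bullet) continue;  // no bullet involved
        out.push_back({MessageKind::DestroyEntity, a});
        if (types[b] == ObjectType::Player)      out.push_back({MessageKind::DamageEntity, b});
        else if (types[b] == ObjectType::Bullet) out.push_back({MessageKind::DestroyEntity, b});
    }
    return out;
}
```

Because the handler only reads its inputs and returns a fresh message list, it can run on a worker thread and the messages can be applied later at a sync point, which avoids the race conditions mentioned above.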
  8. I don't think there is any up-to-date book on how to manage "scene graph, mesh, animation classes, etc".

    -There's a chapter in GPU Pro 3 (Chapter V.4 Designing a Data-Driven Renderer) that might be helpful (it's short).
    -The Game Engine Architecture chapter on Animation is really good (including some actual code) and should be enough as a starting point for an engine animation system.
    -If you want to learn rendering techniques, start with the (free) GPU Gems books.

    Regarding "scene graph management, etc", you'll be better served by reading other engines' source code, the hundreds of topics about this in the forums, and conference presentations (check GDC Vault, Frostbite, and other engine/company websites).

    There's a recent presentation about Destiny's Multi-threaded Renderer Architecture. It probably isn't easy for a beginner to understand everything, but read it carefully (and multiple times).

    Horde3D source code is small and easy to follow (not the best architecture for current gen, but it's a good start).

    Google "Data Oriented Programming" and read the Bitsquid blog.

    More useful links here
  9. Aqua Costa

    Want to create a cloud in 3d array

      This will provide good control over the overall shape and position of the clouds while keeping a more realistic random appearance.   Same technique used to generate explosions (slide 57), also this video.
  10. Aqua Costa

    Want to create a cloud in 3d array

    EDIT: Turns out I misread the question. Check answers below   You can render it using Volume Rendering techniques.   IMO, the easiest way is to put that data in a 3D texture and ray march in the pixel shader. For each pixel, compute the pixel's view direction and sample the 3D texture at N steps from the near to the far plane.   You can also convert the 3D array into a distance field to improve performance, etc.     Another method
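As a rough CPU-side illustration of the ray marching idea above, here is a minimal C++ sketch. It is an assumption-laden stand-in for the pixel shader version: `Volume` plays the role of the 3D texture with nearest-neighbour sampling, coordinates are in voxel units, and the accumulation uses simple exponential absorption. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Dense density grid standing in for the 3D texture (x varies fastest).
struct Volume {
    int nx, ny, nz;
    std::vector<float> density;

    // Nearest-neighbour lookup in voxel coordinates; 0 outside the grid.
    float sample(Vec3 p) const {
        const int ix = static_cast<int>(std::floor(p.x));
        const int iy = static_cast<int>(std::floor(p.y));
        const int iz = static_cast<int>(std::floor(p.z));
        if (ix < 0 || iy < 0 || iz < 0 || ix >= nx || iy >= ny || iz >= nz) return 0.0f;
        return density[(static_cast<std::size_t>(iz) * ny + iy) * nx + ix];
    }
};

// Takes `steps` samples between t_near and t_far along the ray and
// returns the accumulated opacity in [0, 1].
float ray_march(const Volume& vol, Vec3 origin, Vec3 dir,
                float t_near, float t_far, int steps) {
    assert(steps > 0 && t_far > t_near);
    const float dt = (t_far - t_near) / static_cast<float>(steps);
    float transmittance = 1.0f;
    for (int i = 0; i < steps; ++i) {
        const float t = t_near + (static_cast<float>(i) + 0.5f) * dt;  // midpoint of the step
        const Vec3 p = { origin.x + dir.x * t, origin.y + dir.y * t, origin.z + dir.z * t };
        transmittance *= std::exp(-vol.sample(p) * dt);
    }
    return 1.0f - transmittance;
}
```

In a shader the loop body is the same, with `sample` replaced by a 3D texture fetch and the ray built from the pixel's view direction.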
  11. The only book I know of that (kind of) covers this topic is GPU Pro 3 (Chapter V.4 Designing a Data-Driven Renderer).   There are also a lot of topics about this (look for Hodgman's replies):   Link 1 (you should probably read this one carefully) Link 2 Link 3 Link 4 Link 5 Link 6 Link 7 Link 8 Link 9 Link 10 Link 11 Link 12 Link 13   You should find lots of info in those links.
  12.   It can be useful for some effects, like unbird mentioned. Example: rendering light bounding volumes in deferred rendering. You bind a read-only DSV to enable depth testing, plus an SRV of the same depth texture that the shader samples to reconstruct the pixel's position.
  13. Aqua Costa

    Going out of bounds in a rwtexture

    Out of bounds UAV reads return 0s and writes result in no-ops (nothing is written). More info here (slide 16)   So you shouldn't run into any problems.   I think the only case where you have to be careful is appending to an AppendStructuredBuffer, because an out of bounds append doesn't write anything but still increases the internal counter.
  14. You're specifying the 'defines' incorrectly: the command should have a /D NAME=VALUE for each define (the =VALUE part is optional).

    So it should be:

        fxc /T ps_3_0 /E VS_function /Fo comptst.fxc /D AMBIENT_HEMI /D FOGGING /D USENORMALMAP /D SPECULAR CR_ushader_v1.fx

    If your .fx file contains a vertex shader and a pixel shader, you have to compile it twice: once with the target vs_3_0 and the entry point 'your_vs_function_name', and a second time with ps_3_0 and the entry point 'your_ps_function_name'.

    That's why your example works with vs_3_0 and not with ps_3_0: you're not setting the correct entry point name for the pixel shader.
  15.   Typically mapped memory is uncached, and so writes will bypass the cache completely. For these cases write combining is used to batch memory accesses.     So I'm assuming that the D3D11 advice of never reading from mapped memory using the CPU (unless it's read-back memory) still applies.   Is this true for all types of GPU accessible memory? Eg: will it be slow to read a descriptor from a descriptor heap using the CPU?   According to Game Engine Architecture, the PS4 GPU can access memory via a cache coherent bus, is something like this available on PC?