# Aqua Costa

1. ## Geometry Clipmaps - sample heightmap?

The main goal of Geometry Clipmaps is to let you draw terrains at a higher resolution than you can fit in GPU memory (and still have space for the rest of the game assets).

Example: The Witcher 3 height map is 23552x23552 = ~1GB (0.37m resolution, I think there's a typo in the presentation). That is clearly too much memory if you want other assets in your game.

You *probably* don't need full resolution height data to draw distant mountains, so you can use a clipmap.

(source) The top layer in blue is the full res height map. The blue layers below it are the mips (each half the res of the previous one). But you only load the green areas (centered around the camera) into GPU memory.

Continuing with The Witcher 3 as the example: it uses 5 clipmap layers, each with resolution 1024x1024 (e.g. a texture array).

- 1st layer - Full res - 1024 * 0.37m = 378m around the camera (in each direction)
- 2nd layer - Half res - 1024 * 0.74m = 757m around the camera (0.74m because it's half res, so each pixel corresponds to double the distance)
- ...
- 5th layer - 1/16 res - 1024 * 5.92m ~ 6km around the camera

The full map is 23552 * 0.37m ~ 8.7km, so they're able to draw most of the map using only 1024*1024*5*2 bytes ~ 10 MB of height data.

Since you have all the data you need in the clipmap, you don't need as many vertices as the size of the heightmap to render the terrain. Just create a 16x16 patch of vertices (15x15 rectangles), with no UVs, and reuse it to draw the terrain. In the vertex shader, calculate the world position of the current vertex and use that to sample the clipmap at the correct position/layer. Since you only have full res height data close to the camera, as soon as you start to use a different layer of the clipmap, render patches with double the distance between vertices, so each vertex still matches one pixel of terrain data.

You'll run into problems at the borders when you start to render patches using a different layer, because the height data won't match perfectly.
This GPU Gems article explains the types of patches to use and how to hide the seams between patches with different levels of detail.

When the camera moves (enough) you need to update some layers of the texture clipmap with data from the full map texture you have on disk. Toroidal (wrap-around) addressing lets you update only the newly exposed parts of the texture, instead of having to move the parts that are still relevant over the old parts and fill the "empty" space with new height data.

Some useful links:

- http://www.gamedev.net/topic/652777-geometry-clipmaps-terrain-tutorial-with-source/
- http://www.vertexasylum.com/downloads/cdlod/cdlod_latest.pdf (good solution for seams between layers; I use a modified version of this in my demos and it works very well; link to full source code at the end of the paper)
2. ## D3D12 Best Practices

You will still need a per frame constant buffer resource. You just won't need a per frame entry in a descriptor table for that CBV.

The only way to not need a constant buffer resource is to store the constants directly inside the root signature, but root signature memory is very limited, so you won't be able to store everything there.
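
To get a feel for how limited that is, here is a small budget calculator. It relies on the documented D3D12 limits (a root signature is at most 64 DWORDs; a root constant costs 1 DWORD, a root descriptor 2 DWORDs, a descriptor table 1 DWORD); the `RootLayout` struct and function names are hypothetical and only serve the illustration.

```cpp
#include <cstdint>

// Documented D3D12 root signature costs, in DWORDs (32-bit values).
constexpr std::uint32_t kRootSignatureMaxDwords   = 64;
constexpr std::uint32_t kDwordsPerRootConstant    = 1;
constexpr std::uint32_t kDwordsPerRootDescriptor  = 2;
constexpr std::uint32_t kDwordsPerDescriptorTable = 1;

// Hypothetical summary of what a root signature contains.
struct RootLayout {
    std::uint32_t root_constants;     // number of 32-bit root constants
    std::uint32_t root_descriptors;   // root CBVs/SRVs/UAVs
    std::uint32_t descriptor_tables;  // descriptor tables
};

constexpr std::uint32_t root_signature_cost(const RootLayout& l) {
    return l.root_constants    * kDwordsPerRootConstant
         + l.root_descriptors  * kDwordsPerRootDescriptor
         + l.descriptor_tables * kDwordsPerDescriptorTable;
}

constexpr bool fits_in_root_signature(const RootLayout& l) {
    return root_signature_cost(l) <= kRootSignatureMaxDwords;
}
```

A single 4x4 float matrix is already 16 root constants, so four matrices fill the whole root signature and leave no room for anything else.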

You can use ID3D11ShaderReflection::GetThreadGroupSize to get the thread group size of a compute shader in C++.

I do it in my engine right after creating the compute shader and store the thread group size together with the compute shader pointer in a struct, so I always have access to the info when I want to use the shader.
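
One place where the stored size is probably handy is computing how many groups to dispatch. The sketch below assumes the three sizes were already read through reflection; `ThreadGroupSize` and `group_count` are hypothetical names, not part of D3D11.

```cpp
#include <cstdint>

// Hypothetical struct holding the values reported by shader reflection.
struct ThreadGroupSize {
    std::uint32_t x, y, z;
};

// Number of thread groups needed to cover `work_items` along one axis,
// rounding up so a partially filled last group is still dispatched.
inline std::uint32_t group_count(std::uint32_t work_items,
                                 std::uint32_t group_size) {
    return (work_items + group_size - 1) / group_size;
}
```

For a 1920x1080 target and a 16x16x1 shader this yields 120 x 68 groups; the shader then has to ignore the threads that fall outside the 1080 rows.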
4. ## Cheap but realistic lighting engine: Is it a good idea?

You haven't described anything new, but if you came up with that on your own, great!

Many engines and games use lightmaps, just google a bit.

You also, basically, described reflective shadow maps (also here).

So yes, it will work, but you will have to deal with the limitations. E.g.: how will you handle reflections?
5. ## Uses for unordered access views?

There are many possible uses:

1 - Update particle systems on the GPU. Read the previous frame's state (ConsumeStructuredBuffer), run the simulation for the particle and append it to the current frame (AppendStructuredBuffer). Draw using draw indirect. You can also do this in geometry shaders (try both and see which is faster). This also works for other types of geometry like vegetation, water simulation, finding pixels that need bokeh, cloth simulations, etc.

2 - Calculate per tile light lists (forward+ rendering) or do tiled deferred lighting (tile culling and lighting run on compute shaders). The results are stored in UAVs that are then read as SRVs.

3 - Some full screen effects might benefit from compute shader groupshared memory, so you'll use UAVs.

4 - And many more, just google a bit.

Check out this GDC presentation.
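
The consume/append pattern from point 1 can be modelled on the CPU to show the data flow. This is only an analogue: the two `std::vector`s stand in for the consume and append buffers, and the `Particle` layout and the gravity constant are made up for the illustration.

```cpp
#include <vector>

// Hypothetical particle state, reduced to one axis for brevity.
struct Particle {
    float y;     // height
    float vy;    // vertical velocity
    float life;  // seconds left to live
};

// `previous` plays the role of the consume buffer (last frame's state),
// `current` the role of the append buffer (this frame's state).
inline void update_particles(const std::vector<Particle>& previous,
                             std::vector<Particle>& current, float dt) {
    current.clear();
    for (Particle p : previous) {          // "consume" one particle
        p.life -= dt;
        if (p.life <= 0.0f) continue;      // dead particles are not appended
        p.vy -= 9.81f * dt;                // simple gravity step
        p.y  += p.vy * dt;
        current.push_back(p);              // "append" the survivor
    }
}
```

After the update the two buffers swap roles for the next frame, and the number of appended particles is what an indirect draw would use as its count.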
6. ## Dealing with different collision responses in an entity-component system

This is how I would do it:

The CollisionComponent should only contain data related to the physics system (is this a rigid body or a trigger region, friction, restitution, etc.).

PhysicsSystem: does all the physics related math and outputs a list of the collisions that occurred.

    struct Collision {
        handle object_a;
        handle object_b;
        //.... other collision info
    };

CollisionHandlerSystem (this is a high level, game specific system): takes the list of collisions (from above) as input and does the necessary processing depending on the types of objects involved in each collision. This is where you would handle the logic you mentioned. This system could generate a list of messages to be consumed by other systems, like "destroy entity X", "damage entity Y", etc. (good for multi threading), or talk directly to other systems (careful with race conditions).

You could also create another component (CollisionCallbackComponent) and system (or the CollisionHandlerSystem could do it) to call entity specific collision callbacks.

This way you can handle global game logic efficiently and still provide support for custom callbacks.
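
A minimal sketch of the message-generating variant could look like this. Only `Collision` comes from the description above; `EntityType`, `Message`, the "bullet hits player" rule and the lookup of types by handle are assumptions made up for the example.

```cpp
#include <cstdint>
#include <vector>

using handle = std::uint32_t;  // assumed: a handle is an index into `types`

struct Collision {
    handle object_a;
    handle object_b;
};

// Hypothetical game-specific data.
enum class EntityType  { Player, Bullet, Wall };
enum class MessageKind { DestroyEntity, DamageEntity };

struct Message {
    MessageKind kind;
    handle      target;
};

// High level handler: turns raw collisions into messages for other systems.
// Example rule: a bullet is destroyed by anything it hits and damages players.
inline std::vector<Message> handle_collisions(
        const std::vector<Collision>& collisions,
        const std::vector<EntityType>& types) {
    std::vector<Message> messages;
    for (const Collision& c : collisions) {
        // Check both orderings so the rule doesn't depend on who is object_a.
        const handle pairs[2][2] = {{c.object_a, c.object_b},
                                    {c.object_b, c.object_a}};
        for (const auto& p : pairs) {
            if (types[p[0]] != EntityType::Bullet) continue;
            messages.push_back({MessageKind::DestroyEntity, p[0]});
            if (types[p[1]] == EntityType::Player)
                messages.push_back({MessageKind::DamageEntity, p[1]});
        }
    }
    return messages;
}
```

Because the handler only reads its inputs and returns a list, other systems can consume the messages later without the physics code knowing anything about game rules.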

8. ## Want to create a cloud in 3d array

This will provide good control over the overall shape and position of the clouds while keeping a more realistic random appearance.   Same technique used to generate explosions (slide 57), also this video.
9. ## Want to create a cloud in 3d array

EDIT: Turns out I misread the question. Check answers below   You can render it using Volume Rendering techniques.   IMO, the easiest way is to put that data in a 3D texture and ray march in the pixel shader. For each pixel, compute the pixel's view direction and sample the 3D texture at N steps from the near to the far plane.   You can also convert the 3D array into a distance field to improve performance, etc.     Another method
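
As a rough CPU-side illustration of the ray marching idea (in a real renderer this loop would live in the pixel shader and sample a 3D texture), the sketch below steps along a ray through a 3D array and accumulates density. `Volume`, `Vec3` and `ray_march` are hypothetical names, the ray is expressed in voxel units, and nearest-neighbour sampling is assumed for simplicity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense 3D array of densities, x varying fastest.
struct Volume {
    int nx, ny, nz;
    std::vector<float> density;

    // Nearest-neighbour lookup; everything outside the volume is empty.
    float at(int x, int y, int z) const {
        if (x < 0 || y < 0 || z < 0 || x >= nx || y >= ny || z >= nz)
            return 0.0f;
        return density[(static_cast<std::size_t>(z) * ny + y) * nx + x];
    }
};

struct Vec3 { float x, y, z; };

// Sample the volume at `steps` evenly spaced points between t_near and
// t_far along origin + t * dir and accumulate density * step length.
inline float ray_march(const Volume& v, Vec3 origin, Vec3 dir,
                       float t_near, float t_far, int steps) {
    const float dt = (t_far - t_near) / static_cast<float>(steps);
    float accumulated = 0.0f;
    for (int i = 0; i < steps; ++i) {
        const float t = t_near + (static_cast<float>(i) + 0.5f) * dt;
        const int x = static_cast<int>(std::floor(origin.x + dir.x * t));
        const int y = static_cast<int>(std::floor(origin.y + dir.y * t));
        const int z = static_cast<int>(std::floor(origin.z + dir.z * t));
        accumulated += v.at(x, y, z) * dt;
    }
    return accumulated;
}
```

The accumulated value can then be mapped to opacity or colour per pixel; more steps give a smoother result at a higher cost.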

11. ## Direct3D 12 documentation is now public

It can be useful for some effects, like unbird mentioned. Example: rendering light bounding volumes in deferred rendering. You bind a read-only DSV to enable depth testing and an SRV of the same depth texture, which the shader samples to reconstruct the pixel's position.
12. ## Going out of bounds in a rwtexture

Out of bounds UAV reads return 0s and out of bounds writes result in no-ops (nothing is written). More info here (slide 16).

So you shouldn't run into any problems.

I think the only case where you have to be careful is appending to an AppendStructuredBuffer, because an out of bounds append doesn't write anything but still increases the internal counter.
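
If that is the behaviour you get, a counter read back on the CPU (for example after copying it with ID3D11DeviceContext::CopyStructureCount) can be larger than the number of elements that were actually stored. A small guard, with a hypothetical helper name, is enough to handle it:

```cpp
#include <algorithm>
#include <cstdint>

// Number of elements that really exist in an append buffer, given the
// counter value read back and the capacity the buffer was created with.
inline std::uint32_t valid_append_count(std::uint32_t counter,
                                        std::uint32_t capacity) {
    return std::min(counter, capacity);
}
```

Use the clamped value for anything that iterates over or draws the appended elements.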
13. ## Some basic help on fxc (compiling shaders)

You're specifying the 'defines' incorrectly: the command needs a /D NAME=VALUE for each define (the =VALUE part is optional).

So it should be:

    fxc /T ps_3_0 /E VS_function /Fo comptst.fxc /D AMBIENT_HEMI /D FOGGING /D USENORMALMAP /D SPECULAR CR_ushader_v1.fx

If your .fx file contains a vertex shader and a pixel shader you have to compile it twice: once with the target vs_3_0 and the entry point 'your_vs_function_name', and a second time with ps_3_0 and the entry point 'your_ps_function_name'.

That's why your example works with vs_3_0 and not with ps_3_0: you're not setting the correct entry point name for the pixel shader.
14. ## Direct3D 12 documentation is now public

Typically mapped memory is uncached, and so writes will bypass the cache completely. For these cases write combining is used to batch memory accesses.     So I'm assuming that the D3D11 advice of never reading from mapped memory using the CPU (unless it's read-back memory) still applies.   Is this true for all types of GPU accessible memory? Eg: will it be slow to read a descriptor from a descriptor heap using the CPU?   According to Game Engine Architecture, the PS4 GPU can access memory via a cache coherent bus, is something like this available on PC?
15. ## C++ cant find a match for 16 bit float and how to convert 32 bit float to 16 bit one

I've been using this code written by Mike Acton from Insomniac to convert between float and *half*.

You might need to change inline to __inline or the extension from .c to .cpp when working with Visual Studio.

You can use a union to convert float to/from uint32_t:

    union Helper {
        float f;
        uint32_t u;
    };

    Helper helper;
    helper.f = 5.0f;
    uint16_t h = half_from_float(helper.u);

    // and back
    helper.u = half_to_float(h);
    float y = helper.f;
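
Strictly speaking, reading the union member that wasn't written last is undefined behaviour in C++ (it is allowed in C), even though the major compilers support it. If you prefer to stay within the standard, a `memcpy` based helper does the same job; the function names below are my own, not part of the linked code.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a 32-bit unsigned integer.
inline std::uint32_t float_to_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float is expected to be 32 bits wide");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterpret a 32-bit unsigned integer as a float.
inline float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

The result of `float_to_bits` can be passed to `half_from_float`, and the result of `half_to_float` to `bits_to_float`, exactly like the `u` member of the union. Compilers typically optimise the `memcpy` away.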