
Anfaenger
Member
Content Count
51 
Joined

Last visited
Community Reputation
827 Good
About Anfaenger

Rank
Member
Personal Information

Role
Programmer

Interests
Design
Education
Programming

Cascaded Voxel Cone Tracing GI - How to sample sky color?
Anfaenger replied to Anfaenger's topic in Graphics and GPU Programming
Thanks for your informative answer! I found another nice presentation from SIGGRAPH 2019: "Practical dynamic lighting for large-scale game environments".
3D Cascaded Voxel Cone Tracing GI - How to sample sky color?
Anfaenger posted a topic in Graphics and GPU Programming
I've implemented a basic version of Voxel Cone Tracing that uses a single volume texture (covering a small region around the player). But I want to have large, open environments, so I must use some cascaded (LoD'ed) variant of the algorithm.
1) How to inject sky light into the voxels, and how to do it fast? (E.g. imagine a large shadowed area which is lit by the blue sky above.) I think that after voxelizing the scene I will introduce an additional compute shader pass where, from each surface voxel, I trace cones in the direction of the surface normal until they hit the sky (cubemap), but I'm afraid it would be slow with Cascaded Voxel Cone Tracing.
2) How to calculate (rough) reflections from the sky (and distant objects)? If the scene consists of many "reflective" pixels, tracing cones through all cascades would destroy performance.
It looks like Voxel Cone Tracing is only suited for smallish indoor scenes (like Doom 3-style cramped spaces).
DX11 Voxelization (VCT GI) - Handling out-of-bounds writes
Anfaenger posted a topic in Graphics and GPU Programming
I'm implementing single-pass surface voxelization (via Geometry Shader) for Voxel Cone Tracing (VCT), and after some debugging I've discovered that I have to insert out-of-bounds checks into the pixel shader to avoid voxelizing geometry which is outside the voxel grid:

void main_PS_VoxelTerrain_DLoD( VSOutput pixelInput )
{
    const float3 posInVoxelGrid = (pixelInput.position_world - g_vxgi_voxel_radiance_grid_min_corner_world) * g_vxgi_inverse_voxel_size_world;
    const uint color_encoded = packR8G8B8A8( float4( posInVoxelGrid, 1 ) );
    const int3 writecoord = (int3) floor( posInVoxelGrid );
    const uint writeIndex1D = flattenIndex3D( (uint3)writecoord, (uint3)g_vxgi_voxel_radiance_grid_resolution_int );
    // HACK:
    bool inBounds =
        writecoord.x >= 0 && writecoord.x < g_vxgi_voxel_radiance_grid_resolution_int &&
        writecoord.y >= 0 && writecoord.y < g_vxgi_voxel_radiance_grid_resolution_int &&
        writecoord.z >= 0 && writecoord.z < g_vxgi_voxel_radiance_grid_resolution_int ;
    if( inBounds ) {
        rwsb_voxelGrid[writeIndex1D] = color_encoded;
    } else {
        rwsb_voxelGrid[writeIndex1D] = 0xFF0000FF; //RED, ALPHA
    }
}

Why is this check needed, and how can I avoid it? Shouldn't Direct3D automatically clip the pixels falling outside the viewport? (I tried to ensure that out-of-bounds pixels are clipped in the geometry shader and I also enable depthClip in the rasterizer, but it doesn't work.) Here's a picture illustrating the problem (extraneous voxels are highlighted with red): And here is the full HLSL code of the voxelization shader:
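For reference, the bounds test from the shader above can be written more compactly. The following is only a CPU-side C++ sketch of the same check, not an answer to why the rasterizer lets those pixels through. The names `IsInsideGrid` and `FlattenIndex3D` and the x-major index layout are assumptions (the original `flattenIndex3D` is not shown in the post).

```cpp
#include <cstdint>

// Hypothetical helper types and names, for illustration only.
struct Int3 { int x, y, z; };

// True if 'c' lies inside a cubic grid with 'resolution' voxels per side.
// Casting to unsigned folds the 'c >= 0' and 'c < resolution' tests into a
// single comparison per axis: negative values wrap to huge unsigned numbers.
inline bool IsInsideGrid(const Int3& c, std::uint32_t resolution) {
    return static_cast<std::uint32_t>(c.x) < resolution
        && static_cast<std::uint32_t>(c.y) < resolution
        && static_cast<std::uint32_t>(c.z) < resolution;
}

// x-major flattening (an assumed layout). Only call this for coordinates
// that passed IsInsideGrid, otherwise the index points outside the buffer.
inline std::uint32_t FlattenIndex3D(const Int3& c, std::uint32_t resolution) {
    return static_cast<std::uint32_t>(c.x)
         + resolution * (static_cast<std::uint32_t>(c.y)
         + resolution * static_cast<std::uint32_t>(c.z));
}
```

The same unsigned-compare trick should carry over to HLSL with `uint3` casts, and computing the flat index only after the test passes avoids writing through an index derived from out-of-range coordinates.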
Calculate vertex normals for correctly lighting sharp features in a voxel terrain
Anfaenger posted a topic in Graphics and GPU Programming
I have an indexed triangle mesh generated by Dual Contouring. I need to duplicate vertices lying on sharp features (i.e. corners and edges) and assign correct vertex normals ('flat' normals of the faces referencing those sharp vertices). Otherwise, lighting will look incorrect: the objects look odd and 'blobby', see the first picture below, where each vertex is shared by more than one face. [attachment=36223:lighting sharp features.png]
Currently, I build a vertex <-> face adjacency table from the triangle mesh, duplicate each sharp vertex (its feature dimension (corner, edge, plane) is known from Dual Contouring) for each face referencing this vertex, and assign face normals to the new vertices (the third picture in the row; there are also lighting seams across chunks, but that's a different problem). This is fast and easy, but leads to artifacts on sharp edges, where adjacent vertices in smooth regions shouldn't be split, as in the following image: [attachment=36224:sharp edges  lighting bug.png]
I guess I should identify and split crease edges instead of sharp vertices, but, unfortunately, meshes generated by Dual Contouring are not always 2-manifold (e.g. they often have singular vertices (hourglass-like shape) and edges shared by four polygons). How can vertex normals be calculated for correctly lighting sharp features in this case? Is there a quick & dirty, simple and fast solution instead of a generic one for non-manifolds? (For 2-manifold meshes I can quickly build a compact (half-)edge adjacency table using a linear scan and sorting.)
Implementing an efficient memory cache (or choosing an existing library)
Anfaenger posted a topic in General and Gameplay Programming
Hi! I'm writing a procedural voxel->polygon engine and I need to frequently generate/save/load and destroy chunks. I'd like to avoid regenerating, and especially loading (== touching the disk), chunks which have been recently used, by employing a memory cache. Ideally, I'd like to reserve a fixed-size memory block for the 'chunk cache', address cached chunks by their IDs and specify a chunk eviction policy/expiration time. How to best implement such a cache (i.e. fast and efficient, no fragmentation)? Could you give me any references to existing implementations? I found these libraries, but they seem too 'heavy' for my toy voxel engine: http://cachelot.io/ and memcached. Is there anything more lightweight?
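One lightweight option is a small least-recently-used (LRU) cache keyed by chunk ID. The sketch below is a minimal illustration under stated assumptions: `ChunkCache`, `ChunkID` and `ChunkData` are hypothetical names, capacity is counted in entries rather than bytes, and `std::list` allocates per node, so this does not yet meet the "single fixed-size block, no fragmentation" goal. A real version would likely place entries in a preallocated slab.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

using ChunkID = std::uint64_t;               // hypothetical ID type
using ChunkData = std::vector<std::uint8_t>; // hypothetical payload type

// LRU cache: the most recently used entry sits at the front of the list,
// and the entry at the back is evicted when the cache is full.
class ChunkCache {
public:
    explicit ChunkCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the cached chunk or nullptr; a hit marks it most recently used.
    const ChunkData* Find(ChunkID id) {
        auto it = index_.find(id);
        if (it == index_.end()) return nullptr;
        lru_.splice(lru_.begin(), lru_, it->second); // move to front, O(1)
        return &it->second->second;
    }

    // Inserts or overwrites a chunk, evicting the least recently used one.
    void Insert(ChunkID id, ChunkData data) {
        auto it = index_.find(id);
        if (it != index_.end()) {
            it->second->second = std::move(data);
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (lru_.size() == capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.emplace_front(id, std::move(data));
        index_[id] = lru_.begin();
    }

    std::size_t Size() const { return lru_.size(); }

private:
    using Entry = std::pair<ChunkID, ChunkData>;
    std::size_t capacity_;
    std::list<Entry> lru_;
    std::unordered_map<ChunkID, std::list<Entry>::iterator> index_;
};
```

Expiration times could be layered on top by storing a timestamp per entry and treating stale hits as misses.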
Procedural voxel planet generation - precision problems
Anfaenger posted a topic in Graphics and GPU Programming
Hi! I'd like to render a procedurally-generated Earth-sized voxel planet (i.e. not using heightmaps and quadtrees). I have mostly working view-dependent, octree-based, crack-free rendering of a gigantic smooth (i.e. not blocky/Minecraft-ish) isosurface. If I make the world too big (16 LoDs), floating-point precision (?) starts to break down: the surface is no longer smooth but dimpled, edges become jagged, and large ugly-looking black spots appear (wrong normals?).
1) How are such precision problems dealt with when generating very large terrains? Should I simply use 'doubles' as e.g. in the space-sim game Pioneer?
2) My isosurface sampling is slow, it runs on the CPU and is implemented via an abstract 'SDF' interface, with lots of 'virtual's, etc. How should procedural generation be implemented on the GPU? Should I use compute shaders or OpenCL? Should I post new 'GenerateChunk' requests in the current frame and retrieve them in the next frame to avoid stalls? Should I use 'doubles' on the GPU?
Multithreaded frustum culling - how to gather and where to store visibility results?
Anfaenger posted a topic in Graphics and GPU Programming
Hi, I'd like to implement multithreaded frustum culling in my graphics engine, but I don't know how to best collect and 'merge' the results of visibility testing (i.e. an array of visible entities). I'm rewriting the renderer for fast frustum culling (i.e. storing AABBs (in SoA format) and pointers to graphics entities in two contiguous arrays), but I feel that having each worker thread write pairs of {object_type, object_pointer} into a single array would be inefficient because of locking. What is a good (best?) way to gather visibility results from multiple threads and merge them into a single array for further processing? (I'm ditching hierarchical (octree-based) frustum culling in favor of the DOD approach, because the former is 'branchy' and unthreadable (and also more messy).)
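One common way to avoid locking is to give every worker its own output buffer and concatenate the buffers after all workers have finished. The sketch below assumes plain `std::thread` workers, a static split of the AABB array, and AoS boxes for brevity; `CullParallel`, `IsVisible`, `AABB` and `Plane` are hypothetical names, not part of any engine mentioned here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

struct AABB  { float mins[3]; float maxs[3]; };
struct Plane { float n[3]; float d; }; // a point p is inside if dot(n, p) + d >= 0

// Conservative box-vs-frustum test: for each plane, take the box corner that
// lies farthest along the plane normal; if even that corner is outside,
// the whole box is outside.
inline bool IsVisible(const AABB& box, const Plane* planes, std::size_t numPlanes) {
    for (std::size_t i = 0; i < numPlanes; ++i) {
        float dist = planes[i].d;
        for (int a = 0; a < 3; ++a)
            dist += planes[i].n[a] * (planes[i].n[a] >= 0.0f ? box.maxs[a] : box.mins[a]);
        if (dist < 0.0f) return false;
    }
    return true;
}

// Each worker culls a contiguous slice and appends indices to its own vector,
// so no synchronization is needed until the final single-threaded merge.
std::vector<std::uint32_t> CullParallel(const std::vector<AABB>& boxes,
                                        const std::vector<Plane>& planes,
                                        unsigned numThreads) {
    numThreads = std::max(1u, numThreads);
    std::vector<std::vector<std::uint32_t>> perThread(numThreads);
    std::vector<std::thread> workers;
    const std::size_t slice = (boxes.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(boxes.size(), t * slice);
            const std::size_t end = std::min(boxes.size(), begin + slice);
            for (std::size_t i = begin; i < end; ++i)
                if (IsVisible(boxes[i], planes.data(), planes.size()))
                    perThread[t].push_back(static_cast<std::uint32_t>(i));
        });
    }
    for (auto& w : workers) w.join();

    // Merge: total the counts, reserve once, then copy each buffer in order.
    std::size_t total = 0;
    for (const auto& v : perThread) total += v.size();
    std::vector<std::uint32_t> visible;
    visible.reserve(total);
    for (const auto& v : perThread) visible.insert(visible.end(), v.begin(), v.end());
    return visible;
}
```

Because slices are contiguous and merged in thread order, the output stays sorted by entity index. With a job system instead of raw threads, the same idea applies with one buffer per job.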
Implementing a fast dynamic vertex/index buffer pool
Anfaenger posted a topic in Graphics and GPU Programming
Hi! I'm implementing a destructible voxel terrain. For rendering changeable, dynamic geometry I need to frequently allocate and free dynamic vertex and index buffers. What is the preferred way to organize VBs/IBs for quick allocations and frees? (The usual approach is to organize buffers in several bins, round the size of each chunk up to the next power of two and iterate the linked list in the bin. But it degenerates to linear search if many chunks have almost equal size.) 
Checking if an octree node should be split (refined) or merged (collapsed)
Anfaenger replied to Anfaenger's topic in Graphics and GPU Programming
I've implemented your idea, and it does work! Here is my code in case anybody finds it useful:

int Calculate_Desired_LoD( const CellID& _nodeId, const V3f& _eyePosition )
{
    const int iNodeLoD = _nodeId.GetLOD();
    const int iNodeSize = (1 << iNodeLoD); // size of this node, in LoD0 chunks
    // The bounding box of the node.
    AABBi nodeAABBi;
    nodeAABBi.mins = _nodeId.ToInt3() * iNodeSize;
    nodeAABBi.maxs = nodeAABBi.mins + Int3(iNodeSize);
    // The bounding box of the (smallest) LoD0 chunk containing the view point.
    AABBi eyeAABBi;
    // Min-corner coordinates of the chunk containing the view point, in LoD0 chunks.
    eyeAABBi.mins = Int3(
        Float_Floor( _eyePosition.x / CHUNK_SIZE ),
        Float_Floor( _eyePosition.y / CHUNK_SIZE ),
        Float_Floor( _eyePosition.z / CHUNK_SIZE )
    );
    eyeAABBi.maxs = eyeAABBi.mins + Int3(1);
    const int distance = Calc_Chebyshev_Dist( nodeAABBi, eyeAABBi ); // in LoD0 chunks
    const int distance1 = Max( distance, 1 ); // bsr intrinsic expects arguments > 0
    // LoD ranges increase as powers of two
    DWORD wantedLoD;
    _BitScanReverse( &wantedLoD, distance1 ); // log2(distance)
    return wantedLoD;
}

/// Computes Chebyshev distance between the two integer AABBs.
int Calc_Chebyshev_Dist( const AABBi& _a, const AABBi& _b )
{
    Int3 dists( 0 ); // always non-negative
    for( int axis = 0; axis < NUM_AXES; axis++ )
    {
        const int delta_min = Abs( _a.mins[ axis ] - _b.mins[ axis ] );
        const int delta_max = Abs( _a.maxs[ axis ] - _b.maxs[ axis ] );
        if( _a.mins[ axis ] >= _b.maxs[ axis ] ) {
            dists[ axis ] = _a.mins[ axis ] - _b.maxs[ axis ];
        } else if( _a.maxs[ axis ] <= _b.mins[ axis ] ) {
            dists[ axis ] = _b.mins[ axis ] - _a.maxs[ axis ];
        } else {
            dists[ axis ] = 0;
        }
    }
    return Max3( dists.x, dists.y, dists.z );
}

I use it like this: if( _node.id.GetLoD() > Octree::Calculate_Desired_LoD( _node.id, eyePos ) ) then Split( _node );

It produces square, clipmap-style LoD regions (the third picture; the first uses the 1-norm and the 2nd - Euclidean distance): But this function still cannot be used for neighbor LoD calculations. And it uses branches. I also use the closest distance between the eye position and the node's bounding box (or between bounding boxes). Otherwise, LoD regions may become rectangular.
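The per-axis if/else chain above can also be expressed with `max`, which compilers often turn into conditional moves, and `_BitScanReverse` can be replaced by a portable loop. This is a self-contained sketch of the same computation; the types `Int3` and `AABBi` are simplified stand-ins for the engine's own, and `AxisGap`, `FloorLog2` and `DesiredLoD` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

struct Int3  { int x, y, z; };
struct AABBi { Int3 mins, maxs; }; // integer box, in LoD0 chunks

// Gap between two 1D intervals [aMin, aMax] and [bMin, bMax];
// zero when they touch or overlap. At most one of the differences is positive.
inline int AxisGap(int aMin, int aMax, int bMin, int bMax) {
    return std::max({0, aMin - bMax, bMin - aMax});
}

// Chebyshev (max-norm) distance between two boxes: the largest per-axis gap.
inline int ChebyshevDistance(const AABBi& a, const AABBi& b) {
    return std::max({AxisGap(a.mins.x, a.maxs.x, b.mins.x, b.maxs.x),
                     AxisGap(a.mins.y, a.maxs.y, b.mins.y, b.maxs.y),
                     AxisGap(a.mins.z, a.maxs.z, b.mins.z, b.maxs.z)});
}

// floor(log2(v)) for v > 0: a portable replacement for the bit-scan intrinsic.
inline int FloorLog2(std::uint32_t v) {
    int result = 0;
    while (v >>= 1) ++result;
    return result;
}

// LoD ranges grow as powers of two with the distance from the eye's chunk.
inline int DesiredLoD(const AABBi& node, const AABBi& eye) {
    const int distance = std::max(ChebyshevDistance(node, eye), 1);
    return FloorLog2(static_cast<std::uint32_t>(distance));
}
```

For example, a node whose box starts 3 chunks away from the eye's chunk along x gets LoD 1, and one 8 chunks away gets LoD 3.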
Checking if an octree node should be split (refined) or merged (collapsed)
Anfaenger posted a topic in Graphics and GPU Programming
Hi there! In my octree-based voxel terrain engine, to prevent cracks, each octree node (chunk) must know the LoDs/sizes of its neighbors. Currently, this info is stored as 18-bit adjacency masks (for 6 face- and 12 edge-adjacent nodes), with costly neighbor finding to update those masks for each octree node (like four hundred nodes translating into two thousand checks!). I'd like to avoid maintaining adjacency masks and instead use an algorithmic approach. I need a fast function for checking if an octree node at the given position and with the given LoD/size exists in the octree, without walking the entire hierarchy. The function should preferably be integer-based (i.e. not use floating-point) and produce a square, clipmap-style node arrangement. Here's the function I use for updating the octree; it ensures 2:1 LOD transitions, but sometimes creates rectangular 'rings' instead of cubic ones:

/// returns true if the given node requires splitting
bool Node_Should_Be_Split( const OctreeNode& _node, const OctreeWorld::Observer& _eye )
{
    const UINT nodeLoD = _node.id.GetLOD();
    const UINT nodeSize = (1u << nodeLoD);
    const UINT nodeRadius = nodeSize / 2u;
    const UInt3 nodeCenterCoords = (_node.id.ToUInt3() * nodeSize) + UInt3(nodeRadius);
    const UINT distance = Chebyshev_Distance( Int3::FromXYZ(_eye.coordsInChunksGrid), Int3::FromXYZ(nodeCenterCoords) );
    // Chebyshev distance is the 'smallest' norm: max(dx, dy, dz) <= distance(dx,dy,dz) <= dx + dy + dz.
    return distance < nodeSize * LOD_RANGE + nodeRadius; // this node is far away
}

/// Chebyshev distance (aka "[Tchebychev|Chessboard|Center] Distance" or "maximum [metric|norm]"):
/// the maximum of the absolute rank- and file-distance of both squares.
template< typename TYPE >
TYPE Chebyshev_Distance( const Tuple3< TYPE >& _a, const Tuple3< TYPE >& _b )
{
    return Max3( Abs(_a.x - _b.x), Abs(_a.y - _b.y), Abs(_a.z - _b.z) );
}
Pixel-sized gaps at LOD transitions after compressing vertices
Anfaenger posted a topic in Graphics and GPU Programming
Hi there! I'm writing a voxel terrain engine which uses geomorphing/CLOD to render a seamless mesh which is composed of chunks with different LoDs (sizes and resolutions). Each vertex stores its attributes both at the current and the next, twice coarser LoD; the correct attributes are selected in the vertex shader based on the size of the chunk's neighbors. To reduce the size of a vertex to 32 bytes, I'm compressing vertex positions into 32-bit integers: {x:11,y:11,z:10}. But this memory 'optimization' results in tiny gaps (dropped or shimmering pixels) at LOD transitions, like these: To avoid these gaps, boundary vertex positions must precisely match between adjacent chunks. I tried to quantize fine positions to 10 bits and coarse positions to 5 bits (so that the 'quantization level/precision loss' would match between adjacent chunks whose resolutions differ by 2), but that didn't work. Should I store vertex positions at full precision? Maybe there is a simple and memory-efficient solution based on snapping positions to some global grid?
Voxel Terrain - Updating LOD hierarchy after modifications
Anfaenger posted a topic in Graphics and GPU Programming
Hi! I'd like to be able to edit a procedurally generated voxel terrain. The terrain is represented by an octree which subdivides itself depending on the viewer's position. LODs are generated on-demand, progressively, from top to bottom. Each leaf node contains a mesh representing a part of the terrain surface at this node's LOD. Cracks are handled via geomorphing. I can fly over a large (~20 km) and boring terrain without any cracks/T-junctions (except for pixel-sized gaps far from the camera). Here is a picture from a week ago (gyroid, each LOD is 4x4x4 cells):
1) How to regenerate the LOD hierarchy after some parts of the terrain have been modified? I want to be able to edit the floor right in front of the player (LOD0) as well as distant mountains (e.g., LOD6). In the former case, LODs can be rebuilt lazily, via 'dirty flags'. But for faraway mountains, LODs should be updated quickly, because they are currently being used for rendering. So, when recalculating LODs of 'dirty' nodes, I need to check if their parent is being used, and so on, all the way up the tree? Or maintain a queue with IDs of changed nodes, and after processing all nodes of LOD 'N' push onto the queue the IDs of their parents (LOD 'N+1')?
There are other complications, e.g. ensuring that old border vertices are kept untouched during simplification so that the simplified, 'bottom-up' mesh connects seamlessly with procedurally-generated, 'top-down' parts. Actually, I have no idea how all of it should be implemented together. Could you please refer me to papers/code, or give me some ideas?
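The queue idea from the question (process all dirty nodes of LOD N, then their parents at LOD N+1) can be sketched with an ordered set that doubles as a priority queue and de-duplicates shared parents. This is only an illustration of the ordering, not of remeshing or border handling. `NodeID`, `ParentOf` and `BuildRebuildOrder` are hypothetical names, and node coordinates are assumed to be non-negative and expressed in units of the node's own size.

```cpp
#include <set>
#include <tuple>
#include <vector>

// Hypothetical node address: LOD level plus integer coordinates at that level.
struct NodeID {
    int lod;
    int x, y, z;
    // Ordering by LOD first makes std::set yield the finest dirty nodes first.
    bool operator<(const NodeID& o) const {
        return std::tie(lod, x, y, z) < std::tie(o.lod, o.x, o.y, o.z);
    }
};

// The parent covers a 2x2x2 block of children (assumes coordinates >= 0).
inline NodeID ParentOf(const NodeID& n) {
    return NodeID{n.lod + 1, n.x >> 1, n.y >> 1, n.z >> 1};
}

// Returns the order in which nodes should be rebuilt after an edit:
// every dirty node of LOD N comes before any node of LOD N+1,
// and a parent shared by several dirty children appears only once.
std::vector<NodeID> BuildRebuildOrder(const std::vector<NodeID>& edited, int maxLod) {
    std::set<NodeID> pending(edited.begin(), edited.end());
    std::vector<NodeID> order;
    while (!pending.empty()) {
        const NodeID node = *pending.begin();
        pending.erase(pending.begin());
        order.push_back(node);
        if (node.lod < maxLod) pending.insert(ParentOf(node));
    }
    return order;
}
```

To prioritize LODs that are currently being rendered, the resulting list could be partitioned so that visible nodes are rebuilt immediately and the rest lazily.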
Dual contouring implementation on GPU
Anfaenger replied to BlackJoker's topic in Graphics and GPU Programming
IIRC, Ronen Tzur's (Uniform) Dual Contouring sample contains some well-commented numerical code. https://www.sandboxie.com/misc/isosurf/isosurfaces.html For the underlying theory, I can recommend the excellent book "Matrix Analysis & Applied Linear Algebra" [2000] by Carl D. Meyer, apart from classics such as "Matrix Computations" and "Numerical Recipes in C". (I'm in no way a math geek, but the first book explains the theory very well.)
Dual contouring implementation on GPU
Anfaenger replied to BlackJoker's topic in Graphics and GPU Programming
I use the standard approach: find the vertex whose position minimizes the squared distance to the tangent planes, i.e. planes built from intersection points (of the cell's edges with the surface) and the surface normals at those points. If the minimizer lies outside the cell, I clamp it to the cell's bounds or place it into the mass point (average) of all intersection points (which causes unsightly dimples and notches in the reconstructed surface, e.g. see the MergeSharp and SHREC papers). DC/DMC are far from perfect - a sharp cone-like feature gets clamped if it passes through a face but doesn't intersect any of the cell's edges. You can try to reconstruct a rotated cube, with one of its corners pointing up, and most likely that corner will get 'chamfered'.
In my framework, I use different 'QEF' solvers (using a common 'virtual' interface):
1) QEF_Solver_Bloxel - simply places the vertex in the cell's center (for blocky, Minecraft-style / Bloxel / Boxel / Cuberille worlds).
2) QEF_Solver_Simple - simply places the vertex into the mass point (i.e. averages all intersections - smooth, SurfaceNets style).
3) QEF_Solver_ParticleBased - uses Leonardo Augusto Schmitz's easy-to-understand method with exact normals at intersection points to reduce complexity; can reconstruct smooth surfaces with sharpish features: http://gamedev.stackexchange.com/a/83757, "Efficient and High Quality Contouring of Isosurfaces on Uniform Grids, 2009, Particle-based minimizer function". I think some Google Summer of Code Java impl used this method with ADC.
4) QEF_Solver_Direct_AtA - tries to solve the Least Squares problem using normal equations: A^T*A*x = A^T*b; simple and fast, but unstable. For an example you can see that Swedish guy's impl on GitHub, Emil (forgot his last name), his blog is called "I love tits".
5) QEF_Solver_QR - Least Squares Solver from the original authors (Scott Schaefer et al., 2011).
6) QEF_Solver_SVD_Eigen - uses the (excellent) Eigen math lib to solve LLS, e.g. as here: https://github.com/mkeeter/ao/blob/master/kernel/src/render/octree.cpp
7) QEF_Solver_SVD2 - based on the code by Nick Gildea: https://github.com/nickgildea/qef
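To make the normal-equations variant (solver 4 in the list above) concrete, here is a minimal sketch, not the code of any of the linked projects. It accumulates A^T*A and A^T*b with positions taken relative to the mass point, solves the 3x3 system with Cramer's rule, and falls back to the mass point when the system is (nearly) singular, e.g. on flat regions. `SolveQEF`, `HermiteSample` and the `eps` threshold are assumptions for illustration, and clamping the result to the cell is left out.

```cpp
#include <array>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };
struct HermiteSample { Vec3 p; Vec3 n; }; // edge intersection point and unit normal

using Mat3 = std::array<std::array<double, 3>, 3>;

inline double Det3(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Minimizes sum_i (n_i . (x - p_i))^2 via the normal equations A^T A x = A^T b.
Vec3 SolveQEF(const std::vector<HermiteSample>& samples, double eps = 1e-6) {
    Vec3 c{0.0, 0.0, 0.0}; // mass point: average of all intersection points
    if (samples.empty()) return c;
    for (const auto& s : samples) { c.x += s.p.x; c.y += s.p.y; c.z += s.p.z; }
    const double inv = 1.0 / static_cast<double>(samples.size());
    c.x *= inv; c.y *= inv; c.z *= inv;

    Mat3 ata{};                    // A^T A, zero-initialized
    std::array<double, 3> atb{};   // A^T b, with b_i = n_i . (p_i - c)
    for (const auto& s : samples) {
        const double n[3] = {s.n.x, s.n.y, s.n.z};
        const double d = n[0] * (s.p.x - c.x) + n[1] * (s.p.y - c.y) + n[2] * (s.p.z - c.z);
        for (int i = 0; i < 3; ++i) {
            atb[i] += n[i] * d;
            for (int j = 0; j < 3; ++j) ata[i][j] += n[i] * n[j];
        }
    }

    const double det = Det3(ata);
    if (std::fabs(det) < eps) return c; // rank-deficient: keep the mass point

    double r[3];
    for (int k = 0; k < 3; ++k) {       // Cramer's rule, one column at a time
        Mat3 m = ata;
        for (int i = 0; i < 3; ++i) m[i][k] = atb[i];
        r[k] = Det3(m) / det;
    }
    return Vec3{c.x + r[0], c.y + r[1], c.z + r[2]};
}
```

Three mutually orthogonal planes x = 1, y = 2, z = 3 give the corner (1, 2, 3), while samples that all share one normal leave the vertex at the mass point. The instability mentioned above shows up when the determinant is small but not below `eps`, which is where the QR and SVD solvers are more robust.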
How to pass LOD blend factors to vertex shader for CLOD in a voxel terrain?
Anfaenger posted a topic in Graphics and GPU Programming
[!EDIT!: I should've written this question as "How to ensure matching LOD blend factors on chunk faces in a voxel terrain?"] Hi! I'm trying to adapt "Continuous Distance-Dependent Level of Detail for Rendering Heightmaps (CDLOD)" by Filip Strugar to voxel terrain. In the paper above, geomorphing is used to solve two problems: 1) Eliminating cracks between chunks of different LODs; 2) Achieving smooth, continuous LOD transitions. For this to work, LOD morph/blend factors of vertices on "touching" boundary faces from adjacent chunks must coincide. E.g., I know that some particular octree node is touching a twice bigger node on its PosX face, so for this smaller node all vertices on the PosX face must have their LOD blend factors set to 1.0f (i.e. fully morph to the parent vertex). At the same time, to prevent cracks, NegX vertices of the bigger node must have LOD blend factors equal to zero. How to ensure in the vertex shader that blend factors of touching boundary vertices coincide between chunks? In the vertex shader, must dynamically computed LOD blend factors be set to their corresponding BoundaryFace_LOD_Factor? Do I need to associate each boundary vertex with the cube face on which the vertex lies?
