About dr4cula

  1. dr4cula

    D3D12 Best Practices

      You will still need a per-frame constant buffer resource; you just won't need a per-frame entry in a descriptor table for that CBV. The only way to avoid needing a constant buffer resource altogether is to store constants directly in the root signature, but root signature memory is very limited, so you won't be able to store everything there.

    Thanks for your quick reply! I was a bit confused by the wording from MS here: "By putting something in the root signature, the application is merely handing the versioning responsibility to the driver, but this is infrastructure that they already have."
  2. dr4cula

    D3D12 Best Practices

    Thought I shouldn't start a new topic as this is a fairly generic DX12 question: if I put my CBV in the root signature, does the driver version the data automatically, i.e. would I no longer need a per-frame constant buffer resource?

    Thanks!
  3. dr4cula

    D3D12 Best Practices

    So I've been doing further testing with the samples, and when I compare the fullscreen modes in those applications to my DX11 app, the difference is considerable:

    Avg FPS DX11 (custom app): 8575
    Avg FPS DX12 (Bundles app): 4518

    Both versions use the same presentation model (FLIP_DISCARD), so I'm not sure where the difference comes from. I've only enabled the render target clearing and present operations; everything else is commented out in the apps.

    Furthermore, following the idea of per-frame CBs: if I render to texture, should I also queue those render targets on a per-frame basis, similar to the constant buffers? (nvm: the Multithreading sample provided the answer that yes, this is indeed the case)

    Thanks in advance!
  4. dr4cula

    D3D12 Best Practices

    Thank you both for answering!

    I'm still a bit confused about the synchronization though - could you please explain it in more detail?

    I've implemented the following NextFrame() sync function that is called immediately after Present():

    bufIndex_ = swapChain_->GetCurrentBackBufferIndex();

    frameFences_[currFrameIndex_] = fenceValue_;
    THROW_FAILED(directCommandQueue_->Signal(fence_.Get(), fenceValue_));
    ++fenceValue_;

    // advance to the next frame
    currFrameIndex_ = (currFrameIndex_ + 1) % bufferCount_;
    UINT64 lastCompletedFence = fence_->GetCompletedValue();

    if ((frameFences_[currFrameIndex_] != 0) && (frameFences_[currFrameIndex_] > lastCompletedFence))
    {
        THROW_FAILED(fence_->SetEventOnCompletion(frameFences_[currFrameIndex_], fenceEvent_));
        WaitForSingleObject(fenceEvent_, INFINITE);
    }

    By increasing bufferCount_ to 3 I can achieve 120 FPS, and with 4 I can achieve 180 FPS, but anything higher than that doesn't gain me anything further. Also, I'm not sure how to interpret the FPS tracing I've set up: if I count the number of times my Render() loop is reached in one second, I get 180 FPS, but when I query the actual time passed, I get much higher values for some frames. See below, where the first number is the FPS based on the actual time spent in the Render() call and the one in brackets is the number of times Render() got called per second. I assume the 2nd is the actual FPS, but I'm not quite sure how to interpret the first one - does it even mean anything? (Similar output can be produced in the samples by outputting 1.0f / m_timer.GetElapsedSeconds() instead of GetFramesPerSecond().)
FPS: 65 (182)
FPS: 1663 (182)
FPS: 1693 (182)
FPS: 65 (182)
FPS: 1294 (182)
FPS: 2066 (182)
FPS: 63 (182)
FPS: 1058 (182)
FPS: 1739 (182)
FPS: 64 (182)
FPS: 1741 (182)
FPS: 2245 (182)
FPS: 65 (182)

So, I still have no idea how my D3D12 app compares against my D3D11 app: for my D3D11 app, I use the 1.0f / GetElapsedSeconds() method for acquiring the max framerate, but what is the equivalent of that in D3D12?

Also, I decided to have a look at the GPU execution times with the VS2015 debugger (Debug -> Start Diagnostic Tools Without Debugging), but all of the Event Names are Unknown, so I can't really tell which API calls are which. Is this feature not yet supported for D3D12?

Thanks in advance once more!
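The fence bookkeeping in the NextFrame() snippet can be checked in isolation by modeling just the CPU-side counters. Below is a minimal C++ sketch (FrameRing and EndFrame are illustrative names of mine, not D3D12 API); it returns the fence value the CPU would have to wait on before reusing the next frame slot's resources:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side model of the per-frame fence ring from the NextFrame() snippet:
// each frame records the fence value signaled at its end; before reusing a
// frame slot, the CPU must wait until the GPU has passed that value.
struct FrameRing {
    explicit FrameRing(std::size_t bufferCount)
        : frameFences(bufferCount, 0), fenceValue(1), currFrame(0) {}

    // Ends the current frame and advances the ring. Returns the fence value
    // that must be completed before the next slot's resources can be reused
    // (0 means the slot has never been used, i.e. no wait is needed).
    std::uint64_t EndFrame() {
        frameFences[currFrame] = fenceValue;              // Signal() equivalent
        ++fenceValue;
        currFrame = (currFrame + 1) % frameFences.size(); // advance the ring
        return frameFences[currFrame];                    // wait target
    }

    std::vector<std::uint64_t> frameFences;
    std::uint64_t fenceValue;
    std::size_t currFrame;
};
```

With three slots, the first two frames require no wait and the third must wait on fence value 1: the CPU only stalls once it is a full ring ahead of the GPU, which is why raising bufferCount_ improves the framerate until some other limit takes over.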
  5. Hi all,

     I've been playing around with D3D12, and while going through the samples I've run into a couple of questions that I've yet to find an answer for:

     1) How many descriptor heaps should an application have? Is it vital for an app to have one SRV/RTV/DSV heap, or is it fine to have several smaller heaps for specific rendering tasks (I'm specifically wondering whether cmdList->SetDescriptorHeaps() can cause any cache coherency issues)? I remember reading somewhere that an app should have only one heap of each type, but I can't remember where I saw it, so my memory might just be letting me down at this point.

     2) How should constant buffers be handled? Throughout the samples I found that the applications often created several constant buffers based on the exact same structure for different draw calls: e.g. instead of calling map() on application init and then memcpy() to load per-draw-call data into one constant buffer, the apps seemed to create n constant buffers and use descriptor tables to reference the correct resource. Is that the way it should be done, or have I misunderstood something (e.g. see the Bundles example)?

     3) More generally, how should frame resources be handled? This follows from the fact that the apps seem to create n times the number of resources used per frame: e.g. with double-buffered rendering, the constant buffer descriptor heap size is given as 2 * numCBsPerFrame (where numCBsPerFrame is an array of CBs for different draw calls); command lists seem to be allocated in a similar manner. What is the reason for doing this? I think it has something to do with GPU-CPU synchronization, i.e. preventing read/write clashes, but I'm not sure.

     4) What would be the suggested synchronization method? I'm currently using the one provided in the HelloWorld samples, i.e. waiting for the GPU to finish before continuing to the next frame.
This clearly isn't the way to go, as my fullscreen D3D11 app runs at ~6k FPS whereas my D3D12 app runs at ~3k FPS. Furthermore, how would one achieve max framerate in windowed mode? I've seen this video, but I don't really follow the logic: taking the last option, wouldn't rendering something just for the sake of rendering cause possible stalls? Is the swapchain's GetFrameLatencyWaitableObject() useful here?

Thanks in advance!
  6. dr4cula

    Bezier Teapot Tessellation

      Ah! I see - yes, when I was looking at the model in wireframe, the top part did look triangular, but I didn't associate that with the tessellator using the quad domain.

      Your "solution" worked out great:

      static const float epsilon = 1e-5f;
      float u = min(max(coordinates.x, epsilon), 1.0f - epsilon);
      float v = min(max(coordinates.y, epsilon), 1.0f - epsilon);

      Thanks a lot! :)
  7. Hi,

     I've been messing around with the tessellation pipeline and Bezier surfaces, and I seem to have run into a strange artefact that I just can't figure out: it only seems to appear for the teapot model (from here), but surely this widely used model isn't broken? Here's what happens when I'm tessellating it. My method is similar to the one presented here (slides 23-25), except it seems that the Bezier evaluation function given on those slides sums the u and v basis functions backwards, hence my version evaluates the control points as follows:

     int ptIndex = 0;
     for (int i = 0; i < 4; ++i)
     {
         for (int j = 0; j < 4; ++j)
         {
             position += patch[ptIndex].position * bu[i] * bv[j];
             du += patch[ptIndex].position * dbu[i] * bv[j];
             dv += patch[ptIndex].position * bu[i] * dbv[j];
             ++ptIndex;
         }
     }

     Here bu/bv and dbu/dbv are the Bernstein polynomials and derivatives of the Bernstein polynomials for the tessellated uv parameter coordinates. If I swap the polynomial multiplication order (e.g. for the position use bv[i] * bu[j]), then I get the exact same inside-out model as with the copy-pasted code from the given slides (the artefact is still there; I just need to move the camera inside the model to see it). After debugging this for the entire day, I'm beginning to think it might be a problem with the model, but like I said, that seems unlikely considering the model's popularity. Does anyone have any experience with Bezier teapot tessellation and could chip in? I've tried the other models from Newell's tea set, i.e. the spoon and the cup, and neither seemed to have any similar artefacts. If anyone could recommend any (advanced) test Bezier models, I'd be grateful!

     Thanks in advance!
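For reference, the evaluation loop above can be mirrored on the CPU to sanity-check the basis functions independently of the teapot data. A minimal C++ sketch (Vec3, Bernstein, and EvalPatch are illustrative names of mine): a patch whose control points lie on a flat grid should reproduce a corner at (u, v) = (1, 1) and the grid center at (0.5, 0.5).

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Cubic Bernstein basis B_i(t), i = 0..3; the four weights sum to 1.
static std::array<float, 4> Bernstein(float t) {
    float s = 1.0f - t;
    return { s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t };
}

// Evaluate a bicubic Bezier patch at (u, v); the control points are laid out
// row-major in the same order as the loop in the post (i indexes u, j indexes v).
static Vec3 EvalPatch(const std::array<Vec3, 16>& patch, float u, float v) {
    std::array<float, 4> bu = Bernstein(u);
    std::array<float, 4> bv = Bernstein(v);
    Vec3 p{0.0f, 0.0f, 0.0f};
    int ptIndex = 0;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float w = bu[i] * bv[j];
            p.x += patch[ptIndex].x * w;
            p.y += patch[ptIndex].y * w;
            p.z += patch[ptIndex].z * w;
            ++ptIndex;
        }
    }
    return p;
}
```

The row-major layout (i over u, j over v) has to match the order in which the 16 control points were loaded; a mismatch effectively transposes the patch, which is one way to end up with inside-out surfaces.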
  8. dr4cula

    Heightfield Normals

        That's what I was afraid of. I remember encountering something similar when I was messing about with terrain generation.

        Thanks for the link! Scouring various papers on simulation methods, I ran into a similar technique being suggested. However, upon implementing it, I really couldn't tell much of a difference. I also tried writing the normals to a texture and sampling that, but again, no visible difference. I guess I should look into getting the reflection/refraction working and then see how apparent the issue is.

        Thanks for your quick replies!
  9. Hi,

     I've implemented a shallow water equation solver and everything seems to be working OK, except for the normals. Here's what I mean: http://postimg.org/image/d3kah7ow5/

     Here's how I'm calculating the normals in the vertex shader:

     float3 tangent = float3(1.0f, (r - l) / (2.0f * dx), 0.0f);
     float3 bitangent = float3(0.0f, (t - b) / (2.0f * dx), 1.0f);
     float3 normal = cross(bitangent, tangent);
     normal = normalize(normal);

     r/l/t/b are the left/right/top/bottom neighbours. I can fake the normal pointing upwards by scaling its xz components, but that doesn't seem like the best way of doing it. Surely there's a better way?

     Thanks in advance!
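The same central-difference construction can be checked on the CPU. A small C++ sketch (the helper names are mine): for a flat heightfield it yields (0, 1, 0), and for a field rising along x the normal tilts against the slope.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 Cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

static Vec3 Normalize(const Vec3& v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Heightfield normal from central differences of the four neighbour
// heights, matching the vertex-shader snippet above (y is up).
Vec3 HeightfieldNormal(float l, float r, float b, float t, float dx) {
    Vec3 tangent   { 1.0f, (r - l) / (2.0f * dx), 0.0f };
    Vec3 bitangent { 0.0f, (t - b) / (2.0f * dx), 1.0f };
    return Normalize(Cross(bitangent, tangent));
}
```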
  10. dr4cula

    Shallow Water

      Height is stored only in the first component of a 4-component render target, i.e. currently the 3 other components of the rendertarget/texture go unused. I've not been able to make any progress on this on my own...
  11. dr4cula

    Shallow Water

    Hi,

    I've been experimenting with the shallow water equations, but I can't seem to get my implementation correct. I'm following this, except I'm doing everything on the GPU. I'm not sure where I keep going wrong: see here. From my experimentation with the full Navier-Stokes equations, this makes some sense: I remember getting visually similar results (in 2D) where a circle forms square-like corners (I plotted a circle with velocity (1,1) at every pixel), but that only happened when I stopped the simulation after the advection step ("skipped" projection). Not sure what is happening here. I've tried changing the signs when sampling data as well as switching the order of operations around, but nothing seems to work. At one point I ended up with this, which is obviously not correct.

    Here are my simulation kernels (I won't post my advection kernel as it is the same one I used for my full NS solver; also note that I'm using a staggered grid whereby a single pixel represents the left-bottom pair of velocities in the velocity kernels; boundaries are set appropriately to account for the array size differences):

    UpdateHeight kernel:

    float4 PSMain(PSInput input) : SV_TARGET
    {
        float2 texCoord = input.position.xy * recTexDimensions.xy;

        float vL = velocity.Sample(pointSampler, texCoord).x;
        float vR = velocity.Sample(pointSampler, texCoord + float2(recTexDimensions.x, 0.0f)).x;
        float vT = velocity.Sample(pointSampler, texCoord + float2(0.0f, recTexDimensions.y)).y;
        float vB = velocity.Sample(pointSampler, texCoord).y;

        float h = height.Sample(pointSampler, texCoord).x;

        float newH = h - h * ((vR - vL) * recTexDimensions.x + (vT - vB) * recTexDimensions.y) * dt;

        return float4(newH, 0.0f, 0.0f, 0.0f);
    }

    UpdateU kernel:

    float4 PSMain(PSInput input) : SV_TARGET
    {
        float2 texCoord = input.position.xy * recTexDimensions.xy;

        float u = velocity.Sample(pointSampler, texCoord).x;

        float hL = height.Sample(pointSampler, texCoord - float2(recTexDimensions.x, 0.0f)).x;
        float hR = height.Sample(pointSampler, texCoord).x;

        float uNew = u + g * (hL - hR) * recTexDimensions.x * dt;

        return float4(uNew, 0.0f, 0.0f, 0.0f);
    }

    UpdateV kernel:

    float4 PSMain(PSInput input) : SV_TARGET
    {
        float2 texCoord = input.position.xy * recTexDimensions.xy;

        float v = velocity.Sample(pointSampler, texCoord).y;

        float hB = height.Sample(pointSampler, texCoord - float2(0.0f, recTexDimensions.y)).x;
        float hT = height.Sample(pointSampler, texCoord).x;

        float vNew = v + g * (hB - hT) * recTexDimensions.y * dt;

        return float4(0.0f, vNew, 0.0f, 0.0f);
    }

    I've literally spent the entire day debugging this, and I've got no idea why nothing seems to work... Hopefully some of you have implemented this before and can help me out.

    Thanks in advance!
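One hedged observation on the UpdateHeight kernel: the continuity equation uses the divergence (vR - vL)/dx + (vT - vB)/dy, i.e. division by the grid spacing, whereas the kernel multiplies the velocity differences by recTexDimensions (which is 1/texSize); depending on what the grid spacing is meant to be, that may scale the divergence by the wrong factor, so it's worth double-checking. The update itself, written out on the CPU with explicit spacings (UpdateHeight here is my own standalone function, not the shader):

```cpp
#include <cassert>
#include <cmath>

// One explicit-Euler step of the height part of the shallow water
// equations:  dh/dt = -h * div(v),  with
// div(v) = (vR - vL)/dx + (vT - vB)/dy  on a staggered grid.
float UpdateHeight(float h, float vL, float vR, float vB, float vT,
                   float dx, float dy, float dt) {
    float divergence = (vR - vL) / dx + (vT - vB) / dy;
    return h - h * divergence * dt;
}
```

With zero divergence the height is unchanged, and a positive divergence (outflow) lowers it, which is a cheap sanity check before debugging the GPU version.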
  12. Thanks for your reply! Indeed, I found a couple of implementations online using this 0.5f constant, but I wasn't sure why it was needed. However, adding the -0.5f term as you've said still doesn't make the function produce the same output as the sampler operation. Here's the result I now get.

      Also, how would I renormalize the weights based on the values of the texels involved in the interpolation? If I have a texture where texels set to 0 are unknown and any other value means that the texel's value is known (i.e. should be used for interpolation), then how exactly do I bias the interpolation towards the texels with known values? I hope this makes sense...

      Thanks again!

      EDIT: FYI, without the 0.5f bias I get this result.
      EDIT2: I also tested this in a sample program where the values of a texture are interpolated into another texture that is 2x smaller, and this was the result. The custom bilinear interpolation function seems to have a bias towards the top-left corner (the circle has shifted from the center of the texture towards the top-left corner).
  13. Hi,

      I need to change the default bilinear interpolation functionality to only use known values, by biasing the weights based on the contents of the 4 pixels: e.g. if one of the four pixels is (0.0, 0.0, 0.0, 0.0), then bias the lerp() towards the nonzero neighbouring pixels. At least that's what I've understood I need to do for extrapolating velocities (stored in textures) for my fluid simulation, direct quote here:

      I've written a bilinear interpolation function based on what I've found on the web:

      static const float2 recTexDim = float2(1.0f / 48.0f, 1.0f / 48.0f); // as a test case, the source image is 48x48

      float4 bilerp(float2 texCoord)
      {
          float2 t = frac(texCoord / recTexDim);

          float4 tl = source.Sample(pointSampler, texCoord);
          float4 tr = source.Sample(pointSampler, texCoord + float2(recTexDim.x, 0.0f));
          float4 bl = source.Sample(pointSampler, texCoord + float2(0.0f, recTexDim.y));
          float4 br = source.Sample(pointSampler, texCoord + float2(recTexDim.x, recTexDim.y));

          return lerp(lerp(tl, tr, t.x), lerp(bl, br, t.x), t.y);
      }

      float4 PSMain(PSInput input) : SV_TARGET
      {
          float4 custom = bilerp(input.texCoord);
          float4 builtIn = source.Sample(linearSampler, input.texCoord);

          return abs(custom - builtIn);
      }

      If my code were correct, the resulting image should be all black (i.e. no difference between the interpolation functions). However, I get the following result: http://postimg.org/image/3walbutj1/

      I'm not sure where I'm going wrong here. Also, have I even understood the "renormalizing interpolation weights" bit correctly?

      Thanks in advance!
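Regarding the "renormalizing interpolation weights" part: one common reading (an assumption on my part, not taken from the quoted source) is to drop the weights of unknown texels and rescale the remaining weights so they still sum to 1. A CPU sketch for a single channel, with an exact zero encoding "unknown" (BilerpKnown is an illustrative name):

```cpp
#include <cassert>
#include <cmath>

// Bilinear interpolation that ignores "unknown" texels (encoded as 0)
// and renormalizes the remaining bilinear weights. (tx, ty) are the
// fractional coordinates inside the 2x2 texel quad.
float BilerpKnown(float tl, float tr, float bl, float br, float tx, float ty) {
    float w[4] = { (1.0f - tx) * (1.0f - ty), tx * (1.0f - ty),
                   (1.0f - tx) * ty,          tx * ty };
    float v[4] = { tl, tr, bl, br };
    float sum = 0.0f, wsum = 0.0f;
    for (int i = 0; i < 4; ++i) {
        if (v[i] != 0.0f) {   // treat exactly-zero texels as unknown
            sum  += w[i] * v[i];
            wsum += w[i];
        }
    }
    return (wsum > 0.0f) ? sum / wsum : 0.0f;
}
```

With one unknown corner and (tx, ty) = (0.5, 0.5), the three known texels of value 4 each contribute with weight 0.25, and renormalizing by 0.75 returns 4 instead of the 3 that a plain bilerp would give, so the unknown texel no longer drags the result towards zero.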
  14. Hi,

      I've been trying to convert my simple 2D fluid solver into a liquid solver with a free surface. For that I've introduced the level set method into the application. However, I seem to be losing mass/volume at a ridiculous speed. Here are two videos showing exactly what I mean: I switch between rendering the advected color vector and the level set where phi < 0 (i.e. the inside of the liquid, rendered as black).

      https://www.dropbox.com/s/qz7ujya1oyommls/levelset0.mp4?dl=0
      https://www.dropbox.com/s/g2pzl121sp9td6g/levelset1.mp4?dl=0

      From what I've read in the papers, the problem is that after advection phi no longer holds the signed distance and needs to be reinitialized. However, I've got no idea how one would do that. Some papers mention the fast marching method, but from what I can tell it isn't well suited to GPUs (my solver is completely GPU-based). Some other papers mention Eikonal solvers in their references, but I literally have no idea how to proceed.

      Any help would be greatly appreciated; bonus points if anyone can link to a tutorial/instructional text that isn't a high-level implementation paper glossing over the details.

      Here's how I've defined the signed distance initially:

      float2 p = input.position.xy - pointCenter;
      float s = length(p) - radius;
      return float4(s, s, s, 1.0f);

      Thanks in advance!
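For what it's worth, the circle initialization above is a true signed distance field, which means |grad phi| = 1 everywhere away from the center; reinitialization schemes (fast marching, or iterative PDE-based reinitialization) try to restore exactly this eikonal property after advection distorts it. A small C++ check (function names are mine) that verifies the property numerically:

```cpp
#include <cassert>
#include <cmath>

// Signed distance to a circle, matching the initialization in the post:
// negative inside, zero on the boundary, positive outside.
float CircleSdf(float px, float py, float cx, float cy, float r) {
    float dx = px - cx, dy = py - cy;
    return std::sqrt(dx * dx + dy * dy) - r;
}

// Numerical |grad phi| via central differences with step h; for a true
// signed distance field this is ~1 everywhere away from the center,
// which is the eikonal property reinitialization tries to restore.
float GradMag(float px, float py, float cx, float cy, float r, float h) {
    float gx = (CircleSdf(px + h, py, cx, cy, r) -
                CircleSdf(px - h, py, cx, cy, r)) / (2.0f * h);
    float gy = (CircleSdf(px, py + h, cx, cy, r) -
                CircleSdf(px, py - h, cx, cy, r)) / (2.0f * h);
    return std::sqrt(gx * gx + gy * gy);
}
```

After advection the field phi generally still has the correct zero isocontour but |grad phi| drifts away from 1, which is what makes the interface appear to gain or lose volume.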
  15. Thanks for your reply! Yes, I know how to find the direction vector when I've got access to both the back and front texture coordinate data. The GPU Gems text refers to only a single rayData texture, combining the data like I have in the original post and then providing the camera's position in texture space in a constant buffer. J. Zink's blog (http://www.gamedev.net/blog/411/entry-2255423-volume-rendering/) seems to use the same method: at least he describes using the camera's position in texture space.

      For now, I've implemented the 2-rendertarget/texture version and I seem to be getting decent results. However, it seems that for different data sets I need to scale the alpha component differently (see the code below):

      float4 rayEnd = rayDataBack.Sample(linearSampler, input.texCoord);
      float4 rayStart = rayDataFront.Sample(linearSampler, input.texCoord);

      float3 rayDir = rayEnd.xyz - rayStart.xyz;
      float dist = length(rayDir);
      rayDir = normalize(rayDir);

      int steps = max(max(volumeDimensions.x, volumeDimensions.y), volumeDimensions.z);
      float stepSize = dist / steps;

      float3 pt = rayStart.xyz;

      float4 color = float4(0.0f, 0.0f, 0.0f, 0.0f);
      for (int i = 0; i < steps; ++i)
      {
          float3 texCoord = pt;
          texCoord.y = 1.0f - pt.y;
          float vData = volumeData.SampleLevel(linearSampler, texCoord, 0).r;

          float4 src = float4(vData, vData, vData, vData);
          src.a *= 0.01f; // <- this scalar needs to be different for different data sets

          color.rgb += (1.0f - color.a) * src.a * src.rgb;
          color.a += src.a * (1.0f - color.a);

          if (color.a >= 0.95f)
          {
              break;
          }

          pt += stepSize * rayDir;

          if ((pt.x > 1.0f) || (pt.y > 1.0f) || (pt.z > 1.0f))
          {
              break;
          }
      }

      return color;

      For example:
      teapot (scalar = 0.01f): http://postimg.org/image/668oi4imz/
      foot (scalar = 0.05f): http://postimg.org/image/tjgggjmy5/

      Is it normal to tweak the rendering for specific data sets, or should one value fit all?

      Thanks.
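On the data-set-dependent scalar: with front-to-back compositing, n samples of constant per-sample opacity a accumulate to 1 - (1 - a)^n, so the "right" scalar depends on the step count and on the density distribution of the data. A common technique (opacity correction, named here by me, not taken from the post) rescales the per-sample alpha as a' = 1 - pow(1 - a, stepSize / referenceStep) so that changing the sampling rate doesn't change the overall opacity. The accumulation itself, extracted into C++:

```cpp
#include <cassert>
#include <cmath>

// Front-to-back alpha accumulation, as in the ray-marching loop above:
// alpha += a * (1 - alpha) per sample. After n samples of per-sample
// opacity a, the accumulated alpha is 1 - (1 - a)^n.
float AccumulateAlpha(float a, int n) {
    float alpha = 0.0f;
    for (int i = 0; i < n; ++i)
        alpha += a * (1.0f - alpha);
    return alpha;
}
```

Since n here is tied to max(volumeDimensions), a data set with a different resolution or density range will saturate at a different rate with the same scalar, which matches the teapot-vs-foot observation.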