MJP

Moderator
About MJP

  • Rank
    XNA/DirectX Moderator & MVP

Personal Information

Social

  • Twitter
    @MyNameIsMJP
  • Github
    TheRealMJP
  1. In D3D, normalized device coordinates run from -1 to +1 on both X and Y, with the origin at the center of the screen and +Y pointing up. So if you take your desired quad width and divide by the screen size, your [0, 1] coordinates will put you in the top-right quadrant of the screen. If you want to expand that to cover the whole screen, you can simply multiply by 2.0 and subtract 1.0 from your coordinate.
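
    Here's a minimal HLSL sketch of that mapping, assuming the quad corners come in as [0, 1] coordinates (the shader and semantic names are just for illustration):

        // Corners arrive in [0, 1]; leaving them as-is would only cover the top-right
        // quadrant of D3D's [-1, +1] NDC space (origin at the center, +Y up).
        float4 FullscreenQuadVS(float2 corner : POSITION) : SV_Position
        {
            float2 ndc = corner * 2.0f - 1.0f;   // expand [0, 1] to the full [-1, +1] range
            return float4(ndc, 0.0f, 1.0f);
        }
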
  2. A separable blur isn't a "typical" multi-pass technique where you just draw multiple times with a particular blend state. Instead, the second pass needs to read the results of the first pass as a texture. This requires changing render targets and shader resource view bindings between your draw calls. The basic flow goes something like this (in pseudo-code):

        // Draw the scene to the "main pass" render target
        SetRenderTarget(mainPassTarget.RTV);
        DrawScene();

        // Draw the vertical pass using the main pass target as its input
        SetRenderTarget(blurTargetV.RTV);
        SetShaderResourceView(mainPassTarget.SRV);
        SetPixelShader(blurVerticalPS);
        DrawQuad();

        // Draw the horizontal pass using the vertical blur target as its input
        SetRenderTarget(finalBlurTarget.RTV);
        SetShaderResourceView(blurTargetV.SRV);
        SetPixelShader(blurHorizontalPS);
        DrawQuad();
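
    In HLSL, the two blur pixel shaders might look something like this minimal sketch (the texture, sampler, and constant names are hypothetical, and the 5-tap Gaussian weights are just an example):

        Texture2D InputTexture : register(t0);       // output of the previous pass
        SamplerState LinearSampler : register(s0);

        cbuffer BlurConstants : register(b0)
        {
            float2 TexelSize;   // 1.0 / render target dimensions
        };

        // Example 5-tap Gaussian weights (they sum to ~1.0)
        static const float Weights[5] = { 0.0614f, 0.2448f, 0.3877f, 0.2448f, 0.0614f };

        float4 BlurVerticalPS(float4 pos : SV_Position, float2 uv : TEXCOORD) : SV_Target
        {
            float4 sum = 0.0f;
            [unroll]
            for (int i = -2; i <= 2; ++i)
                sum += InputTexture.Sample(LinearSampler, uv + float2(0.0f, i * TexelSize.y)) * Weights[i + 2];
            return sum;
        }

        float4 BlurHorizontalPS(float4 pos : SV_Position, float2 uv : TEXCOORD) : SV_Target
        {
            float4 sum = 0.0f;
            [unroll]
            for (int i = -2; i <= 2; ++i)
                sum += InputTexture.Sample(LinearSampler, uv + float2(i * TexelSize.x, 0.0f)) * Weights[i + 2];
            return sum;
        }
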
  3. 3D Mesh Intersect Inside Shader ?

    Ray/mesh intersection is totally doable on a GPU, and there has been a whole lot of research into how to do it quickly. There are even entire libraries/frameworks like Nvidia's OptiX and AMD's Radeon Rays that are aimed at doing arbitrary ray/mesh intersections for the purpose of ray tracing. There are also libraries like Intel's Embree that aim to give you really fast ray/mesh intersection on a CPU.

    For really fast ray/mesh intersections on both CPU and GPU you'll generally need to build some sort of acceleration structure that lets you avoid doing an O(N) lookup over all of your triangles. BVHs tend to be popular for this purpose, although there are other choices with different trade-offs in terms of lookup speed, memory usage, and build speed. All 3 of the libraries that I mentioned have "builders" that can take an arbitrary triangle mesh and build an acceleration structure for you, which you can then trace rays into. If you think you'd like to try one of these, I would suggest starting with Embree. It's very easy to integrate and use, and it's definitely going to be faster than a brute-force ray/mesh query. However, it also takes some time to build the acceleration structure, which could cancel out the performance improvement.

    If you decide to try out your compute shader approach (which doesn't sound totally crazy), you could probably get a lot of mileage out of building even a simple acceleration structure for your data in order to avoid an O(N) scenario for each query. For instance, you can bin all of your feature meshes into a uniform grid based on a simple bounding box or sphere fit to each mesh. Then for each terrain vertex you could see which bin(s) it intersects, and grab the list of feature meshes for that grid cell (see the sketch below).

    Another option you can consider is to flip the problem around, and rasterize your feature meshes instead of performing ray casts. This approach is commonly used for runtime decal systems (except in screen space instead of in heightmap space), and it's not too hard to get going. Assuming that your terrain is a 2D height field, you would basically want to set up an orthographic projection that renders your terrain "top-down". Then you could rasterize your feature meshes using that projection, and in the pixel shader you could either write out some value to a render target, or use a UAV to write the info that you need to a buffer or texture.
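
    A rough HLSL compute shader sketch of that uniform-grid lookup (all of the buffer layouts and names here are hypothetical, and it assumes every vertex falls inside the grid):

        struct GridCell
        {
            uint FirstMeshIndex;    // offset into CellMeshIndices
            uint NumMeshes;         // number of feature meshes binned into this cell
        };

        StructuredBuffer<float2> TerrainVertexPositionsXZ : register(t0);
        StructuredBuffer<GridCell> GridCells : register(t1);
        StructuredBuffer<uint> CellMeshIndices : register(t2);
        RWStructuredBuffer<uint> ResultPerVertex : register(u0);

        cbuffer GridConstants : register(b0)
        {
            float2 GridOrigin;      // world-space XZ of the grid's min corner
            float CellSize;         // world-space size of one grid cell
            uint GridWidth;         // number of cells along X
        };

        [numthreads(64, 1, 1)]
        void GridLookupCS(uint3 dispatchID : SV_DispatchThreadID)
        {
            float2 posXZ = TerrainVertexPositionsXZ[dispatchID.x];
            uint2 cellCoord = uint2((posXZ - GridOrigin) / CellSize);
            GridCell cell = GridCells[cellCoord.y * GridWidth + cellCoord.x];

            // Only test the feature meshes binned into this cell, instead of all N meshes
            uint result = 0xFFFFFFFF;
            for (uint i = 0; i < cell.NumMeshes; ++i)
            {
                uint meshIndex = CellMeshIndices[cell.FirstMeshIndex + i];
                // ...ray/mesh (or ray/bounds) test against 'meshIndex' goes here,
                //    storing whatever you need into 'result'...
            }

            ResultPerVertex[dispatchID.x] = result;
        }
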
  4. FYI, this wiki page has a chart that shows you the optional feature support for different GPU families. It's for D3D12, so it doesn't guarantee D3D11.3 support, but it does at least tell you some of the hardware capabilities. In the case of conservative rasterization, HW support was added in second-gen Maxwell, and unfortunately your 960M is a first-gen Maxwell.
  5. What Vilem Otte said ^^^ There's no need to create multiple devices or adapters; that will just make your life more difficult. You just need multiple swap chains.
  6. Does the "diaognal lines" artifact look like this? This is commonly known as "shadow acne", and the simplest fix is to increase your bias. PCF can introduce additional bias issues when shading surfaces that are not completely perpendicular to the light direction. It basically happens because you sample multiple points on a plane perpendicular to the light direction, and this plane will intersect with the receiver geometry. There are more complex techniques for reducing acne that are usually based around increasing the bias based on computing the slope of the receiver relative to the light direction, and you should try doing some google searching to get some ideas. As for it not looking any different, here's a comparison of what the shadow edges should look like without PCF, and with PCF: So if your shadows still look like the first image, I would double-check your parameters that you used when creating the sampler state and make sure that you're using LINEAR filtering. This will give you 2x2 bilinear PCF, similar to the 2x2 texture filtering that can be used for normal textures. The filter kernel is only 2 texels wide, so the soft falloff will only end up being 1 shadow map texel wide. If you want a wider filter, you'll have to implement that manually by sampling the shadow map multiple times. Here's an example of what you get when using an optimized 7x7 filter kernel: As for the "lightDepthValue < depthValue" check, you no longer need that when you're sampling the depth map with SampleCmp. That function is essentially performing that check for you automatically, and giving you the result as a value between 0 and 1.
  7. You have to compute the diffuse albedo (Cdiff) and specular albedo (F0) from the base color like this:

        float3 diffuseAlbedo = lerp(baseColor.xyz, 0.0f, metallic);
        float3 specularAlbedo = lerp(0.03f, baseColor.xyz, metallic);

    The basic idea is that non-metals have diffuse lighting, non-colored specular, and a small F0, while metals have no diffuse, colored specular, and possibly a much higher F0. Cavity is usually just multiplied with the final result of the specular term (f(l, v)). It's essentially an occlusion term that represents how much of the specular is blocked by the surface.
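
    To make the cavity part concrete, here's a rough sketch of how the terms might be combined for a single light (the variable names, and the "specularBRDF" placeholder standing in for f(l, v), are hypothetical):

        // diffuseAlbedo / specularAlbedo come from the lerps above; cavity only occludes specular
        float3 diffuse = diffuseAlbedo * (1.0f / 3.14159f) * lightColor * nDotL;
        float3 specular = specularBRDF * lightColor * nDotL * cavity;
        float3 lighting = diffuse + specular;
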
  8. You can't, unfortunately. You either have to accept having a fully expanded vertex buffer (which can be MUCH bigger than your original indexed VB), or you need to do a separate pass. Neither is ideal, really. There is another option, but only if you're running on Windows 8 or higher: UAV's from the vertex shader. All you need to do is bind a structured buffer (or any kind of buffer) as a UAV, and then write to it from your vertex shader using SV_VertexID as the index.
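
    Here's a minimal HLSL sketch of what that looks like (the buffer layout and names are hypothetical); the UAV has to be bound alongside your render targets, since UAV slots are shared with RTV slots:

        cbuffer VSConstants : register(b0)
        {
            float4x4 WorldViewProjection;
        };

        struct StreamedVertex
        {
            float3 Position;
            float3 Normal;
        };

        // Start the UAV register after any bound render targets
        RWStructuredBuffer<StreamedVertex> StreamOutBuffer : register(u1);

        float4 StreamingVS(float3 position : POSITION, float3 normal : NORMAL,
                           uint vertexID : SV_VertexID) : SV_Position
        {
            // Write this vertex's expanded data out to the UAV, indexed by SV_VertexID
            StreamedVertex v;
            v.Position = position;
            v.Normal = normal;
            StreamOutBuffer[vertexID] = v;

            return mul(float4(position, 1.0f), WorldViewProjection);
        }
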
  9. DX12 Descriptor Resource Sets

    Long term, you definitely don't want to be constantly copying around descriptors into tables if you want the best possible performance. So yeah, assuming that you mean "descriptor table" instead of "descriptor heap", #3 sounds like it's the closest to that ideal. If you do it that way, then you can basically follow the general practice for constant buffers, and group your descriptors based on update frequency. Ideally you would want to be able to identify descriptor tables that never change, and only build them once.

    There's another (crazier) approach, which is what we use at work, but it requires at least RESOURCE_BINDING_TIER_2 (so no Kepler or old Intel GPU's). Basically the idea is that you go full "bindless": instead of having contiguous tables of descriptors, you have 1 great big table and expose that to all of your shaders. Then each shader has access to a big unbounded texture array, and it samples a texture by using an integer index to grab the right texture from the array. Then "binding" can be as simple as filling a constant buffer with a bunch of uint's. There are a bunch of wrinkles in practice, but I assure you it works. You can also do some really fun things with this setup once you have it working, since it basically gives your shaders arbitrary access to any texture or buffer, with the ability to use any data structure you want to store handles.
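
    Here's a rough HLSL sketch of the bindless idea (SM 5.1-style syntax; the register space, names, and constant buffer layout are hypothetical):

        // One big unbounded texture array exposed to every shader
        Texture2D AllTextures[] : register(t0, space1);
        SamplerState LinearSampler : register(s0);

        // "Binding" a material is just filling these uint handles
        cbuffer MaterialHandles : register(b0)
        {
            uint AlbedoMapIndex;
            uint NormalMapIndex;
        };

        float4 BindlessPS(float4 pos : SV_Position, float2 uv : TEXCOORD) : SV_Target
        {
            // Index into the big table with the handle from the constant buffer
            return AllTextures[NonUniformResourceIndex(AlbedoMapIndex)].Sample(LinearSampler, uv);
        }
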
  10. I just wanted to chime in on a few things, since I've lost too much of my time to this particular subject.

    I'm sure plenty of games still bake the direct contribution for at least some of their lights. We certainly did this for The Order, and did it again for Lone Echo. Each of our lights has flags that control whether or not the diffuse and indirect lighting is baked, so the lighting artists could choose to fully bake the light if it was unimportant and/or it would only ever need to affect static geometry. We also always baked area lights, since we didn't have run-time support for area lights on either of those games. For the sun shadows we also bake the shadow term to a separate lightmap. We'll typically use this for surfaces that are past the last cascade of dynamic runtime shadows, so that they still have something to fall back on. Here's a video if you want to see what it looks like in practice: https://youtu.be/zxPuZYMIzuQ?t=5059.

    It's common to store irradiance in a lightmap (or possibly a distribution of irradiance values about a hemisphere in modern games), but if you want to compute a specular term then you need to store a radiance distribution in your lightmap. Radiance tells you "if I pick a direction, how much light is coming in from that direction?", while irradiance tells you "if I have a surface with a normal oriented in this direction, what's the total amount of cosine-weighted light that's hitting the surface?". You can use irradiance to reconstruct Lambertian diffuse (since the BRDF is just a constant term), but that's about it. Any more complicated BRDF's, including specular BRDF's, require that you calculate Integral(radiance * BRDF) over all directions on the hemisphere surrounding the surface you're shading. How to do this efficiently completely depends on the basis function that you use to approximate radiance in your lightmap.

    If you want SH but only on a hemisphere, then you can check out H-basis. It's basically SH reformulated to only exist on the hemisphere surrounding the Z axis, and there's a simple conversion from SH -> H-basis. You can also project directly into H-basis if you want to. I have some shader code here for projecting and converting. You can also do a least-squares fitting on SH to give you coefficients that are optimized for the upper hemisphere. That said, I'm sure you would be fine with the Last of Us approach of ambient + dominant direction (I believe they kept using that on Uncharted 4), but it's nice to know all of your options before making a decision.

    You don't necessarily have to store directions for a set of SG's in a lightmap. We assume a fixed set of directions in tangent space, which saves on storage and makes the solve easier. But that's really the nicest part of SG's: you have a lot of flexibility in how you can use them, as opposed to SH, which has a fixed set of orthogonal basis functions. For instance, you could store a direction for one SG, and use implicit directions for the rest (see the sketch below).
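
    For reference, here's a minimal HLSL sketch of evaluating a small set of spherical Gaussians with fixed axes, the flavor of setup described above (the struct layout and lobe count are hypothetical, not our production code):

        struct SG
        {
            float3 Amplitude;   // fetched from the lightmap
            float3 Axis;        // fixed direction, e.g. known in tangent space
            float Sharpness;    // fixed per-lobe
        };

        // Radiance contribution of one SG lobe in direction 'dir'
        float3 EvaluateSG(in SG sg, in float3 dir)
        {
            return sg.Amplitude * exp(sg.Sharpness * (dot(sg.Axis, dir) - 1.0f));
        }

        // Summing the lobes approximates incoming radiance for 'dir', which you can
        // then integrate against whatever BRDF you're using.
        float3 ApproximateRadiance(in SG lobes[6], in float3 dir)
        {
            float3 radiance = 0.0f;
            [unroll]
            for (uint i = 0; i < 6; ++i)
                radiance += EvaluateSG(lobes[i], dir);
            return radiance;
        }
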
  11. Whoops, sorry about that! That value is what you're going to compare against the depth value in the shadow map, so in your case you want to use "lightDepthValue". I'll update the code that I posted in case anybody else copy/pastes it. The "offset" parameter is optional, so you can ignore that. It will offset the location at which the texture is sampled by a number of texels equal to that parameter.
  12. I haven't done this myself, but couldn't you just multiply the V coordinate by -1 and then add 1? That should work even for tiled UV's. The V coordinate will often be negative, but that's fine since you're probably going to use a signed representation anyway for your UV's.
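
    A one-line HLSL version of that, assuming uv holds the original coordinates:

        float2 flipped = float2(uv.x, 1.0f - uv.y);   // -V + 1; with tiling, the flipped V simply goes negative
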
  13. Is there anywhere in your code where you're setting device states before drawing your 3D square? There are several states that you can set on the context that will affect your rendering, such as the blend state, depth/stencil state, rasterizer state, input layout, and vertex/index buffers, and I don't see you setting those anywhere in the code you've provided. SpriteBatch will set those states in order to do its thing (there's a list of the states that it will set here, under the section called "State management"). You'll want to make sure that you set all of the states that you need before issuing your draw call in order to ensure proper results. One thing that you can do to help with this is to call ID3D11DeviceContext::ClearState at the beginning of every frame, which will set the context back to a default state. I would also recommend enabling the debug validation layer when you create your device (but only in debug builds), and checking out any warnings or errors that it reports. Another thing that can help with these kinds of issues is to use a debugging tool like RenderDoc, which will let you inspect the device state at the time of a particular draw call.
  14. I'm not sure that I completely understand what you're trying to do here. Are you trying to add a penumbra to your shadow, so that the shadows don't have hard edges? If so, then the standard way to do this with shadow maps is to use percentage closer filtering (PCF for short). In very simple terms, PCF amounts to sampling the shadow map several times around a small region of the shadow map, performing the depth comparison for each sample, and then computing a single result by applying a filter kernel (the simplest filter kernel being a box filter, where you essentially just compute the average of all of the results).

    The easiest way to get started with PCF is to let the hardware perform automatic 2x2 bilinear filtering for you. You'll have to make a few changes to your code to do this:

    • Create a special "comparison" sampler state to use for sampling your shadow depth map. You do this by specifying "D3D11_COMPARISON_LESS_EQUAL" as the "ComparisonFunc" member of the D3D11_SAMPLER_DESC structure. This specifies that the hardware should return 1 when the passed-in surface depth value is <= the shadow map depth value stored in the texture. You'll also want to use "D3D11_FILTER_COMPARISON_MIN_MAG_MIP_LINEAR" to specify that you want 2x2 bilinear filtering when you sample.
    • In your shader code, declare your shadow sampler state with the type "SamplerComparisonState" instead of "SamplerState".
    • Change your shader code to use SampleCmp instead of Sample. SampleCmp will return the filtered comparison result instead of the shadow map depth value.

    So you'll also want to restructure your code so that it looks something like this:

        SamplerComparisonState ShadowSampler;

        lightDepthValue = input.lightViewPositions[i].z / input.lightViewPositions[i].w;
        lightDepthValue = lightDepthValue - bias;

        float lightVisibility = shaderTextures[6 + i].SampleCmp(ShadowSampler, projectTexCoord, lightDepthValue);

        lightIntensity = saturate(dot(input.normal, normalize(input.lightPositions[i]))) * lightVisibility;
        color += (diffuseCols[i] * lightIntensity * 0.25f);

    Once you've got the hang of that and you want to look into more advanced filtering techniques, you can check out a blog post I wrote that talks about some of the most common ways to do shadow map filtering (or jump right to the code sample).
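
    If you later want a wider kernel than the hardware 2x2, here's a rough sketch of doing it manually by averaging a 3x3 grid of comparison samples (a simple box filter, just to show the mechanics):

        float ManualPCF3x3(Texture2D shadowMap, SamplerComparisonState shadowSampler,
                           float2 shadowUV, float compareDepth)
        {
            float visibility = 0.0f;

            // Each SampleCmpLevelZero already does a 2x2 bilinear comparison;
            // offsetting it over a 3x3 grid widens the effective kernel.
            [unroll]
            for (int y = -1; y <= 1; ++y)
            {
                [unroll]
                for (int x = -1; x <= 1; ++x)
                    visibility += shadowMap.SampleCmpLevelZero(shadowSampler, shadowUV,
                                                               compareDepth, int2(x, y));
            }

            return visibility / 9.0f;
        }
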
  15. ResolveSubresource unfortunately doesn't work for depth buffers. If you want to do it, you need to do it manually with a pixel shader that outputs to SV_Depth. You also probably wouldn't want the average of the sub-pixel depth values, since this wouldn't make much sense. Anyway, you don't need to copy the depth resource to do #2, as long as you're not writing to the depth buffer during your water pass. If you create a read-only depth-stencil view, then you can read from it using an SRV while the DSV is still bound to the pipeline.
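
    For reference, a manual depth resolve might look something like this minimal HLSL sketch; it picks one extreme of the sub-sample depths rather than averaging (max here; swap to min depending on your depth convention and what you need the resolved depth for):

        Texture2DMS<float> DepthBufferMS : register(t0);

        float ResolveDepthPS(float4 screenPos : SV_Position) : SV_Depth
        {
            uint width, height, numSamples;
            DepthBufferMS.GetDimensions(width, height, numSamples);

            // Choose a single representative sub-sample depth instead of the average
            uint2 coord = uint2(screenPos.xy);
            float resolved = DepthBufferMS.Load(coord, 0);
            for (uint i = 1; i < numSamples; ++i)
                resolved = max(resolved, DepthBufferMS.Load(coord, i));

            return resolved;
        }
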