
Member Since 29 Mar 2007

#5066018 Generating Cube Maps for IBL

Posted by MJP on 29 May 2013 - 08:09 PM

The general approach is to take the input cubemap as if it contained radiance at each texel, and pre-integrate that radiance with some approximation of your BRDF. Unfortunately it's not possible to pre-integrate anything except plain Phong without also parametrizing on the view direction, so the approximation isn't great if you want a 1:1 ratio of cubemaps. Most people will use CubeMapGen to convolve with a Phong-like lobe, using a lower specular power for each successive mip level. You can roll your own if you want, it's not terribly difficult; I actually made a compute shader integrator that we use in-house. Just make sure that you account for the non-uniform distribution of texels in a cubemap when you're integrating, otherwise the result will be incorrect.
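To account for the non-uniform texel distribution, each texel has to be weighted by the solid angle it subtends, which shrinks toward the corners of a face. Here's a minimal sketch of that weighting (the function names are mine, not from CubeMapGen); a good sanity check is that the weights over all six faces sum to 4*pi steradians:

```cpp
#include <cassert>
#include <cmath>

// Approximate solid angle subtended by a cubemap texel centered at (u, v),
// where u and v are face coordinates in [-1, 1] and texelSize = 2 / faceRes.
// On a unit cube face the differential solid angle is
// du * dv / (1 + u^2 + v^2)^(3/2).
double TexelSolidAngle(double u, double v, double texelSize)
{
    double d = 1.0 + u * u + v * v;
    return (texelSize * texelSize) / (d * std::sqrt(d));
}

// Sum of the per-texel weights over all six faces; should approach 4*pi.
double CubemapSolidAngleSum(int faceRes)
{
    const double texelSize = 2.0 / faceRes;
    double total = 0.0;
    for (int y = 0; y < faceRes; ++y)
    {
        for (int x = 0; x < faceRes; ++x)
        {
            double u = -1.0 + (x + 0.5) * texelSize;
            double v = -1.0 + (y + 0.5) * texelSize;
            total += TexelSolidAngle(u, v, texelSize);
        }
    }
    return total * 6.0;
}
```

When integrating, you divide each radiance sample's weight by the total so that a constant-color cubemap integrates to that same constant.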

#5065166 Question about redundant texture binds.

Posted by MJP on 27 May 2013 - 12:43 AM

I'm sure at some point there are redundancy checks, but Nvidia and AMD still recommend that you do the checking yourself if you want best performance.
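A minimal sketch of that caller-side redundancy check (the class and counter are hypothetical; in real code the forwarded call would be something like PSSetShaderResources): track the last resource bound to each slot, and skip the API call when nothing changed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical wrapper that filters out redundant per-slot texture binds
// before forwarding them to the real API (represented here by a counter).
class TextureBinder
{
public:
    void Bind(uint32_t slot, const void* texture)
    {
        if (boundTextures[slot] == texture)
            return;                 // redundant bind: skip the API call
        boundTextures[slot] = texture;
        ++apiCalls;                 // stand-in for the actual driver call
    }

    uint32_t apiCalls = 0;

private:
    std::array<const void*, 16> boundTextures{};  // one entry per slot
};
```

The same pattern extends to samplers, constant buffers, and shader programs.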

#5065164 Rendering Frames Per Second

Posted by MJP on 27 May 2013 - 12:40 AM

That D3DPRESENT_INTERVAL value controls which VSYNC mode to use when presenting. If you use DEFAULT or ONE, the GPU will wait until the next vertical refresh to present, which effectively limits you to the refresh rate of the monitor (which is surely 60 in your case). The point of this is to prevent tearing.

#5065163 What semantic(s) to use in order to render to multiple targets?

Posted by MJP on 27 May 2013 - 12:37 AM

In the future it would be helpful if you post the compilation errors, since it would help people to quickly figure out what's wrong with your code.

Your problem is that you're taking your "psOut" structure as an input parameter to your pixel shader, and SV_Target is an invalid input semantic for a pixel shader. Just remove it as a parameter, and declare it locally to your function:



psOut pixel_main(psIn IN)
{
    psOut OUT;
    OUT.color = getColor(IN);
    OUT.RTposition = IN.RTposition;
    OUT.RTnormal = float4(IN.normal, 1.0);
    return OUT;
}

#5064814 Locating possible memory leak in "Stalker" shaders

Posted by MJP on 25 May 2013 - 12:08 PM

I guess I should add that shaders themselves can't allocate anything, but it's possible that the engine allocates things based on what's in the shader. For instance, the engine might inspect the shader to see how many textures it uses, and allocate an array of pointers used to store references to those textures. However it's impossible to know about these sorts of things without knowing more about the engine, or seeing the engine code.

#5064698 Locating possible memory leak in "Stalker" shaders

Posted by MJP on 24 May 2013 - 11:59 PM

Shaders can't allocate memory, so there's no way for them to create a memory leak.

#5064697 Calculate bitangent in vertex vs pixel shader

Posted by MJP on 24 May 2013 - 11:37 PM

On the particular platform that I work with it's generally a win to minimize interpolants, so I've been going with the latter approach. I'm honestly not sure what Maya or the other DCC packages do, I've never taken a close look.

#5064281 DX11 - Depth Mapping - How?

Posted by MJP on 23 May 2013 - 03:51 PM

You want the second one: depthPosition.w / 1000.0

#5064275 creatInputLayout returns a NULL pointer

Posted by MJP on 23 May 2013 - 03:40 PM

Not all DXGI formats are supported for use in a vertex buffer. You need to look at the table entitled "Input assembler vertex buffer resources" from here to see which formats are supported. In your particular case the issue is that _SRGB formats aren't supported. You need to use DXGI_FORMAT_R8G8B8A8_UNORM instead, and then apply sRGB->Linear conversion manually in the shader.
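For reference, here's the sRGB-to-linear conversion you'd apply in the shader after sampling the UNORM texture. This is the standard piecewise sRGB transfer function, written as plain C++ since the math is identical in HLSL:

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB -> linear conversion (per channel), the same math you'd
// apply in the shader when an R8G8B8A8_UNORM texture holds sRGB data.
float SrgbToLinear(float c)
{
    return (c <= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}
```

Many people approximate this with pow(c, 2.2), which is close but not exact.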

Also you should consider turning on the debug device by passing
D3D11_CREATE_DEVICE_DEBUG when creating your device. When you do this, you'll get helpful error messages in your debugger output window whenever you do something wrong.

#5064269 DX11 - Depth Mapping - How?

Posted by MJP on 23 May 2013 - 03:16 PM

I'm not really sure what your exact problem is or what you're trying to accomplish here. However I can tell you that dividing post-perspective z by 25.0 is not going to give you anything meaningful. Normally you would divide by w in order to get the same [0, 1] depth value that's stored in the depth buffer. However this value isn't typically useful for visualizing, since it's non-linear. Instead you usually want to use your view-space z value (which is the w component of mul(position, projectionMatrix), AKA depthPosition.w) and divide it by your far-clip distance. This gives you a linear [0, 1] value.
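A small sketch of why this works, using the depth-relevant part of a standard left-handed D3D projection matrix (the function and names are mine, for illustration): the matrix's fourth column is (0, 0, 1, 0), so the output w is just view-space z, and dividing it by the far clip gives a linear [0, 1] value.

```cpp
#include <cassert>
#include <cmath>

struct Float4 { float x, y, z, w; };

// Applies the depth terms of a standard left-handed D3D projection matrix
// (row-vector convention). Because the fourth column is (0, 0, 1, 0), the
// output w is simply the input's view-space z.
Float4 Project(const Float4& viewPos, float zn, float zf)
{
    float zScale = zf / (zf - zn);
    Float4 out;
    out.x = viewPos.x;                         // x/y scale omitted for brevity
    out.y = viewPos.y;
    out.z = viewPos.z * zScale - zn * zScale;  // post-projection z, pre-divide
    out.w = viewPos.z;                         // w = view-space z
    return out;
}
```

Note how z / w (what the depth buffer stores) stays close to 1.0 for most of the depth range, which is why it looks almost uniformly white when visualized directly.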

#5063984 Aliasing confusion

Posted by MJP on 22 May 2013 - 06:15 PM

1. Yes, that's aliasing. Essentially you have some signal that's stored in a texture, and the fragment shader samples that signal. If the fragment shader doesn't sample at an adequate rate relative to the rate at which the signal changes (the frequency of the signal), then you get aliasing. A black and white checkerboard is basically a series of step functions, and sampling step functions will always result in aliasing because they have an infinite rate of change.


2. Yes, that's a similar case of aliasing. In this case the aliasing stems from the rasterizer, which samples some signal (in this case a triangle) at a fixed set of sample positions (usually aligned to the center of pixels). A triangle edge is also a step function, so you get the same aliasing problem.

3. Yes, texture filtering is a very important means of anti-aliasing. Filtering reduces aliasing by attenuating high-frequency components of a signal, but can also remove detail that's present in those higher frequencies. However if you oversample by sampling at a higher rate than your screen resolution and then filter the result when downsampling, you can reduce aliasing while still preserving details. This is the basic premise behind both texture filtering and MSAA. The former works with texture sampling, and the latter works with the rasterizer. With texture sampling you typically take 4 texture samples in a 2x2 grid and perform linear filtering, which allows you to avoid aliasing as long as your sampling rate is no less than 1/2 the resolution of your texture. So if you had a 512x512 texture, bilinear filtering won't alias as long as the texture is mapped to an area of the screen that's >= 256x256 pixels. If you need to go lower, you need to pre-filter by creating a chain of mip maps. This "no less than 1/2 the texture resolution" rule is why you create each mip level to be 1/2 the size of the previous level.
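As a small illustration of that halving rule: each mip level is half the size of the previous one down to 1x1, so a full chain for a 512x512 texture has 10 levels (a sketch, not tied to any particular API):

```cpp
#include <algorithm>
#include <cassert>

// Number of mip levels in a full chain: each level is half the previous
// one (rounded down), ending at 1x1.
int MipCount(int width, int height)
{
    int levels = 1;
    int dim = std::max(width, height);
    while (dim > 1)
    {
        dim /= 2;
        ++levels;
    }
    return levels;
}
```

This matches the usual floor(log2(max(w, h))) + 1 formula for mip chain length.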

If you're interested in this sort of thing, I have a rather lengthy series of articles on my blog regarding sampling and aliasing. It may put you to sleep, but there's a lot of (hopefully useful) info in there.

#5063982 Mip Mapping in the Domain Shader

Posted by MJP on 22 May 2013 - 05:54 PM

"default" mip-mapping only works in a pixel shader, since it's based on screen-space derivatives of your UV coordinates that are computed from neighboring pixels. For any other shader type you have to compute the mip level manually.


I would suggest reading through this article for some ideas regarding mip selection for displacement mapping.
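For reference, here's the kind of calculation the hardware performs with those pixel-shader derivatives, written as plain C++ (a sketch; in a domain shader you'd have to supply your own estimates of the UV derivatives, e.g. from patch edge lengths or a distance-based heuristic, which is the hard part):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Manual mip selection, mirroring the default pixel-shader behavior:
// lod = log2 of the larger texel-space footprint of the UV derivatives.
// dudx/dvdx and dudy/dvdy are the per-pixel UV deltas in x and y.
float ComputeMipLevel(float dudx, float dvdx, float dudy, float dvdy,
                      float texWidth, float texHeight)
{
    float lenX = std::sqrt(dudx * texWidth  * dudx * texWidth +
                           dvdx * texHeight * dvdx * texHeight);
    float lenY = std::sqrt(dudy * texWidth  * dudy * texWidth +
                           dvdy * texHeight * dvdy * texHeight);
    float rho = std::max(lenX, lenY);
    return std::max(0.0f, std::log2(rho));
}
```

A 256-texel texture mapped 1:1 to pixels gives a footprint of one texel per pixel and selects mip 0; shrink it to half size on screen and the footprint doubles, selecting mip 1.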

#5061929 Rendering to and reading from a texture (or two)

Posted by MJP on 14 May 2013 - 05:53 PM

I'm not sure what you mean by "big performance hit", but writing to a texture and then reading from it later is a very common operation that a GPU can certainly handle. In general the performance will be dictated by the bandwidth available to the GPU, so higher-end GPUs will take less of a performance hit. Switching render targets every frame won't save you any bandwidth, and you'll just end up using more memory.

#5061368 Baking Occlusion Maps

Posted by MJP on 12 May 2013 - 05:26 PM

Not that I know of. If you download the embree source code you'll see there are four main projects in there: common, renderer, rtcore, and viewer. You're really only interested in rtcore, which is the core kernel used for casting rays and getting the intersection. However it depends on common, so you need that as well. Basically what I did was build the rtcore and common libs for debug and release, copy them to a Lib folder that I made, and then copy all of the headers from common and rtcore into an Include folder that I also made. Then I just linked to common.lib and rtcore.lib in my own project.

In your code, you'll want to create an acceleration structure from your scene meshes that you can cast rays into. To do this you call rtcCreateAccel and pass it arrays of embree::BuildTriangle and embree::BuildVertex, containing the triangles and vertices of your scene. There are a lot of different options for creating the acceleration structure; I currently use BVH4 since it builds quickly. There are examples in the docs, but this is what I use for the parameters:


rtcCreateAccel("bvh4.spatialsplit", "default", bvhData.Triangles.data(), totalNumTriangles, vertices.data(), totalNumVertices, empty, false);

Once you have your acceleration structure, you can query it for an intersector like this:


bvhData.Intersector = bvhData.BVH->queryInterface<Intersector>();

And that's it, you're ready to cast rays. For occlusion you can just call intersector->occluded and pass it an embree::Ray to get a bool telling you if the ray is occluded. To get the full intersection you call intersector->intersect.

#5061346 Baking Occlusion Maps

Posted by MJP on 12 May 2013 - 03:18 PM

I have a project I've been working on at home that uses Intel's embree ray-tracing library to pre-bake various kinds of GI. I fired it up with Crytek's version of Sponza (~262K triangles) and baked the equivalent of a 730x730 AO map using 625 rays per sample point, and it completed in about 18 seconds using all 4 cores of my Core i7 2600. So yeah, I think you can do better than 3 minutes.


The way you're baking AO definitely works and is fairly easy to get going, but it can be really slow since there's a lot of overhead from setting up the GPU and reading back the results for every sample point that you bake. The GPU also never really gets fully utilized, since it has to keep serializing. A ray-tracer on the other hand can be really fast for baking occlusion since it's trivial to parallelize. Embree is very, very fast (especially if you only want to know if a ray is occluded instead of finding the closest intersection point), and GPU ray-tracers can be extremely fast as well. At work we use Optix to bake GI, and we can bake multiple passes for huge scenes in a minute or two on a beefy GPU.

I should also point out that it sounds like you're not quite baking AO correctly based on your description. To get a proper AO bake for a sample point, you need to sample occlusion in all directions in the hemisphere surrounding the normal. With your approach you'll only get directions within the frustum which can only cover 90 degrees, so you'll miss the sides. Typically when using a rasterization approach you'll render to a "hemicube" by rendering in 5 directions for each sample point. You'll also want to make sure that you weight each occlusion sample by the cosine of the angle between the sample direction and the surface normal (N dot R), otherwise it won't look right. With a ray-tracer using monte carlo integration you can actually "bake" the cosine term into your distribution of sample ray directions as a form of importance sampling, which lets you skip the dot product and also get higher quality results.
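A sketch of that importance-sampling trick (tangent space, assuming +Z is the surface normal and u1/u2 are uniform random numbers in [0, 1)): because the sample density is proportional to cos(theta), the cosine weight cancels out of the Monte Carlo estimator, and the AO estimate is just the average of the visibility results.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

// Cosine-weighted hemisphere sample about +Z. The pdf is cos(theta)/pi,
// so when estimating AO you simply average the per-ray visibility values
// instead of multiplying each one by N dot R.
Vec3 SampleCosineHemisphere(float u1, float u2)
{
    float r = std::sqrt(u1);
    float phi = 2.0f * 3.14159265358979f * u2;
    return { r * std::cos(phi), r * std::sin(phi), std::sqrt(1.0f - u1) };
}
```

A quick sanity check: every sample lands in the upper hemisphere, and the average z (i.e. the average cosine) converges to 2/3, the expected value of cos(theta) under this distribution.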