
#4908091 UpdateSkinnedMesh vs GPU Update

Posted by InvalidPointer on 31 January 2012 - 12:01 PM

I'm going to echo Adam here and say 'it depends on what else you're doing.' 'The same' doesn't answer questions like the relative power of the GPU/CPU combination(s) installed, extra data-availability requirements, or additional workload.

#4907042 Nvidia SDK sample not working correctly

Posted by InvalidPointer on 28 January 2012 - 10:06 AM

You can check this more thoroughly by running the sample in the reference rasterizer. If it looks fine there, it smells like a driver and/or hardware problem.

#4897083 is there a way to load shaders faster?

Posted by InvalidPointer on 24 December 2011 - 09:47 AM

Yes. Precompile it with fxc or D3DCompile() in your own utility and load the compiled bytecode instead. That won't help you if most of your time is spent reading from disk, but I'd wager the compilation and optimization is what's killing you in this instance.
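A rough sketch of the offline half, if it helps (function and file names are made up, error handling kept minimal):

#include <d3dcompiler.h>
#include <fstream>
#pragma comment(lib, "d3dcompiler.lib")

// Compile once offline, write the bytecode out, and have the game load the blob directly.
// The fxc equivalent from the command line: fxc /T ps_4_0 /E main /Fo shader.cso shader.hlsl
bool CompileShaderToFile(const char* hlslSource, size_t sourceLength,
                         const char* entryPoint, const char* target,  // e.g. "main", "ps_4_0"
                         const char* outputPath)
{
    ID3DBlob* code = nullptr;
    ID3DBlob* errors = nullptr;
    HRESULT hr = D3DCompile(hlslSource, sourceLength, nullptr, nullptr, nullptr,
                            entryPoint, target, D3DCOMPILE_OPTIMIZATION_LEVEL3, 0,
                            &code, &errors);
    if (errors) errors->Release();
    if (FAILED(hr)) return false;

    std::ofstream out(outputPath, std::ios::binary);
    out.write(static_cast<const char*>(code->GetBufferPointer()),
              static_cast<std::streamsize>(code->GetBufferSize()));
    code->Release();
    return true;
}

// At runtime, read those bytes back and hand them straight to
// ID3D11Device::CreateVertexShader()/CreatePixelShader() -- no compile step needed.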

#4896210 [DX11] Battlefield 3 GBuffer Question

Posted by InvalidPointer on 21 December 2011 - 10:42 AM

In most lighting models, "Specularity" can't be expressed by a single number. Traditional models use a "specular mask" (intensity) and a "specular power" (glossiness).

Many of the more complex lighting models use two similar but different parameters -- "roughness" and "index of refraction" instead. IOR is a physically measurable property of real-world materials, and using some math, you can convert it into a "specular mask" value, which determines the ratio of reflected vs refracted photons. This alone only describes the 'type' of surface though (e.g. glass and stone will have different IOR values).
Alongside this, you'll also have a 'roughness' value, which is a measure of how bumpy the surface is at a microscopic scale. If you had ridiculously high-resolution normal maps, you wouldn't need this value, but seeing as normal maps usually only describe bumpiness on a mm/cm/inch scale, the roughness parameter is used to measure bumpiness on a micrometer scale.

In simple terms, you can think of the "specular" value as being the same as a "spec mask" or "IOR", and you can think of the "smoothness" as being equivalent to "spec power" or "roughness".
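For reference, the IOR-to-spec-mask conversion mentioned above boils down to the Fresnel reflectance at normal incidence. A tiny sketch (the function name is made up):

// Fresnel reflectance at normal incidence for a dielectric surrounded by air: the
// fraction of photons reflected (rather than refracted) when viewed head-on.
float SpecularMaskFromIOR(float ior)
{
    float f0 = (ior - 1.0f) / (ior + 1.0f);
    return f0 * f0;
}
// e.g. glass at ior ~1.5 gives ~0.04 -- the familiar 4% dielectric specular value.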

As for the sky-visibility term, I assume it's an input for their ambient/indirect lighting equation.

As an addendum-- the Blinn/Phong specular power bit is actually an approximation to evaluating a Gaussian distribution with a specific variance. As Hodgman points out, pretty much every specular model on the market today works on the idea of microfacet reflections-- that all surfaces are actually perfect, mirror-like reflectors. The catch is that when you look at them at a fine enough level of detail, the surfaces themselves are made up of really, really tiny facets that reflect light in (theoretically) all directions. The Gaussian term from before uses some probability to get around having to evaluate a bunch of reflection math, essentially estimating what percentage of the surface is actually oriented in such a way that it will reflect light towards the viewer (we can do this because of the Law of Large Numbers, which is slightly outside scope). This is actually what the half vector describes, if you think about it. Recall the Law of Reflection for a moment; it states that the angle of incidence is equal to the angle of exitance/reflectance. Therefore, reflecting the light vector around the half vector would yield the view vector.
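To make the half-vector bit concrete, here's a rough C++ sketch of the classic Blinn-Phong term (the tiny Vec3 type and helpers are just for illustration, not from any particular engine):

#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  Normalize(const Vec3& v)
{
    float len = std::sqrt(Dot(v, v));
    return Vec3{ v.x / len, v.y / len, v.z / len };
}

// N = surface normal, L = direction to the light, V = direction to the viewer.
// The exponent is the 'specular power' playing the role of that Gaussian variance:
// higher power -> tighter highlight -> smoother microfacet distribution.
float BlinnPhongSpecular(const Vec3& N, const Vec3& L, const Vec3& V, float specPower)
{
    Vec3  H     = Normalize(Vec3{ L.x + V.x, L.y + V.y, L.z + V.z }); // the half vector
    float nDotH = std::max(Dot(N, H), 0.0f);
    return std::pow(nDotH, specPower);
}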

This has been Physically-Based Shading 101 with InvalidPointer, thanks for playing! :)

EDIT: Some further clarifications.

#4894550 How do I do this?

Posted by InvalidPointer on 16 December 2011 - 11:33 AM

The correct answer is 'Blizzard has a really, really absurdly good art team,' as there's nothing super-technical going on. You can achieve a very similar effect by drawing a little fire dragon mesh like that with additive blending and perhaps some basic shader black magic for coloration. The lightning arcs and little fire bits appear to be made using flipbook textures (the Unity/UDK resources linked above should have some more on that; it's mostly some very basic addition and multiplication applied to the texture coordinates used to read color data. If not, Google it and you should get like ten skillion hits), and other than that it's really all in the texture quality.
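If it helps, the flipbook lookup really is just that addition and multiplication -- something along these lines (frame layout and names are made up for illustration):

// Base UVs are assumed to be the usual [0, 1] coordinates of a single quad.
struct UV { float u, v; };

UV FlipbookUV(UV base, int frame, int columns, int rows)
{
    int   col    = frame % columns;
    int   row    = frame / columns;
    float scaleU = 1.0f / static_cast<float>(columns); // each frame covers this fraction of the sheet
    float scaleV = 1.0f / static_cast<float>(rows);
    return UV{ (base.u + static_cast<float>(col)) * scaleU,
               (base.v + static_cast<float>(row)) * scaleV };
}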

From a 'how do I do any of this' standpoint, have a gander at oriented and view-facing billboarding, which should get you some of the basics. An understanding of matrix math would be very, very helpful for understanding why things are laid out in the way they are. That obviously won't handle some of the more 'boring' mesh stuff, but hopefully you'll already have some exposure to drawing that sort of thing.
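For the view-facing case, the gist is to build the quad's corners from the camera's right and up axes (the rotation part of the view matrix, transposed), which is where the matrix math background pays off. A rough sketch, not production code:

struct Vec3 { float x, y, z; };

static Vec3 Add(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 Scale(const Vec3& v, float s)     { return { v.x * s, v.y * s, v.z * s }; }

// camRight/camUp come straight out of the view matrix, so the quad always faces the viewer.
void BuildViewFacingBillboard(const Vec3& center, const Vec3& camRight, const Vec3& camUp,
                              float halfWidth, float halfHeight, Vec3 outCorners[4])
{
    Vec3 r = Scale(camRight, halfWidth);
    Vec3 u = Scale(camUp,    halfHeight);
    outCorners[0] = Add(Add(center, Scale(r, -1.0f)), Scale(u, -1.0f)); // bottom-left
    outCorners[1] = Add(Add(center, r),               Scale(u, -1.0f)); // bottom-right
    outCorners[2] = Add(Add(center, r),               u);               // top-right
    outCorners[3] = Add(Add(center, Scale(r, -1.0f)), u);               // top-left
}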

EDIT: Rereading my post a bit, I think some bits came off overly technical for a total, start-from-scratch newcomer-- if I'm being a little too opaque feel free to ask for clarification.

#4890190 Kinect with DirectX11

Posted by InvalidPointer on 03 December 2011 - 01:17 PM

UpdateSubresource() or Map() should do more or less exactly what you need.
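For the Map() route, the general shape is something like this (assuming a texture created with D3D11_USAGE_DYNAMIC and D3D11_CPU_ACCESS_WRITE; the frame layout and names are made up):

#include <d3d11.h>
#include <cstring>

void UploadKinectFrame(ID3D11DeviceContext* context, ID3D11Texture2D* tex,
                       const unsigned char* frameBytes, UINT width, UINT height)
{
    // Assumes a 4-bytes-per-pixel format (e.g. DXGI_FORMAT_B8G8R8A8_UNORM).
    D3D11_MAPPED_SUBRESOURCE mapped = {};
    if (FAILED(context->Map(tex, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped)))
        return;

    unsigned char* dst = static_cast<unsigned char*>(mapped.pData);
    for (UINT row = 0; row < height; ++row)
    {
        // Copy row by row: the driver's RowPitch may be wider than width * 4.
        std::memcpy(dst + row * mapped.RowPitch, frameBytes + row * width * 4, width * 4);
    }
    context->Unmap(tex, 0);
}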

#4888468 Shaders and dynamic branching : good practise ?

Posted by InvalidPointer on 28 November 2011 - 09:59 AM

Thank you for your replies!

So I rewrote my shaders. They all use these two functions:

<snip for conciseness>

compilation defines:
CM = color map
AT = alpha testing color map
SM = specularity map
(CM and AT are mutually exclusive)

branching:
g_textureMask & 1 -> uses point sampling for color map
g_textureMask & (1<<1) -> uses point sampling for specularity map

I hope this is better.

Not... quite. As was mentioned, you can ditch all the branching, etc. outright by just handling samplers yourself at a higher level. Instead of defining a linear and point sampler, reframe things so you have a 'diffuse map' sampler and a 'spec map' sampler. Then, instead of dinking around with the flag parameter, bind the actual samplers yourself via SetSamplerState() or PSSetSamplers() and eliminate the need for bitwise operations or dynamic branching for that case outright. You can do something slightly similar for alpha tests by reformulating your shader like so:

float4 getColor( in float2 uv )
{
#ifdef CM
	float4 color = g_ColorMap.Sample(samColor, uv);

#ifdef AT
	// this is a marginally less verbose method of killing pixels and is functionally identical to what you wrote originally
	clip( color.a - AT_EPSILON );
#endif

	return color;
#else
	return float4( g_vMaterialColor, 1 );
#endif
}


Now you can again pull this indirection up and out of the shader. Instead of changing samplers, you can now just pick an actual pixel shader directly based on your app-side 'features' bitfield. Notice how much cleaner the code is overall.
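App-side, that boils down to something like this (the shader table layout and flag names are made up for illustration, not from your code):

#include <d3d11.h>

enum MaterialFeatures
{
    FEATURE_COLOR_MAP      = 1 << 0,
    FEATURE_ALPHA_TEST     = 1 << 1,
    FEATURE_SPECULAR_MAP   = 1 << 2,
    FEATURE_POINT_SAMPLING = 1 << 3,
};

void BindMaterial(ID3D11DeviceContext* context, unsigned features,
                  ID3D11PixelShader* shaderTable[],   // one precompiled permutation per feature combination
                  ID3D11SamplerState* linearSampler, ID3D11SamplerState* pointSampler)
{
    // Pick the precompiled permutation instead of branching inside the shader...
    context->PSSetShader(shaderTable[features], nullptr, 0);

    // ...and bind the right sampler to the 'diffuse map' slot instead of testing a flag per pixel.
    ID3D11SamplerState* diffuseSampler =
        (features & FEATURE_POINT_SAMPLING) ? pointSampler : linearSampler;
    context->PSSetSamplers(0, 1, &diffuseSampler);
}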

#4887282 Shaders and dynamic branching : good practise ?

Posted by InvalidPointer on 24 November 2011 - 09:48 AM

Actually I'd suggest dinking around with the texture coordinates over sampler switches, basically snapping them to the center of a particular texel instead of feeding the 'raw' texture coordinate to the sample function. Of course, it can certainly make sense to take care of this at a higher level if you know you'll only need one kind of filtering per geometry batch (that is, you're not getting this mask value from another texture or a vertex attribute) via SetSamplerState() in D3D9 and PSSetSamplers() in D3D10+.
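The snapping itself is just a couple of operations per lookup -- roughly this (helper name made up):

#include <cmath>

struct UV { float u, v; };

// Snap a [0, 1] texture coordinate to the center of the texel it falls in, so even a
// linear sampler effectively returns a point-sampled result.
UV SnapToTexelCenter(UV uv, float texWidth, float texHeight)
{
    return UV{ (std::floor(uv.u * texWidth)  + 0.5f) / texWidth,
               (std::floor(uv.v * texHeight) + 0.5f) / texHeight };
}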

EDIT: And, to be honest, you're doing one of the most textbook cases of Premature Optimization™ I've seen in some time-- look into getting GPU ShaderAnalyzer or Parallel Nsight, depending on your GPU manufacturer, and get some hard data about the performance characteristics of what you're writing. Integer instructions are also somewhat slower AFAIK, but that knowledge could be a little outdated now; bitmasks like what you're doing are still something of a last-resort solution. Shader permutations aren't necessarily a problem, either, but can be something of a pain when you have 10,000 handwritten permutations to juggle. You can use conditional compilation and some sort of runtime indexing scheme without too much trouble, I'd wager, and can stand to reduce the overall shader complexity to a considerable degree.

#4882272 C++ unordered vector erasure

Posted by InvalidPointer on 09 November 2011 - 02:57 PM

Not sure if there's a standard algorithm for it, but it's only two lines of code:

std::swap( container.back(), *iterator );
container.resize( container.size()-1 );

Dumb question, but is there any reason why you aren't pop_back()-ing in the example?
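For completeness, the pop_back() version of the same 'swap and pop' idiom looks like this:

#include <utility>
#include <vector>

// Swap the doomed element with the last one, then drop the tail; order is not preserved,
// but nothing gets shifted.
template <typename T>
void unordered_erase(std::vector<T>& container, typename std::vector<T>::iterator it)
{
    std::swap(*it, container.back());
    container.pop_back();
}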

#4880299 Precomputed Radiance Transfer Question

Posted by InvalidPointer on 03 November 2011 - 03:45 PM

You're just writing out SH coefficients in RGBA, just like components in a normal vector-- granted you're going to need a few RTs for acceptable results since you need something like second-order coefficients to get results even approaching usable. This cubemap thing makes me feel like we're talking about entirely different concepts, though, so perhaps more explanation of what you're trying to store is in order.

I understand what I am rendering, but what I don't get is how. Do I create a cube map and wrap that around the SH, or do I just pre-bake the vertices and render the textures?

All SH work is done with coefficients, look at what the DXSDK samples do. The confusion may stem from how the basis function values are precomputed and stored in a texture for later scaling in the shader; it's entirely possible to evaluate the Associated Legendre Polynomials directly via ALU and may actually be faster depending on hardware. I'm still a little lost as to what you're trying to store here, though. Is this like SH lighting calculated by way of some deferred shading whatsit? Visibility coefficients? SH lightmap?
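For reference, direct ALU evaluation of the nine l <= 2 basis functions is tiny -- something along these lines (the constants are the usual real SH normalization factors):

// The direction (x, y, z) must be normalized.
void EvaluateSH9(float x, float y, float z, float out[9])
{
    out[0] = 0.282095f;                          // Y(0,  0)
    out[1] = 0.488603f * y;                      // Y(1, -1)
    out[2] = 0.488603f * z;                      // Y(1,  0)
    out[3] = 0.488603f * x;                      // Y(1,  1)
    out[4] = 1.092548f * x * y;                  // Y(2, -2)
    out[5] = 1.092548f * y * z;                  // Y(2, -1)
    out[6] = 0.315392f * (3.0f * z * z - 1.0f);  // Y(2,  0)
    out[7] = 1.092548f * x * z;                  // Y(2,  1)
    out[8] = 0.546274f * (x * x - y * y);        // Y(2,  2)
}
// Reconstruction is then just a dot product of these nine basis values against the nine
// stored coefficients, per color channel.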

#4879464 Differences between Device9::ProcessVertices and Device9::DrawPrimitive?

Posted by InvalidPointer on 01 November 2011 - 06:05 PM

You would still need to use DrawPrimitive(), but you can simplify the vertex shader.

What do you mean with "you can simplify the vertex shader"?

I'm sorry for my silly questions, but I'm new to DirectX.

Also note that DrawPrimitive() will be significantly slower than DrawIndexedPrimitive() unless the number of primitives is very small, because the index buffer enables the graphics card to cache results from the vertex shader.

Yes, I knew that; I just used DrawPrimitive() as an example, but thanks anyway.

Well remember that you've already done all the fancy skinning, etc. inside ProcessVertices() so you can make a very simple pass-through vertex shader that just shuttles things off to the rasterizer/pixel shader stage. There's no need to run your transformations, etc. again.

#4879254 Difference between Hardware vertex processing and Software vertex processing?

Posted by InvalidPointer on 01 November 2011 - 08:54 AM

The only times you'll ever want to use software vertex processing are A) if you need to use a vertex shader model specification higher than what the card currently supports, or B) you're using debug tools that require it. Otherwise, use hardware processing and on top of that prefer 'pure' hardware vertex processing for a performance boost. This does come with a slight programming effort cost, as I think you lose the ability to query some related device state IIRC, but in general it's worth this loss.
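In device-creation terms this is just a matter of the behavior flags -- roughly (error handling omitted, presentation parameters assumed to be filled in elsewhere):

#include <d3d9.h>

IDirect3DDevice9* CreatePureHardwareDevice(IDirect3D9* d3d9, HWND hWnd, D3DPRESENT_PARAMETERS& pp)
{
    IDirect3DDevice9* device = nullptr;
    // Prefer hardware + pure vertex processing; fall back to
    // D3DCREATE_SOFTWARE_VERTEXPROCESSING (or D3DCREATE_MIXED_VERTEXPROCESSING) only for
    // the A/B cases above.
    d3d9->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
                       D3DCREATE_HARDWARE_VERTEXPROCESSING | D3DCREATE_PUREDEVICE,
                       &pp, &device);
    return device;
}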

#4879249 Differences between Device9::ProcessVertices and Device9::DrawPrimitive?

Posted by InvalidPointer on 01 November 2011 - 08:49 AM

ProcessVertices just runs the vertex shader you specify on each element in the buffer(s) you bound to the pipeline, then saves out the result in the pDestBuffer argument-- see the MSDN documentation here. In most cases, I'd wager that you'll never actually need to do this unless you absolutely must use features in a more modern vertex shader model specification than is available or need to do some other special processing that doesn't work well in your traditional graphics pipeline. Additionally, I recall hearing that using this requires that software vertex processing be forced for the device, but I don't see any mention of this on the mentioned MSDN documentation page, so you'll probably want to test this and find out.
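The call itself is simple once the source stream(s), declaration and vertex shader are bound -- a rough sketch (all names here are made up for illustration):

#include <d3d9.h>

HRESULT RunVertexShaderOffline(IDirect3DDevice9* device,
                               IDirect3DVertexBuffer9* destBuffer,
                               IDirect3DVertexDeclaration9* destDecl,
                               UINT vertexCount)
{
    // Runs the currently set vertex shader over vertexCount vertices, starting at element 0
    // of the bound stream(s), and writes the post-transform results into destBuffer.
    return device->ProcessVertices(0, 0, vertexCount, destBuffer, destDecl, 0);
}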

DrawPrimitive and friends, on the other hand, will actually go through the entire set of vertex and pixel shader stages, and will thus rasterize everything you feed it. This is the primary drawing workhorse in D3D; for the most part everything else exists to configure how these functions behave when invoked.

#4851324 What does Uint32 mean?

Posted by InvalidPointer on 19 August 2011 - 12:55 PM

Note to OP: At this point in time, the following may be a bit confusing for you, so feel free to skip for now. Realistically, you aren't likely to run across the edge cases below unless you explicitly go off looking to find them.

do you mean UINT32?
UINT32 is an unsigned int that is 4 bytes long.

Just to add to this in case it isn't clear: a byte = 8 bits, thus 4 bytes × 8 bits is the 32 part of uint32.

That isn't universally true, no. For most common architectures it holds, but there are a few more exotic designs (typically in embedded systems) where bytes are larger than 8 bits due to addressing concerns. This is why there's all this confusion re: actual sizes of the built-in types for C and C++; the standards were written in such a way that compilers would be free to change this depending on the needs of the hardware while still being source-compatible with other processor targets. Incidentally, sizeof() is also defined to return sizes as multiples of char (the smallest addressable unit, with sizeof(char) == 1 by definition) for the same reasons, and *NOT* to give a value in 8-bit bytes. The More You Know!

The exception to this is stdint.h (also pulled in by inttypes.h), which provides a set of integer types guaranteed to be exactly 8, 16, 32 or 64 bits wide (formally a C99 thing; C++ only adopted it as <cstdint> in C++11). Unless you're working *with those types directly* all bets are off, though with some knowledge of the compiler itself and the target architecture you can take an educated guess.

EDIT: For the sake of completing the Irritable Language Lawyer Nitpickery, some compilers also have extensions (MSVC in particular springs to mind) that give you fixed-width types to play around with. If you're using Visual C++, look into __int8, __int16, __int32 and __int64.
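For reference, with the standard headers it looks like this (stdint.h in C, <cstdint> in C++11; the asserts just illustrate the guarantee):

#include <climits>
#include <cstdint>

// These hold on any implementation that provides the exact-width types at all.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t is exactly 32 bits");
static_assert(sizeof(std::uint8_t)  * CHAR_BIT == 8,  "uint8_t is exactly 8 bits");

int main()
{
    std::uint32_t packed = 0xDEADBEEFu;   // always exactly 32 bits, whatever 'int' happens to be
    return static_cast<int>(packed & 0xFFu);
}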

#4756795 You Are Old ...

Posted by InvalidPointer on 10 January 2011 - 01:07 PM

In the immortal words of wez in the linked StackOverflow topic, who is general failure, and why is he reading my disk?!