MJP

Member Since 29 Mar 2007
Offline Last Active Today, 01:54 AM

Topics I've Started

Depth buffer precision

22 March 2010 - 01:52 PM

Lately I've been doing some research into reducing the amount of error when reconstructing position from depth for deferred rendering. My results indicate that rendering linear depth (view-space z divided by distance to the far-clip plane) into a 32-bit floating point surface gives great results, but perspective z/w from a 24-bit integer depth buffer can have a lot of error as you get closer to the far clip plane. This isn't surprising, since the precision issues due to the non-linearity of perspective z/w are well-known. I would just go with linear z in a color texture, but being able to save the memory and bandwidth required for an additional RT is pretty appealing for consoles.

After reading through this delicious tidbit from Humus I re-ran my tests with a floating-point depth buffer and the near and far planes swapped. The results were great, but unfortunately it looks like it isn't really doable on PCs in D3D9. The caps spreadsheet that comes with the SDK says that ATI DX10 GPUs support the D3DFMT_D24FS8 format, but nothing else does. And even if I could use it on PC, I couldn't sample from it since there are no driver hacks that use it. I don't mind that last part so much since I already have a fallback in place, and PC can handle the extra RT. However, to avoid having horrible z-testing precision I would have to put in all sorts of nasty platform-specific code that would let me use normal projection parameters/depth testing states on PC and backwards parameters on platforms that support float depth buffers. Not impossible, but potentially pretty nasty. Combining that with the added cost of recovering linear depth from z/w depth buffer values makes me start to lean towards the option of just using a separate RT with linear depth on all platforms.

So here's what I'd like to discuss:

1. Are there any ways to use floating-point depth buffers on the PC that I don't know of? How about in D3D10/D3D11? I don't even see a DXGI_FORMAT that specifies a 24-bit fp depth format.
2. Are there any other options for getting a better distribution of precision in a depth buffer?

Thanks in advance. [smile]

EDIT: I'm going to write a blog post real soon with the results of my testing, but if anyone would like to see some of the images I can post them here.

EDIT 2: Blog post is up

[Edited by - MJP on March 23, 2010 2:44:44 AM]
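For anyone who wants to poke at the numbers, here is a rough offline sketch of the comparison described above. It is not the test code behind the post. The clip planes, the sample range, and all function names are made up for illustration, and it models the float depth buffer as a full 32-bit float rather than D24FS8. It quantizes depth the way each storage option would, reconstructs view-space z, and reports the worst error over a range of depths close to the far plane.

```python
import struct

def perspective_depth(z_view, near, far):
    """D3D-style z/w for a view-space depth: 0 at `near`, 1 at `far`."""
    return far * (z_view - near) / (z_view * (far - near))

def view_z_from_depth(d, near, far):
    """Invert perspective_depth to recover view-space z."""
    return near * far / (far - d * (far - near))

def quantize_unorm24(d):
    """Round to the nearest value a 24-bit integer depth buffer can hold."""
    scale = (1 << 24) - 1
    return round(d * scale) / scale

def quantize_float32(d):
    """Round to the nearest 32-bit float (assumed float depth format)."""
    return struct.unpack("<f", struct.pack("<f", d))[0]

def worst_error(z_values, store, load):
    """Largest absolute view-space z error after a store/load round trip."""
    return max(abs(load(store(z)) - z) for z in z_values)

# Hypothetical clip planes and sample depths (500 .. 999, all near the far plane).
NEAR, FAR = 0.1, 1000.0
Z_SAMPLES = [500.0 + 0.5 * i for i in range(999)]

# Conventional projection into a 24-bit integer depth buffer.
int24_error = worst_error(
    Z_SAMPLES,
    lambda z: quantize_unorm24(perspective_depth(z, NEAR, FAR)),
    lambda d: view_z_from_depth(d, NEAR, FAR))

# Near and far planes swapped, stored in a float depth buffer.
reversed_float_error = worst_error(
    Z_SAMPLES,
    lambda z: quantize_float32(perspective_depth(z, FAR, NEAR)),
    lambda d: view_z_from_depth(d, FAR, NEAR))

# Linear depth (view-space z / far) in a 32-bit float color target.
linear_float_error = worst_error(
    Z_SAMPLES,
    lambda z: quantize_float32(z / FAR),
    lambda d: d * FAR)

print("24-bit integer z/w :", int24_error)
print("reversed float z/w :", reversed_float_error)
print("linear float depth :", linear_float_error)
```

Swapping the planes puts depth values near 0 at the far plane, which is where floating point has the most resolution, so it counteracts the 1/z bunching instead of compounding it. The sketch only covers storage quantization and ignores interpolation and shader math precision on real hardware.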

XNA 4.0 CTP available for download

15 March 2010 - 09:48 AM

Announcement
Download

Looks like it includes...

1. VS 2010 Express (apparently it will only install this if you don't already have VS 2010 installed)
2. Windows Phone Emulator
3. Silverlight for Windows Phone
4. XNA GS 4.0

Go get started early on those Windows Phone games! Also if you're not up-to-date on XNA GS happenings, keep in mind that in XNA 4.0 the graphics stuff was pretty heavily refactored so you won't be able to do a straight conversion of XNA 3.x projects.

February 2010 DirectX SDK

05 February 2010 - 01:59 PM

Looks like it just got posted. Hooray for D3D11 support in PIX! [grin]

NVPerfHUD vs ATI PerfStudio

18 August 2008 - 03:01 AM

Hey everyone. Recently I've been considering replacing my PC's 8800 GTS with a new ATI HD 4870. Performance-wise I think I'd be satisfied, but I'm a bit concerned because this PC also happens to be my development PC and I've gotten used to having Nvidia's NVPerfHUD (which is a very handy tool). So I was wondering if anybody with experience using both NVPerfHUD and ATI's PerfStudio could tell me how the two stack up in practical use. Is PerfStudio as easy to use and set up? Does it require an instrumented driver? Is it stable? Any insights you guys could provide would be greatly appreciated. [smile]

LogLuv Encoding For HDR

05 July 2008 - 07:35 AM

For my recent project (targeting the 360 and PC through XNA) I've been doing some experimenting with alternative formats for HDR. If I were just doing PC I'd probably just use fp16 and be done with it, but for the 360 I've been trying out some alternatives since fp16 has some disadvantages on that platform (and since the special fp10 frame buffer format isn't available through XNA [flaming]).

Anyway, my most recent adventure has been with LogLuv encoding (Heavenly Sword's famous NAO32 technique). I've managed to come up with a working implementation based on Christer Ericson's optimized Cg code, however I've been getting artifacts in the final result. I'm guessing that I'm running into discontinuities somewhere...however I can't seem to figure out where. If anyone has any experience with this and could point me in the right direction, I'd be eternally grateful. This is the shader code I'm using:

// M matrix, for encoding: RGB -> (X', Y, XYZ')
const static float3x3 M = float3x3(
    0.2209, 0.3390, 0.4184,
    0.1138, 0.6780, 0.7319,
    0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding: (X', Y, XYZ') -> RGB
const static float3x3 InverseM = float3x3(
    6.0013, -2.700, -1.7995,
    -1.332, 3.1029, -5.7720,
    .3007, -1.088, 5.6268);

float4 LogLuvEncode(in float3 vRGB)
{
    float4 vResult;
    float3 Xp_Y_XYZp = mul(vRGB, M);
    // Clamp away from zero so the divide and log2 below stay finite
    Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
    // Chromaticity goes into x and y
    vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
    // Log-encoded luminance is split across z and w
    float Le = 2 * log2(Xp_Y_XYZp.y) + 128;
    vResult.z = Le / 256;
    vResult.w = frac(Le);
    return vResult;
}

float3 LogLuvDecode(in float4 vLogLuv)
{
    // Recombine the two luminance channels
    float Le = vLogLuv.z * 256 + vLogLuv.w;
    float3 Xp_Y_XYZp;
    Xp_Y_XYZp.y = exp2((Le - 128) / 2);
    // Undo the chromaticity divide from the encode step
    Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
    Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
    float3 vRGB = mul(Xp_Y_XYZp, InverseM);
    return max(vRGB, 0);
}

I've tried playing around with clamping the values of Xp_Y_XYZp to 1e-6 and also the value of vLogLuv.y before division, but it hasn't gotten rid of the artifacts. I know Marco (Heavenly Sword dev) mentioned you had to be careful to avoid carry problems if you're linearly filtering, however I'm getting these artifacts even when I just do a straight decode of the whole render target without bloom or tone-mapping. So yeah...I'm basically stumped. Any mathematical geniuses out there who can help out a poor soul? [smile]

By the way, this is the original paper describing the LogLuv format.
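One way to narrow this down offline is to push just the luminance term Le through simulated 8-bit channels and look at the round-trip error. The sketch below assumes the render target is 8-bit-per-channel UNORM, as in NAO32; the post doesn't state the format explicitly. `pack_le_post` and `unpack_le_post` mirror the z/w lines of the shader above. `pack_le_floor` and `unpack_le_floor` are a hypothetical alternative for comparison only, which stores the whole part of Le in one channel and the fraction in the other. It is not taken from the post or from the Cg code it references, and all the names are made up.

```python
import math

def to_unorm8(x):
    """Simulate writing to an 8-bit UNORM channel: clamp, then round to 1/255 steps."""
    x = min(max(x, 0.0), 1.0)
    return round(x * 255) / 255

def pack_le_post(le):
    """Mirror the shader above: z = Le / 256, w = frac(Le)."""
    return to_unorm8(le / 256), to_unorm8(le - math.floor(le))

def unpack_le_post(z, w):
    """Mirror the shader above: Le = z * 256 + w."""
    return z * 256 + w

def pack_le_floor(le):
    """Hypothetical alternative: whole part in z, fractional part in w."""
    return to_unorm8(math.floor(le) / 255), to_unorm8(le - math.floor(le))

def unpack_le_floor(z, w):
    """Inverse of pack_le_floor."""
    return z * 255 + w

def roundtrip_errors(pack, unpack, start, stop, steps):
    """Signed error (decoded - original) for evenly spaced Le values."""
    errors = []
    for i in range(steps + 1):
        le = start + (stop - start) * i / steps
        errors.append(unpack(*pack(le)) - le)
    return errors

# Sweep a smooth ramp of Le values (an arbitrary range inside [0, 255]).
post_errors = roundtrip_errors(pack_le_post, unpack_le_post, 100.0, 110.0, 1000)
floor_errors = roundtrip_errors(pack_le_floor, unpack_le_floor, 100.0, 110.0, 1000)

print("post packing : min %.3f, max %.3f" % (min(post_errors), max(post_errors)))
print("floor packing: min %.3f, max %.3f" % (min(floor_errors), max(floor_errors)))
```

A jump of about 1 in Le is a factor of roughly sqrt(2) in decoded luminance, so if the error for the first packing steps by about that much along a smooth ramp, it would show up as banding even with a straight decode and no filtering. Whether that is what's behind the artifacts described here is only a guess.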
