jollyjeffers

Member Since 16 Mar 2000
Offline Last Active Jul 02 2009 11:03 PM

#4422418 HLSL ps_2_0 running shader twice

Posted by jollyjeffers on 18 March 2009 - 01:57 AM

Quote:
Original post by woytah
Well, as far as I can see it has no effect. Maybe because it is working on the same sampler, and therefore the second pass outputs the same result as the first one.
Quite possible - you'd want to use render-to-texture and a technique most people refer to as "ping ponging". In the first pass you render from A to B and in the second you render from B to A, thus the 2nd pass gets to see the output of the 1st. It requires intervention by the application to manage render targets though.

Quote:
Original post by woytah
If the second pass wrote something to the frame buffer, I could then take what is already in the frame buffer and use it in the second pass in the shader... but is that possible?
No, you can't do this. Direct3D is quite strict about the read/write permissions such that you can't read from a source whilst you're also wanting to write to it. It's due to this restriction that the aforementioned 'ping pong' technique exists.
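
The 'ping pong' arrangement can be sketched on the CPU without any Direct3D calls. In this rough illustration, Surface, BlurPass and PingPongBlur are hypothetical names invented for the sketch, and a 1D array of floats stands in for a render target. A real application would bind textures and render targets instead of swapping pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a render target: just a row of float "pixels".
using Surface = std::vector<float>;

// Hypothetical stand-in for one full-screen pass: a 3-tap box blur that
// reads only from src and writes only to dst (never the same surface).
void BlurPass(const Surface& src, Surface& dst)
{
    assert(&src != &dst && src.size() == dst.size());
    const std::size_t n = src.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        const float left  = src[i == 0 ? 0 : i - 1];       // clamp at the edges
        const float right = src[i + 1 == n ? i : i + 1];
        dst[i] = (left + src[i] + right) / 3.0f;
    }
}

// Run 'passes' blur passes, swapping the roles of the two surfaces each time.
Surface PingPongBlur(Surface a, int passes)
{
    Surface b(a.size(), 0.0f);
    Surface* read  = &a;
    Surface* write = &b;
    for (int p = 0; p < passes; ++p)
    {
        BlurPass(*read, *write);
        std::swap(read, write);   // this pass's output is the next pass's input
    }
    return *read;                 // after the final swap, 'read' holds the latest result
}
```

The key point is that no pass ever reads from the surface it is writing to, while the second pass still sees the output of the first.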


You should be able to write a blur shader in a single pass in ps_2_0. A lot of people will use a two-pass Gaussian filter as it's separable and more efficient, but you can still do it in a single pass if necessary. The main limitation here is that you can only do 32 tex2D()s in ps_2_0, which for a regular grid limits you to a 5x5 kernel (25 samples; 6x6 would already need 36).
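
The sample-count arithmetic behind that limit is easy to check. The helper names below are made up for this sketch, and the budget of 32 is simply the figure quoted above.

```cpp
#include <cassert>

// Texture fetches needed for a full N x N kernel done in a single pass.
int SinglePassTaps(int n) { return n * n; }

// Fetches per pixel for a separable filter: N in the horizontal pass
// plus N in the vertical pass.
int SeparableTaps(int n) { return 2 * n; }

// Largest odd N whose N x N grid still fits within a per-shader fetch budget.
int LargestOddKernel(int budget)
{
    assert(budget >= 1);
    int n = 1;
    while ((n + 2) * (n + 2) <= budget)
        n += 2;
    return n;
}
```

With a budget of 32, LargestOddKernel() gives 5, costing 25 fetches in a single pass against 10 for the two-pass separable version of the same width.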

There was a paper by ATI several years ago that showed how to double your effective sample count via clever placement of sampling coordinates. This, along with a sparser sample grid, should allow you to blur a pixel with much more source data.

In particular, for a linearly filterable texture (basically anything except FP formats in D3D9) you can place the sample point in the middle of a pair or quad of pixels and the result is the average of all underlying pixels.

+---+---+
| x | x |
+---+---+
| x | x |
+---+---+

+---+---+
|   |   |
+---x---+
|   |   |
+---+---+


In the top diagram you make four samples (the x's) and then average them out in your own code. In the bottom diagram you make one sample (the x in the middle) and the value returned is already the average of all four texels - basically saving you a bunch of TMU and ALU operations that you can then invest elsewhere [grin]
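
To see why the single centred sample works, here is a small CPU model of bilinear filtering. Texture2x2, SampleBilinear and AverageOfFour are hypothetical names for this sketch, and it assumes texel centres sit at (x + 0.5, y + 0.5) in texel units. It is not a model of any particular GPU's filtering precision.

```cpp
#include <cassert>

// Hypothetical 2x2 single-channel texture, row-major: t[y][x].
struct Texture2x2
{
    float t[2][2];
};

// Bilinear sample at (u, v) in texel units. Coordinates are assumed to lie
// within the square spanned by the four texel centres.
float SampleBilinear(const Texture2x2& tex, float u, float v)
{
    const float fx = u - 0.5f;   // horizontal weight towards the right column
    const float fy = v - 0.5f;   // vertical weight towards the bottom row
    assert(fx >= 0.0f && fx <= 1.0f && fy >= 0.0f && fy <= 1.0f);
    const float top    = tex.t[0][0] * (1.0f - fx) + tex.t[0][1] * fx;
    const float bottom = tex.t[1][0] * (1.0f - fx) + tex.t[1][1] * fx;
    return top * (1.0f - fy) + bottom * fy;
}

// The "four fetches plus your own ALU work" version, for comparison.
float AverageOfFour(const Texture2x2& tex)
{
    return (tex.t[0][0] + tex.t[0][1] + tex.t[1][0] + tex.t[1][1]) * 0.25f;
}
```

Sampling at (1, 1), the corner shared by all four texels, gives every texel a weight of 0.25, so one fetch returns the same value as AverageOfFour().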


hth
Jack


#3952900 SlimDX -- A Prototype MDX Replacement Library

Posted by jollyjeffers on 01 May 2007 - 10:09 AM

Quote:
Original post by Demirug
As I prefer not to move too far away from the original concepts of both APIs, writing a multi-API application would be as complex as with C++.
I agree - just because .NET is a higher level language doesn't magically make this sort of thing easier.

It was difficult to make my idea clear, but basically I meant having the same design philosophy for MD3D9 and MD3D10 rather than making them source-code compatible or making some trivial "auto-porting" API.

Say a D3D9 developer picks up Promit's MD3D9 API and later wants to check out D3D10 so moves over to Ralf's MD3D10 implementation. If they were somehow 'aligned' then it'd make this transition a whole lot easier, rather than having to go back to square-1 and re-learn a whole different way of interacting with what is, under the covers, a fundamentally similar API.

Anyway... I'm going to stick this thread for a bit to encourage some further discussion. I get the distinct impression there are various members of the community who want to do something about Managed DirectX yet there seem to be a number of blocking factors involved in actually getting it moving forward. Maybe putting this topic in the spotlight will generate the right sort of interest to get things rolling...?

Cheers,
Jack




#3952395 SlimDX -- A Prototype MDX Replacement Library

Posted by jollyjeffers on 30 April 2007 - 10:01 PM

Whilst it wouldn't be easy, I think there would be an enormous amount of value in getting any MD3D9 and MD3D10 interfaces similar.

Obviously they can't be identical - but using the same design guidelines, rules and so on could smooth out transitions as well as help those cross-targeting 9 and 10 (or would that explode on dependencies?).

Jack


#3941334 [hlsl] Cook-Torrance lighting

Posted by jollyjeffers on 17 April 2007 - 11:23 AM

Quote:
Original post by Lifepower
Jack, could you share your Cook-Torrance code for D3D9? [smile]
hmm, well maybe... just maybe...

float4 psCookTorrance( in VS_LIGHTING_OUTPUT v ) : COLOR
{
    // Sample the textures
    float3 Normal = normalize( ( 2.0f * tex2D( sampNormMap, v.TexCoord ).xyz ) - 1.0f );
    float3 Specular = tex2D( sampSpecular, v.TexCoord ).rgb;
    float3 Diffuse = tex2D( sampDiffuse, v.TexCoord ).rgb;
    float2 Roughness = tex2D( sampRoughness, v.TexCoord ).rg;

    Roughness.r *= 3.0f;

    // Correct the input and compute aliases
    float3 ViewDir = normalize( v.ViewDir );
    float3 LightDir = normalize( v.LightDir );
    float3 vHalf = normalize( LightDir + ViewDir );
    float NormalDotHalf = dot( Normal, vHalf );
    float ViewDotHalf = dot( vHalf, ViewDir );
    float NormalDotView = dot( Normal, ViewDir );
    float NormalDotLight = dot( Normal, LightDir );

    // Compute the geometric term
    float G1 = ( 2.0f * NormalDotHalf * NormalDotView ) / ViewDotHalf;
    float G2 = ( 2.0f * NormalDotHalf * NormalDotLight ) / ViewDotHalf;
    float G = min( 1.0f, max( 0.0f, min( G1, G2 ) ) );

    // Compute the fresnel term
    float F = Roughness.g + ( 1.0f - Roughness.g ) * pow( 1.0f - NormalDotView, 5.0f );

    // Compute the roughness term
    float R_2 = Roughness.r * Roughness.r;
    float NDotH_2 = NormalDotHalf * NormalDotHalf;
    float A = 1.0f / ( 4.0f * R_2 * NDotH_2 * NDotH_2 );
    float B = exp( -( 1.0f - NDotH_2 ) / ( R_2 * NDotH_2 ) );
    float R = A * B;

    // Compute the final term
    float3 S = Specular * ( ( G * F * R ) / ( NormalDotLight * NormalDotView ) );
    float3 Final = cLightColour.rgb * max( 0.0f, NormalDotLight ) * ( Diffuse + S );

    return float4( Final, 1.0f );
}


The above is a straight copy-n-paste from my final year's dissertation at the University of Nottingham. From empirical testing it appears to be correct, but I can't say I exhaustively tested all scenarios.
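
For anyone wanting to spot-check the individual terms offline, a scalar C++ port of the three factors might look like the following. This is only a sketch mirroring the arithmetic in the shader above. The function names are invented here, f0 plays the role of Roughness.g and m the role of Roughness.r, and the guards on the inputs are assumptions that the shader itself does not make.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Geometric attenuation term, mirroring the shader's G1/G2/G.
float GeometricTerm(float nDotH, float nDotV, float nDotL, float vDotH)
{
    assert(vDotH > 0.0f);   // assumption: avoid dividing by zero
    const float g1 = (2.0f * nDotH * nDotV) / vDotH;
    const float g2 = (2.0f * nDotH * nDotL) / vDotH;
    return std::min(1.0f, std::max(0.0f, std::min(g1, g2)));
}

// Schlick-style Fresnel approximation, evaluated on N.V as in the shader.
float FresnelTerm(float f0, float nDotV)
{
    return f0 + (1.0f - f0) * std::pow(1.0f - nDotV, 5.0f);
}

// Beckmann-style distribution term with the shader's 1/(4 m^2 cos^4) scale.
float RoughnessTerm(float m, float nDotH)
{
    assert(m > 0.0f && nDotH > 0.0f);   // assumption: keep the divisions finite
    const float m2 = m * m;
    const float c2 = nDotH * nDotH;
    const float a  = 1.0f / (4.0f * m2 * c2 * c2);
    const float b  = std::exp(-(1.0f - c2) / (m2 * c2));
    return a * b;
}
```

A handy sanity case is the fully aligned one, where the normal, view, light and half vectors coincide and every dot product is 1: G clamps to 1, F collapses to f0, and R reduces to 1 / (4 m^2).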

Jack




#3613624 Vertex cache

Posted by jollyjeffers on 22 May 2006 - 10:57 PM

Yeah, it is pretty much as simple as you said [smile]

I recently put together a vertex cache demo. Read the main article and the follow up for details. The second part has updated code with full examples.

Bear in mind that the cache size changes between different GPUs. Nvidia's earlier chips had 16 elements, the more recent ones 24. ATI's appear to have 14 elements - but I don't have any solid evidence for that (damn ATI for not supporting VCache queries [flaming]).
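
To get a feel for how much the cache size matters for a given index buffer, you can simulate it on the CPU. The sketch below assumes a plain FIFO replacement policy, which is a simplification rather than a statement about any specific GPU, and the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Counts how many indices miss a simple FIFO cache of 'cacheSize' entries.
std::size_t CountCacheMisses(const std::vector<unsigned>& indices, std::size_t cacheSize)
{
    assert(cacheSize > 0);
    std::deque<unsigned> fifo;
    std::size_t misses = 0;
    for (unsigned idx : indices)
    {
        if (std::find(fifo.begin(), fifo.end(), idx) != fifo.end())
            continue;                 // hit: a FIFO does not reorder on reuse
        ++misses;
        fifo.push_back(idx);
        if (fifo.size() > cacheSize)
            fifo.pop_front();         // evict the oldest entry
    }
    return misses;
}

// Average cache miss ratio: vertices transformed per triangle (3.0 is the worst case).
double Acmr(const std::vector<unsigned>& indices, std::size_t cacheSize)
{
    assert(!indices.empty() && indices.size() % 3 == 0);
    return static_cast<double>(CountCacheMisses(indices, cacheSize)) /
           static_cast<double>(indices.size() / 3);
}
```

Running Acmr() over the same index buffer with sizes such as 14, 16 and 24 gives a rough idea of how sensitive an ordering is to the cache sizes mentioned above.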

Quote:
Original post by ClementLuminy
Do you know of a little library which sorts any IB so that it can be used with the vertex cache?
As sirob suggested, you've got the Optimize methods for ID3DXMesh... but you've also got D3DXOptimizeFaces() and D3DXOptimizeVertices() for non-mesh geometry.

hth
Jack


#362915 [C++] D3D/PIX profiling helper

Posted by jollyjeffers on 08 December 2005 - 05:48 AM

Evening all,

I've been doing some work on one of my projects and came up with a nifty little trick that I thought I'd share with you guys. Maybe you'll find it useful.

I would hope that everyone who's using a reasonably up-to-date version of Direct3D 9 has experimented with PIX for Windows. If not, why not? [smile]

A lesser known set of features are the D3DPERF_BeginEvent(), D3DPERF_SetMarker() and D3DPERF_EndEvent() API calls. I covered them a while back in my developer journal.

What I've written is hardly rocket-science, but it's one of those (I think) more useful uses of the C/C++ preprocessor. Using the following bits of code you can add a PROFILE_BLOCK in your code and it'll do the rest for you - including making sure that it cleans up correctly.

D3DUtils.h (Download directly from here)
#include "dxstdafx.h"

#ifndef INC_D3DUTILS_H
#define INC_D3DUTILS_H

// These first two macros are taken from the
// VStudio help files - necessary to convert the
// __FUNCTION__ symbol from char to wchar_t.
#define WIDEN2(x) L ## x
#define WIDEN(x) WIDEN2(x)

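// Only the first of these macros should be used. The _INTERNAL
// one forces __LINE__ to be expanded to its number, and the
// _INTERNAL2 one then pastes it, so that the sp##id part generates
// "sp1234" type identifiers instead of always "sp__LINE__" (an
// argument used directly with ## is not expanded first)...
#define PROFILE_BLOCK PROFILE_BLOCK_INTERNAL( __LINE__ )
#define PROFILE_BLOCK_INTERNAL(id) PROFILE_BLOCK_INTERNAL2( id )
#define PROFILE_BLOCK_INTERNAL2(id) D3DUtils::ScopeProfiler sp##id ( WIDEN(__FUNCTION__), __LINE__ );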

// To avoid polluting the global namespace,
// all D3D utility functions/classes are wrapped
// up in the D3DUtils namespace.
namespace D3DUtils
{
	class ScopeProfiler
	{
		public:
			ScopeProfiler( WCHAR *Name, int Line );
			~ScopeProfiler( );

		private:
			ScopeProfiler( );
	};
}

#endif
D3DUtils.cpp (Download directly from here)
#include "dxstdafx.h"
#include "D3DUtils.h"

#include <time.h>

namespace D3DUtils
{
	// Class constructor. Takes the necessary information and
	// composes a string that will appear in PIXfW.
	ScopeProfiler::ScopeProfiler( WCHAR* Name, int Line )
	{
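		// Seed the generator once, before the first rand() call,
		// rather than re-seeding after every event.
		static bool bSeeded = false;
		if( !bSeeded )
		{
			srand( static_cast< unsigned >( time( NULL ) ) );
			bSeeded = true;
		}

		WCHAR wc[ MAX_PATH ];
		StringCchPrintf( wc, MAX_PATH, L"%s @ Line %d.", Name, Line );

		// % 256 so that each channel can cover the full 0..255 range.
		D3DPERF_BeginEvent( D3DCOLOR_XRGB( rand() % 256, rand() % 256, rand() % 256 ), wc );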
	}

	// Makes sure that the BeginEvent() has a matching EndEvent()
	// if used via the macro in D3DUtils.h this will be called when
	// the variable goes out of scope.
	ScopeProfiler::~ScopeProfiler( )
	{
		D3DPERF_EndEvent( );
	}
}
A few notes:
  1. Just do a #include "D3DUtils.h" in the code where you want to use it.
  2. A random colour is created for the event, but PIXfW doesn't currently make use of this.
  3. I've used the dxstdafx.h PCH file that you find in the SDK. If you're not using this, then make sure you replace it with d3dx9.h, windows.h and math.h.
  4. PIXfW only monitors Direct3D calls, so there's no point in using this code to watch sections that don't contain any D3DX/D3D calls!
The usage is pretty simple. It creates an instance of ScopeProfiler on the stack such that its destructor will automagically get called when it goes out of scope. The destructor contains a D3DPERF_EndEvent(), making sure that the D3DPERF_BeginEvent() is correctly matched...
// Watch an entire function:
void CALLBACK OnFrameRender( IDirect3DDevice9* pd3dDevice, double fTime, float fElapsedTime, void* pUserContext )
{
    PROFILE_BLOCK
    // other code goes here
}

// Watch a specific subset:
{
    PROFILE_BLOCK
    V( g_HUD.OnRender( fElapsedTime ) );
}

// Use the class directly to override the default-generated
// name:
if( g_SettingsDlg.IsActive() )
{
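    // The object must be named: an unnamed temporary would be destroyed
    // (ending the event) at the end of this statement.
    D3DUtils::ScopeProfiler sp( L"OnFrameRender() - Setting Dialog Rendering", __LINE__ );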
    g_SettingsDlg.OnRender( fElapsedTime );
    return;
}
The result, when you run a PIX Full Call Stream Capture:
(The events added by the program are highlighted in pink)
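
If you want to convince yourself of the begin/end ordering without firing up PIX, the same scoping pattern can be tried with a logging stand-in. EventLog and ScopeLogger below are hypothetical names for this sketch and are not part of D3DUtils. They record strings where ScopeProfiler would call the D3DPERF_* functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the PIX event stream, so the ordering can be inspected.
std::vector<std::string>& EventLog()
{
    static std::vector<std::string> log;
    return log;
}

// Same shape as ScopeProfiler, but it records to EventLog() instead of
// calling D3DPERF_BeginEvent() / D3DPERF_EndEvent().
class ScopeLogger
{
public:
    explicit ScopeLogger(const std::string& name) { EventLog().push_back("begin " + name); }
    ~ScopeLogger() { EventLog().push_back("end"); }

    ScopeLogger(const ScopeLogger&) = delete;
    ScopeLogger& operator=(const ScopeLogger&) = delete;
};

void NamedObject()
{
    ScopeLogger sp("named");        // lives until the closing brace
    EventLog().push_back("work");
}

void UnnamedTemporary()
{
    ScopeLogger("temporary");       // destroyed at the end of this statement
    EventLog().push_back("work");
}
```

NamedObject() logs begin, work, end, whereas UnnamedTemporary() logs begin, end, work. In other words, the event only brackets the work when the object is given a name.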
Feel free to do whatever you want with the code. Use it and abuse it - at your own risk of course [wink]

Cheers,
Jack

