
MJP

Member Since 29 Mar 2007

#5012721 MSAA issues.

Posted by MJP on 20 December 2012 - 12:20 AM

The built-in resolve function is ResolveSubresource. However you can't resolve a depth buffer, and it wouldn't be useful anyway even if you could.

The simplest way to implement MSAA with deferred rendering is to have a separate MSAA version of your lighting shader(s), and in that version use Texture2DMS. Then in the shader just loop over all subsamples for a given pixel, calculate the lighting for each, and average the results. This will be much more expensive than it could be, but it will work.
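The loop described above looks roughly like this in HLSL (a minimal sketch: the resource names, register slots, single directional light, and normal encoding are all illustrative, not from the original post):

```hlsl
// G-Buffer targets created with a multisampled sample count
Texture2DMS<float4> GBufferAlbedo : register(t0);
Texture2DMS<float4> GBufferNormal : register(t1);

static const float3 LightDir = normalize(float3(0.5f, 1.0f, 0.25f));

float4 PSLightingMSAA(float4 screenPos : SV_Position) : SV_Target
{
    uint2 pixel = uint2(screenPos.xy);
    uint width, height, sampleCount;
    GBufferAlbedo.GetDimensions(width, height, sampleCount);

    // Light every subsample of this pixel, then average the results
    float3 sum = 0.0f;
    for (uint i = 0; i < sampleCount; ++i)
    {
        float3 albedo = GBufferAlbedo.Load(pixel, i).rgb;
        float3 normal = normalize(GBufferNormal.Load(pixel, i).xyz * 2.0f - 1.0f);
        sum += albedo * saturate(dot(normal, LightDir));
    }
    return float4(sum / sampleCount, 1.0f);
}
```

Note that multisampled resources are read with Load plus a subsample index; there is no Sample() for Texture2DMS.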


#5012614 MSAA issues.

Posted by MJP on 19 December 2012 - 04:16 PM

From what you described, it sounds like you're using deferred rendering. With deferred rendering you need to do all kinds of special-case handling for MSAA; it's not just something you can "switch on" and have it work. Not only does it require changing your shaders just to avoid runtime errors, but you also need to do things like edge detection to make it more efficient than plain supersampling.

For your first error: you need to set the DSV dimension to Texture2DMS if the texture you created is multisampled. It doesn't matter whether or not you want to sample it later as a multisampled texture; the DSV still needs to have the proper dimension.

For your second error: it sounds like you're sampling the depth buffer through a shader resource view in one of your shaders using a Texture2D in your shader code. Any unresolved multisampled texture has to be accessed as a Texture2DMS in the shader; you can't use Texture2D. Color MSAA textures can be resolved to a non-MSAA texture which can then be sampled using Texture2D, but you can't do that with a depth buffer (and you don't want to resolve your MSAA textures anyway for deferred rendering).
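The shader-side declaration for reading an MSAA depth buffer looks like this (a sketch; the register slot and function name are illustrative):

```hlsl
// An MSAA depth buffer bound through an SRV must be declared Texture2DMS
Texture2DMS<float> DepthBuffer : register(t2);

float FetchDepth(uint2 pixel, uint sampleIndex)
{
    // Load takes the pixel coordinate plus the subsample index;
    // declaring this resource as Texture2D instead would fail at runtime
    return DepthBuffer.Load(pixel, sampleIndex);
}
```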


#5012564 GPU Execution Units stalling and other performance issues

Posted by MJP on 19 December 2012 - 01:58 PM

Quote: "I have no idea if this is an issue on modern desktop graphics cards."

Modern desktop GPUs no longer make a distinction between dependent and non-dependent texture fetches.


#5012288 Restricting Camera

Posted by MJP on 18 December 2012 - 08:49 PM

What I usually do is store the rotations about the X and Y axes as single floats, and then every frame build a world and view matrix from those values. If you do it that way it's trivial to clamp the rotations to a certain range.
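As a minimal sketch of that approach (the names and the clamp range are illustrative, not from the original post):

```cpp
#include <algorithm>

// Store the two rotation angles as plain floats; matrices are rebuilt
// from them each frame, so clamping is a one-liner.
struct Camera
{
    float pitch = 0.0f;  // rotation about the X axis, in radians
    float yaw   = 0.0f;  // rotation about the Y axis, in radians

    void Rotate(float dPitch, float dYaw)
    {
        yaw += dYaw;
        pitch += dPitch;

        // Restrict pitch so the camera can't flip over the poles
        const float limit = 1.5533f;  // roughly 89 degrees
        pitch = std::clamp(pitch, -limit, limit);
    }
};
```

Each frame you would then build the view matrix from `pitch` and `yaw` (e.g. with D3DXMatrixRotationYawPitchRoll) instead of accumulating rotations into a stored matrix.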


#5011798 DirectX Forward Declarations

Posted by MJP on 17 December 2012 - 12:55 PM

What I meant was that I would do it like this:

#ifndef SPRITEMANAGER_H
#define SPRITEMANAGER_H

#include <vector>

//Custom (sprite) vertex for our Vertex Buffer
struct SpriteVertex;//forward declaration

#define D3DFVF_SPRITEVERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE|D3DFVF_TEX1)

//-----------------------------------------------------------------------------------------------------------------
//Forward Declarations:
//-----------------------------------------------------------------------------------------------------------------
struct IDirect3DVertexBuffer9;
struct IDirect3DIndexBuffer9;
struct IDirect3DVertexDeclaration9;
struct ID3DXEffect;

class Sprite;

class SpriteManager
{

public:
	SpriteManager(void);
	~SpriteManager(void);

private:

	//A Vector of all sprites created
	std::vector<Sprite*> m_sprites;

	//A dynamic Vertex Buffer that changes each frame to hold the coordinates of each sprite
	IDirect3DVertexBuffer9* m_pVB; // Buffer to hold vertices

	//A Index buffer to go with the Vertex buffer
	IDirect3DIndexBuffer9* m_pIB; // Buffer to hold indices

	//Vertex Declaration
	IDirect3DVertexDeclaration9* m_pVertexDecl;

	//Shader pointer  
	ID3DXEffect* g_pSpriteShader;
};

#endif//SPRITEMANAGER_H


Your problem with the effect appears to be that you've declared a pointer to a pointer to an ID3DXEffect. You have this:

LPD3DXEFFECT* g_pSpriteShader;

which is equivalent to this:

ID3DXEffect** g_pSpriteShader;

This is why I recommended not using those typedefs: they can be confusing.
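A tiny self-contained sketch of that confusion, with an `int` pointer typedef standing in for LPD3DXEFFECT (which is really ID3DXEffect*):

```cpp
// PINT plays the role of LPD3DXEFFECT: the '*' is hidden inside the typedef
typedef int* PINT;

// Takes one level of indirection, like a function expecting ID3DXEffect*
int ReadThrough(PINT p)
{
    return *p;
}

// PINT* is really int**: writing "LPD3DXEFFECT* g_pSpriteShader" adds an
// extra level of indirection in exactly the same way
int ReadThroughTwice(PINT* pp)
{
    return **pp;
}
```

Spelling the pointer out (`ID3DXEffect* g_pSpriteShader;`) keeps the indirection visible.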


#5011583 DirectX Forward Declarations

Posted by MJP on 17 December 2012 - 02:04 AM

This is more general C++ rather than DirectX-specific, but my personal take is that you probably don't want to include headers in other headers if you don't really have to. This is because doing so can...

A. Promote leaky abstractions where the implementation details of a class or group of functions can "leak out" into the code that uses it

and

B. Cause long compile times when you change something in a header, since that header might be included in other headers which spreads the changes into many .cpp files

In your particular case B probably isn't too much of an issue, since you won't be modifying the DirectX headers. A depends on your particular class and how it fits into your project.

If you do want to go with forward declarations and avoid including the DX headers, I would suggest that you make it easy on yourself and only forward-declare the interface classes, skipping the typedefs for pointer types. That will simplify things for you, and I also think it looks less ugly than a big mess of capital letters. Either way, you'll want to make your member "m_pVertexDecl" have type "IDirect3DVertexDeclaration9*" rather than "IDirect3DVertexDeclaration9", since interface types need to be used through pointers. Otherwise the compiler will complain that you're trying to instantiate an abstract class.

If you have any other compiler issues, I would suggest posting them here.


#5010375 Multithreading Particles with Geometry Instancing

Posted by MJP on 13 December 2012 - 05:06 PM

With the multithreaded flag in D3D9 the device will just lock every time you access it, which means that if multiple threads try to use it they will serialize. However, if you have lots and lots of particles, you could do something like this:

- lock buffers on main thread
- on multiple threads, simulate particles and write updated vertex data to the pointer from the locked buffer
- unlock on main thread and issue draw call

This may give you some gains if you're spending a lot of time simulating the particles and writing the data into the vertex buffer.
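The middle step can be sketched with standard threads (a sketch only: `lockedPtr` stands in for the pointer returned by IDirect3DVertexBuffer9::Lock, and the "simulation" is a trivial placeholder):

```cpp
#include <cstddef>
#include <thread>
#include <vector>

struct ParticleVertex { float x, y, z; };

// Trivial stand-in for real particle simulation
void SimulateRange(ParticleVertex* verts, std::size_t begin, std::size_t end, float dt)
{
    for (std::size_t i = begin; i < end; ++i)
        verts[i].y += 1.0f * dt;
}

// Called between Lock() and Unlock() on the main thread. Each worker writes
// a disjoint range of the locked buffer, so no extra synchronization is
// needed beyond joining the threads.
void UpdateParticles(ParticleVertex* lockedPtr, std::size_t count, float dt,
                     unsigned numThreads)
{
    std::vector<std::thread> workers;
    const std::size_t chunk = count / numThreads;
    for (unsigned t = 0; t < numThreads; ++t)
    {
        const std::size_t begin = t * chunk;
        const std::size_t end = (t == numThreads - 1) ? count : begin + chunk;
        workers.emplace_back(SimulateRange, lockedPtr, begin, end, dt);
    }
    for (auto& w : workers)
        w.join();
    // The caller would Unlock() and issue the draw call here
}
```

The key point is that only the Lock/Unlock and the draw call touch the device, so the device is never accessed from the worker threads.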


#5010318 migrating from dx10 to 11 gives me an undebug-able error

Posted by MJP on 13 December 2012 - 01:30 PM

Sometimes your code or your shader code can cause a crash in the driver. It's not particularly easy to track down...I would suggest that you start disabling things until it stops crashing so that you can narrow it down.


#5009934 D3D11: Removing Device

Posted by MJP on 12 December 2012 - 01:43 PM

This generally means that the display driver crashed or hung, and you got a TDR. This can be from something simple, like a draw call using a shader with a really long loop that takes so long that the driver times out. Or it might be from a bug in the driver itself. Or it could even be something in the emulator. The best advice I can give is to start removing bits of code until you can narrow it down further.


#5009703 How are games compiled for multiple operating systems at once?

Posted by MJP on 11 December 2012 - 11:00 PM

Just to add to the above posts: while you can certainly author platform-specific code using the preprocessor, in practice that's really messy. It's not hard to imagine how convoluted a real window-creation function would look if you put the code for multiple platforms all in the same place with #if's and #ifdef's thrown in everywhere. So it's generally better (IMO) to avoid that whenever possible by using other means to selectively compile code. For instance, at my current company we tag platform-specific .cpp files with a suffix that tells our build system which platform they should be compiled for. That way each file can contain a whole bunch of platform-specific code for a single class or a group of related functions.
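A sketch of that layout (file names, the suffix convention, and the function are all illustrative; the per-platform bodies are stubs):

```cpp
// window.h - shared declaration, included everywhere
bool CreatePlatformWindow(int width, int height);

// window_linux.cpp - the build system compiles only the file whose suffix
// matches the target platform; window_win32.cpp would define the same
// function using CreateWindowEx instead.
bool CreatePlatformWindow(int width, int height)
{
    // A real implementation would call XCreateWindow / similar here
    return width > 0 && height > 0;
}
```

The calling code just includes window.h and never sees a platform #ifdef.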


#5008974 Constant Buffers

Posted by MJP on 09 December 2012 - 09:58 PM

You probably just need to transpose your matrices. By default shaders expect column-major matrices in constant buffers, which means transposing row-major matrices when setting them into a constant buffer. The effects framework does this for you, so a lot of people hit this bug when handling constant buffers themselves for the first time.
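The transpose itself is trivial; here's a minimal host-side sketch using plain float arrays in place of XMMATRIX and the mapped constant-buffer memory (DirectXMath's XMMatrixTranspose does the same thing for you):

```cpp
// Flip a row-major 4x4 matrix into the column-major layout the shader
// expects by default: element (row, col) moves to (col, row).
void Transpose4x4(const float in[16], float out[16])
{
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            out[col * 4 + row] = in[row * 4 + col];
}
```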

FYI, you can tell the compiler to expect row-major matrices using a compile flag. You can also mark the matrix with "row_major" in your HLSL code to do the same thing on a per-matrix basis.


#5008642 Disappearing Data in Structured Buffers

Posted by MJP on 08 December 2012 - 06:16 PM

This isn't your problem, but you don't need to set the "ElementOffset" or "ElementWidth" members of D3D11_BUFFER_SRV. The structure is defined like this:

typedef struct D3D11_BUFFER_SRV
	{
	union
		{
		UINT FirstElement;
		UINT ElementOffset;
		}  ;
	union
		{
		UINT NumElements;
		UINT ElementWidth;
		}  ;
	}  D3D11_BUFFER_SRV;

Since they're in unions, you can only set one or the other in each union pair. For a structured buffer you want to use "FirstElement" and "NumElements".
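To see why setting both members of a pair is meaningless, here's a replica of that union layout (a sketch; the real struct comes from d3d11.h):

```cpp
typedef unsigned int UINT;

// Same layout as D3D11_BUFFER_SRV: each pair of names aliases one UINT,
// so writing ElementOffset overwrites FirstElement and vice versa.
struct BufferSrvReplica
{
    union { UINT FirstElement; UINT ElementOffset; };
    union { UINT NumElements;  UINT ElementWidth;  };
};
```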

Anyway, I suspect your problem is with your C++ struct. If you check sizeof(SInstance), you'll find that it's 80 bytes. This is because the alignment requirement of XMMATRIX (XMMATRIX is already 16-byte aligned; you don't need to add the alignment manually) causes the struct to have 16-byte alignment, which causes the compiler to insert 12 bytes of padding after dTemplateType. However, if you were to declare something like this in HLSL...

struct SInstance
{
	float4x4 matLocation;
	int dTemplateType;
};

...this struct will be 68 bytes in size. Structs for structured buffers don't have 16-byte alignment requirements, that was an incorrect assumption on your part. HLSL really only works in terms of 4-byte values, so structs used for structured buffers will pretty much always have 4-byte alignment. If you stick to using 4-byte types in your C++ struct, you should be fine. This means you should avoid the DirectXMath SIMD types like XMVECTOR and XMMATRIX, since they have 16-byte alignment. Try changing your struct to this:

struct SInstance
{
	XMFLOAT4x4 matLocation;
	int dTemplateType;
};

If you do this, the stride of your structured buffer should match the stride expected by your shader. If you have a mismatched stride, the runtime will transparently set your buffer to NULL which will cause you to get 0's when your shader attempts to access it. If you create the device with the DEBUG flag, the runtime will output an error message to tell you that this has occurred.
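The padding behavior is easy to verify in isolation; a sketch with stand-in matrix types modeling XMMATRIX (16-byte aligned) and XMFLOAT4X4 (plain floats):

```cpp
// Stand-in for XMMATRIX: 16 floats with forced 16-byte alignment
struct alignas(16) MatrixAligned { float m[16]; };

// Stand-in for XMFLOAT4X4: same data, natural 4-byte alignment
struct MatrixPlain { float m[16]; };

struct SInstanceAligned
{
    MatrixAligned matLocation;
    int dTemplateType;  // 12 bytes of tail padding follow: 64 + 4 -> 80
};

struct SInstancePlain
{
    MatrixPlain matLocation;
    int dTemplateType;  // no padding: 64 + 4 = 68, matching the HLSL stride
};

static_assert(sizeof(SInstanceAligned) == 80, "16-byte alignment pads struct to 80");
static_assert(sizeof(SInstancePlain) == 68, "4-byte alignment gives the expected 68");
```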


#5008581 Does stretchrect or UpdateRect take out alpha? Please help!

Posted by MJP on 08 December 2012 - 01:53 PM

StretchRect doesn't really "draw" anything, it basically just does a copy. It won't use alpha-blending even if you enable it with blend states.

I would suggest using ID3DXSprite for something like this. It supports blending, and more complex transformations such as rotations.


#5008578 SSAO and skybox artifact

Posted by MJP on 08 December 2012 - 01:49 PM



Quote: "A warp consists of either 16 or 32 threads grouped together."

I think you mean "32 or 64".

Quote: "I thought a wavefront on AMD's architecture consists of 16 execution units. Or am I wrong? (I just used warp as a general term, because I like it more.)"

Nah, there are 64 threads in a wavefront. In their latest architecture (GCN) the SIMDs are 16-wide, but they execute each instruction 4 times to complete it for the entire wavefront (so a single-cycle instruction actually takes 4 cycles to execute).


#5008337 SSAO and skybox artifact

Posted by MJP on 07 December 2012 - 08:45 PM

Quote: "When dealing with shaders, ALL code is executed, including ALL branches, all function calls, etc. The ONLY exception to this is if something is known at compile time that allows the compiler to remove a particular piece of code. This is how all graphics cards work: AMD, NVIDIA, etc. So your additional cost is that of the if statement, and in your example, you are adding an extra if instruction. This is a zero cost on GPUs. If you want to read on it, check out vector processors and data hazards. If you somehow split your shader up and added an if statement to the middle thinking that it would speed up your code, you would get NO speedup, because ALL paths will be executed."

This is completely wrong, even for relatively old GPUs (even the first-gen DX9 GPUs supported branching on shader constants, although in certain cases it was implemented through driver-level shenanigans). I'm not sure how you could even come to such a conclusion, considering it's really easy to set up a test case that shows otherwise.



