cozzie

DX11 ShaderPack in renderer design


Hi all,

I've been making nice progress abstracting and creating my new renderer.

At this point I'm adding ShaderPacks for rendering (a pack is a combination of a PS, VS, GS etc., and a set of defines/macros).

 

My question is whether you'd say I'm on the right track with the following abstraction.

This is the idea I want to implement:

 

1. IShaderPack is the base shaderpack class

2. DX11ShaderPack is derived and contains DX11 API objects for the shaders

3. IShaderMgr is the base shader manager class

(where all IShaderPacks are stored, current shader index is stored, returns const ref to current IShaderPack etc.)

4. IShaderMgr will get a virtual 'SetShaderPack' function.

5. DX11ShaderMgr will implement this function, handle input layouts in the background, bind the PS, VS etc. to the device, and so on.

6. My main renderer class will have an IShaderMgr pointer (if DX11 is used, it will 'live' as a new DX11ShaderMgr).

7. The main renderer class will get a public 'SetShaderPack' method for the frontend, taking only a GUID/handle (known in the asset data).

This function will simply forward the call to the IShaderMgr object in the renderer, which has this same method (SetShaderPack).

 

This way all implementation is hidden from the frontend, and the main renderer class's 'SetShaderPack' function only has to call the IShaderMgr->SetShaderPack function, which can be completely different depending on the API but is then hidden from the renderer.
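To make the intent concrete, here is a rough C++ sketch of that layering (all names and signatures are illustrative only; 'AssetHandle' just stands in for the GUID/handle from the asset data):

#include <cstdint>

using AssetHandle = std::uint32_t;                // placeholder for the GUID/handle type

class IShaderPack                                 // 1. base shaderpack class
{
public:
    virtual ~IShaderPack() = default;
};

class DX11ShaderPack : public IShaderPack         // 2. holds the DX11 API shader objects
{
    // ID3D11VertexShader*, ID3D11PixelShader*, ID3D11GeometryShader*, defines, ...
};

class IShaderMgr                                  // 3./4. stores packs, current index, virtual setter
{
public:
    virtual ~IShaderMgr() = default;
    virtual void SetShaderPack(AssetHandle handle) = 0;
};

class DX11ShaderMgr : public IShaderMgr           // 5. binds VS/PS etc., handles input layouts
{
public:
    void SetShaderPack(AssetHandle handle) override { /* bind to device here */ }
};

class Renderer                                    // 6./7. the frontend only sees this
{
public:
    void SetShaderPack(AssetHandle handle) { mShaderMgr->SetShaderPack(handle); }
private:
    IShaderMgr* mShaderMgr = nullptr;             // new DX11ShaderMgr when DX11 is used
};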

 

What do you think about this approach and/or how would you do it differently?

(btw, I've already implemented steps 1 and 2, working like a charm)


Sounds reasonable. I have a few notes, but they are pretty minor suggestions, and could mostly be left as 'later, if needed' changes.

 

- What does 'current shader index' mean? If you intend to thread things later, be careful about global state. If this is just internal to the renderer -- the current shader while recording its internal D3D11 cmd list -- that's fine.

 

- If you don't need rendering backends to be swappable at runtime, you can bypass virtuals. Your different shader backends can just each implement a single non-virtual SetShaderPack, and only the one that a given implementation needs gets compiled in, either by using ifdef or putting them in separate files and only compiling the one needed.
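As an illustration of that, a minimal sketch of compile-time backend selection without virtuals (the RENDERER_DX11 define and the class names here are made up):

#include <cstdint>

#if defined(RENDERER_DX11)                        // hypothetical build-time define
struct DX11ShaderMgr
{
    void SetShaderPack(std::uint32_t /*handle*/) { /* bind D3D11 VS/PS/IL here */ }
};
using ShaderMgr = DX11ShaderMgr;                  // resolved at compile time, no vtable
#else
struct NullShaderMgr
{
    void SetShaderPack(std::uint32_t /*handle*/) { /* no-op fallback backend */ }
};
using ShaderMgr = NullShaderMgr;
#endif

// The renderer then holds a ShaderMgr by value and every call is non-virtual.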

 

- Whether shader stages need to be tightly coupled depends on the backend and which stages are in use. It might be a good idea to do some kind of de-duplication (and avoid unnecessary platform BindShader calls) for shader packs that share some stages; for example, you might use the same vertex shader with many different pixel shaders.
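A minimal sketch of what that de-duplication could look like at the D3D11 level (the class and member names are invented, not from the post):

#include <d3d11.h>

class ShaderBinder
{
public:
    void Bind(ID3D11DeviceContext* ctx, ID3D11VertexShader* vs, ID3D11PixelShader* ps)
    {
        if (vs != mCurrentVS)                     // shared VS across packs -> skip the rebind
        {
            ctx->VSSetShader(vs, nullptr, 0);
            mCurrentVS = vs;
        }
        if (ps != mCurrentPS)
        {
            ctx->PSSetShader(ps, nullptr, 0);
            mCurrentPS = ps;
        }
    }
private:
    ID3D11VertexShader* mCurrentVS = nullptr;
    ID3D11PixelShader*  mCurrentPS = nullptr;
};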

Edited by ShaneYCG


At the low level you've got individual shader programs (or packs of programs as you call them), but usually at a higher level, engines will have some kind of larger / more abstract shader object.

e.g. FX/CgFX has 'Effects' (high level shader pack), which have 'Techniques' (for different purposes - transparent/forward opaque/deferred gbuffer/shadow map/etc), which have 'Passes' (your low level shader packs).

Or Unity has 'Shaders' with 'SubShaders' with 'Passes'.

 

In my engine I have Techniques (high level shader pack), which have Passes (for different purposes), which have Permutations (defines on/off).

The user of the API only binds a technique. The pass is chosen by a different object that also holds the render-target pointers. The user can also set "shader options", which are used to automatically select the appropriate permutation from a pass internally.
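A rough sketch of that layering in data-structure form (names invented for illustration, not the actual engine types):

#include <cstdint>
#include <string>
#include <vector>

struct ShaderPermutation            // one concrete program set, for one combination of #defines
{
    std::uint64_t optionBits = 0;   // which "shader options" this permutation was built with
    // API-specific VS/PS/etc. objects live behind this, in the backend
};

struct ShaderPass                   // one purpose: gbuffer / shadow map / forward transparent / ...
{
    std::string name;
    std::vector<ShaderPermutation> permutations;
};

struct ShaderTechnique              // what the user actually binds
{
    std::vector<ShaderPass> passes;
};

// The user binds a ShaderTechnique; the object holding the render targets picks the
// ShaderPass, and the current shader options pick the ShaderPermutation internally.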

 

If you go with your low-level 'packs' instead of high level ones, then in my experience, at some point you'll have to build a high level shader system on top of it.

 

I personally prefer this system to be a part of the core rendering API rather than built on top of it, because it often ends up cleaner for the user. In one engine that I've used which didn't do this cleanly, we ended up with routines that would: loop over every object, swap its shader for a shadow map shader while remembering the original, render the shadow map pass, loop over every object restoring its original shader, render the gbuffer pass, etc...

 

 

I talked to you a bit on the chat about input layouts - there are two main approaches:

* hard code the way that vertex attributes are stored in memory. Each VS then is paired with one IL that maps the VS inputs to that storage structure.

* support multiple different vertex structures. Each VS is then paired with a collection of IL's, one for each vertex structure that's compatible with the VS inputs.

 

For a high level game renderer, the first option is perfectly fine. For a flexible rendering API that can be used to implement any kind of effect/pipeline, then the second one becomes more important.
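In code, the difference between the two approaches looks roughly like this (types and member names are illustrative only):

#include <cstdint>
#include <unordered_map>
#include <d3d11.h>

// Option 1: one hard-coded vertex structure -> each VS owns exactly one IL.
struct VsWithSingleLayout
{
    ID3D11VertexShader* vs = nullptr;
    ID3D11InputLayout*  inputLayout = nullptr;    // created once for the fixed layout
};

// Option 2: multiple vertex structures -> one IL per structure compatible with the VS inputs.
struct VsWithLayoutSet
{
    ID3D11VertexShader* vs = nullptr;
    // key = ID/hash of the vertex structure, value = IL matching it to this VS's inputs
    std::unordered_map<std::uint32_t, ID3D11InputLayout*> inputLayouts;
};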


Thanks both.
In the meantime I've implemented most of this. The input layouts are a pain in the *ss though :(

The universal struct for input Vertex attributes works fine and lets me reconstruct the d3d11 element descs easily, which helps with the abstraction.

Regarding the inputlayout ID I've found a solution too. I store an MD5 checksum of the input attributes in the IShaderMgr, which in the case of dx11 is the checksum of the d3d11 element descs. This I can simply compare to the result of GetVSInputChecksum of the IShaderPack (which in the case of DX11ShaderPack is also the checksum of the d3d11 element descs).

What's left is that I still need to pass the VS shaderblob to be able to create the inputlayout, but this is too dx11 specific. I've read one can create a dummy VS when creating the inputlayout, using the VS input I need. Do you have an example of how I could do that?

Last but not least, comparing 16 char strings to select the inputlayout sounds less efficient. Would it be possible to convert these strings to a uint somehow? (Without losing the unique identification, i.e. abcdef shouldn't give the same value as fedcba.)

Edited by cozzie


which in case of dx11 is the checksum of the d3d11 element descs

 
One pitfall to look out for here -- the standard way to initialize a structure in C++ is with:

FOO_DESC desc = {}; //initialize to zero efficiently

desc.bar = 42;

But this won't necessarily zero out any padding bytes within the structure. If you're going to be hashing these structs, you need to ensure the padding bytes hold consistent values. So you have to use the heavyweight version:

FOO_DESC desc;

memset( &desc, 0, sizeof(desc) ); // or: SecureZeroMemory( &desc, sizeof(desc) );

desc.bar = 42;
 

I've read one can create a dummy VS when creating the inputlayout, using the VS input I need. Do you have an example on how I could do that?

For every particular set of vertex input attributes used by a shader, my toolchain spits out a dummy hlsl file like the one below, which is compiled into a dummy shader binary to be used during IL creation at runtime:

// Hash: A54D7D16
struct colored2LightmappedVertex
{
  float3 position : POSITION0;
  float4 color : COLOR0;
  float4 color2 : COLOR1;
  float2 texcoord : TEXCOORD0;
  float2 texcoord2 : TEXCOORD1;
  float3 normal : NORMAL0;
  float3 tangent : TANGENT0;
};
float4 vs_test_colored2LightmappedVertex( colored2LightmappedVertex inputs ) : SV_POSITION
{
  float4 hax = (float4)0;
  hax += (float4)(float)inputs.position;
  hax += (float4)(float)inputs.color;
  hax += (float4)(float)inputs.color2;
  hax += (float4)(float)inputs.texcoord;
  hax += (float4)(float)inputs.texcoord2;
  hax += (float4)(float)inputs.normal;
  hax += (float4)(float)inputs.tangent;
  return hax;
}

The 'hax' variable just makes sure that the HLSL compiler doesn't optimize out any of the input variables.

 

As long as the actual VS used alongside the IL does actually have inputs that match the dummy shader, then everything works fine. If your dummy shader inputs and actual shader inputs differ, you get undefined behavior. 
 

comparing 16 char strings to select the inputlayout sounds less efficient. Would it be possible to convert these strings to a uint somehow?

Well 16 chars is 4 uints :)
You should use a different hash function than MD5. MD5 was designed as a cryptographic hash for detecting file tampering -- you don't need something that strong. I currently use FNV32a for most things like this, but check out this overview of many choices: http://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/
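For reference, a minimal FNV-1a (32-bit) sketch, which could hash the element desc bytes directly into a single uint instead of producing an MD5 string (the function name is made up):

#include <cstddef>
#include <cstdint>

std::uint32_t Fnv1a32(const void* data, std::size_t numBytes)
{
    const std::uint8_t* bytes = static_cast<const std::uint8_t*>(data);
    std::uint32_t hash = 2166136261u;             // FNV offset basis
    for (std::size_t i = 0; i < numBytes; ++i)
    {
        hash ^= bytes[i];
        hash *= 16777619u;                        // FNV prime
    }
    return hash;
}

// e.g. hash the (consistently zeroed) element descs into one number:
// std::uint32_t id = Fnv1a32(descs.data(), descs.size() * sizeof(D3D11_INPUT_ELEMENT_DESC));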


Thanks, this helps a lot.

Maybe a stupid question, but do you just create a temporary ASCII HLSL file? (using C++ std IO libraries).

I assume you remove the file after the IL is created.

 

I'll also look into the hash functions. In the end I would hope to have something other than a string, because a string compare will always be slower than comparing a number (disclaimer: this is an assumption, not profiled :))


Maybe a stupid question, but do you just create a temporary ASCII HLSL file? (using C++ std IO libraries). I assume you remove the file after the IL is created.

I do this as part of my toolchain, not the engine, so yeah I write a hlsl file to disc from the C# tool code and then launch an FXC.exe process to compile it into a bytecode file.
 
If you're doing this at runtime, there's likely no need to touch the disc. You can use the API to compile HLSL from memory, to an in-memory bytecode blob IIRC. You also can just use any of your real vertex shaders that happen to use the particular vertex structure that's appropriate. This fake VS code idea just lets you create all your IL's up front, without any dependencies on the shader system / shader loading.
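If it helps, a rough sketch of compiling from an in-memory HLSL string with D3DCompile (error handling trimmed; the helper name is made up; link against d3dcompiler.lib):

#include <cstddef>
#include <cstdio>
#include <d3dcompiler.h>
#include <wrl/client.h>

Microsoft::WRL::ComPtr<ID3DBlob> CompileVSFromMemory(const char* source, std::size_t sourceLen,
                                                     const char* entryPoint)
{
    Microsoft::WRL::ComPtr<ID3DBlob> bytecode, errors;
    HRESULT hr = D3DCompile(source, sourceLen,
                            nullptr,              // optional source name, only used in error messages
                            nullptr, nullptr,     // no defines, no include handler
                            entryPoint, "vs_5_0", // target profile
                            0, 0,
                            &bytecode, &errors);
    if (FAILED(hr))
    {
        if (errors)
            std::fprintf(stderr, "%s\n", static_cast<const char*>(errors->GetBufferPointer()));
        return nullptr;
    }
    return bytecode;                              // feed this blob to CreateInputLayout
}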

Edited by Hodgman


With some bumps and reviewing on the chat, I've managed to solve it.

The solution follows the following principles:

- the IShaderPack class no longer has DX11-specific stuff in it

-- thus, the DX11ShaderManager, inherited from IShaderMgr, contains IShaderPacks

- the IShaderPack no longer exposes a void* for the VS shader blob

-- instead it has 3 new generic const getters: for the VS filename, VS entrypoint and shader target

- both the DX11 elementDesc's and generic VtxAttributes are no longer stored in a class (just temporaries)

-- I even got rid of the generic VtxAttributes struct completely (because I don't need them for anything else yet)

 

To be able to achieve this, I've implemented some DX11 helper functions (pseudo code, because I'm at work :)):

std::vector<D3D11_INPUT_ELEMENT_DESC> CreateDX11ElementDescs(ID3DBlob *pVsShaderBlob);
ID3D11InputLayout* CreateDX11InputLayout(const std::string &pVSFilename, const std::string &pVSEntryPoint, const std::string &pShaderTarget);
std::string CreateDX11VSInputChecksum(ID3DBlob *pVsShaderBlob);
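For what it's worth, here is a hedged sketch of how such helpers could be implemented via shader reflection on the VS blob. This is my guess at the approach, not the actual code: it folds the element-desc creation and IL creation into one function so the reflected semantic-name strings stay valid, and it only handles float inputs in slot 0.

#include <cstring>
#include <d3d11.h>
#include <d3dcompiler.h>
#include <vector>
#include <wrl/client.h>

ID3D11InputLayout* CreateInputLayoutFromBlob(ID3D11Device* device, ID3DBlob* pVsShaderBlob)
{
    Microsoft::WRL::ComPtr<ID3D11ShaderReflection> reflector;
    if (FAILED(D3DReflect(pVsShaderBlob->GetBufferPointer(), pVsShaderBlob->GetBufferSize(),
                          IID_PPV_ARGS(&reflector))))
        return nullptr;

    D3D11_SHADER_DESC shaderDesc = {};
    reflector->GetDesc(&shaderDesc);

    std::vector<D3D11_INPUT_ELEMENT_DESC> descs;
    for (UINT i = 0; i < shaderDesc.InputParameters; ++i)
    {
        D3D11_SIGNATURE_PARAMETER_DESC param = {};
        reflector->GetInputParameterDesc(i, &param);

        D3D11_INPUT_ELEMENT_DESC elem;
        std::memset(&elem, 0, sizeof(elem));      // zero padding too, so hashing the descs stays consistent
        elem.SemanticName      = param.SemanticName;
        elem.SemanticIndex     = param.SemanticIndex;
        elem.AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
        elem.InputSlotClass    = D3D11_INPUT_PER_VERTEX_DATA;

        if (param.ComponentType == D3D_REGISTER_COMPONENT_FLOAT32)   // float inputs only here
        {
            if (param.Mask == 0x1)      elem.Format = DXGI_FORMAT_R32_FLOAT;
            else if (param.Mask <= 0x3) elem.Format = DXGI_FORMAT_R32G32_FLOAT;
            else if (param.Mask <= 0x7) elem.Format = DXGI_FORMAT_R32G32B32_FLOAT;
            else                        elem.Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
        }
        descs.push_back(elem);
    }

    ID3D11InputLayout* inputLayout = nullptr;
    device->CreateInputLayout(descs.data(), static_cast<UINT>(descs.size()),
                              pVsShaderBlob->GetBufferPointer(), pVsShaderBlob->GetBufferSize(),
                              &inputLayout);
    return inputLayout;                           // caller owns/releases the IL
}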

What I basically do now:

 

- For each shader:

-- Compile and create the shader

-- within scope directly create the VS input checksum (hash over the DX11 descs, which are created in the helper using the other helper).

This is easy/ convenient because here I already have the VS shaderblob around.

 

- Find unique inputlayouts by iterating over the checksums

 

- For each unique inputlayout

-- call helper CreateDX11InputLayout, i.e.

recompile shader using vsfilename, entrypoint and target

-- create temporary/ local scope element descs using the DX11 helper

-- Create DX11 IL using shaderblob and DX11 descs in local scope

 

I'm actually quite happy with how the solution grew into what it is now.

 

Another/ last change I'm gonna do here is get rid of the mCurrentInputLayoutId within the IShaderPack, because it's too DX11'ish :) I plan to do this by changing the SetShaderPack function of the manager: instead of calling GetInputLayoutId on the IShaderPack, it will simply compare the GetVSInputChecksum with the checksums that are stored in a vector within the manager. This is currently a bit expensive (MD5 checksums), but in combination with moving to xxHash32 this should work fine (I think, because then it's no longer a string compare of 16 chars but a number comparison).
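Roughly along these lines, assuming the MD5 string is replaced by a 32-bit hash (all names here are illustrative, not the actual code):

#include <cstddef>
#include <cstdint>
#include <d3d11.h>
#include <vector>

class InputLayoutCache
{
public:
    void Select(ID3D11DeviceContext* ctx, std::uint32_t vsInputChecksum) const
    {
        for (std::size_t i = 0; i < mChecksums.size(); ++i)
        {
            if (mChecksums[i] == vsInputChecksum)     // plain number compare, no string compare
            {
                ctx->IASetInputLayout(mInputLayouts[i]);
                return;
            }
        }
        // no matching IL found: a real implementation would assert/log here
    }
private:
    std::vector<std::uint32_t>      mChecksums;       // filled once, parallel to mInputLayouts
    std::vector<ID3D11InputLayout*> mInputLayouts;
};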

 

@Hodgman/others: any last thoughts/ remarks?

 

(Ps.; next step is constant buffers :cool:)

Edited by cozzie
