
#5165529 How to use external variables in HLSL?

Posted by AvengerDr on 08 July 2014 - 06:15 AM

If you have time to spare you could also go the totally overkill route, as I did. I wrote a shader generation tool that allows me to define shaders as a tree of "nodes". For example, here is a Gaussian blur shader.

DeclarationNode nColor = DeclarationNode.InitNode("color", Shaders.Type.Float4, 0, 0, 0, 0);
ArrayNode fOffsetWeight = new ArrayNode { Input = fOffsetsAndWeights, Index = "i" };

AdditionNode nBlur = new AdditionNode
{
    PreCondition = new ForBlock
    {
        PreCondition = nColor,
        StartCondition = new ScalarNode { Value = 0 },
        EndCondition = new ScalarNode { Value = 15 }
    },
    OpensBlock = true,
    Input1 = nColor,
    Input2 = new MultiplyNode
    {
        Input1 = new TextureSampleNode
        {
            Texture = tDiffuse,
            Sampler = sLinear,
            Coordinates = new AdditionNode
            {
                Input1 = new ReferenceNode { Value = InputStruct[Param.SemanticVariables.Texture] },
                Input2 = new SwizzleNode { Input = fOffsetWeight, Swizzle = new[] { Swizzle.X, Swizzle.Y } }
            }
        },
        Input2 = new SwizzleNode { Input = fOffsetWeight, Swizzle = new[] { Swizzle.Z } }
    },
    ClosesBlock = true,
    IsVerbose = true,
    Declare = false,
    AssignToInput1 = true,
    Output = nColor.Output
};

OutputStruct = Struct.PixelShaderOutput;
Result = new PSOutputNode
{
    FinalColor = nBlur,
    Output = OutputStruct
};

And here is the resulting shader code (cbuffer declarations omitted):

PSOut GaussianBlurPS(VSOut input) : SV_Target
{
    float4 color = float4(0, 0, 0, 0);
    for (int i = 0; i < 15; i++)
        color += tDiffuseMap.Sample(sLinear, input.Texture + fOffsetsAndWeights[i].xy) * fOffsetsAndWeights[i].z;
    PSOut output;
    output.Color = color;
    return output;
}

The issue is that shaders need to be declared as combinations of nodes. There's no graphical editor as of yet (someday!). On the other hand, this allows me to tailor it to the necessities of my engine, since I can annotate every variable with the corresponding engine references. For example, if a cbuffer requires a World matrix, the corresponding variable is tagged with that reference. So when the shader initializer system encounters that reference, it automatically applies the correct data without needing to initialize each shader on an ad hoc basis.
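The tagging idea above can be sketched roughly like this. All names here (EngineReference, ShaderVariable, ShaderInitializer) are hypothetical stand-ins, not the actual API of my tool:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a variable carries an engine-reference tag, and the
// engine registers one value provider per reference, so tagged variables are
// resolved automatically on shader load instead of per shader by hand.
public enum EngineReference { WorldMatrix, ViewMatrix, ProjectionMatrix, CameraPosition }

public class ShaderVariable
{
    public string Name;
    public EngineReference Reference; // the annotation mentioned above
    public object Value;
}

public class ShaderInitializer
{
    readonly Dictionary<EngineReference, Func<object>> providers =
        new Dictionary<EngineReference, Func<object>>();

    public void Register(EngineReference reference, Func<object> provider)
        => providers[reference] = provider;

    // Resolve every tagged variable through its registered provider.
    public void Apply(IEnumerable<ShaderVariable> variables)
    {
        foreach (var v in variables)
            v.Value = providers[v.Reference]();
    }
}
```

The point of the design is that the shader file carries only the tag; which concrete value satisfies it is decided once, engine-side.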

Further, I can have "function" nodes, such as one representing your "DoBlur" node (either as a combination of other nodes or as plain-text methods). When the shader is generated it will contain only the code strictly necessary for that shader to work (and nothing else), without messy includes and whatnot. But wait, there's more! The generated shaders are part of a "ShaderCollection" structure that holds various technique mappings. For example, technique A might use SM4 while technique B might use SM5, so a ShaderCollection object can hold different versions of the same shader type, making the choice of the correct version simpler. AND it also comes with shader preview functionality (which is still experimental).


I just noticed that I forgot to add it to my GitHub repository, but it will be there when I get back home.

#5104358 Yet another shader generation approach

Posted by AvengerDr on 25 October 2013 - 08:27 AM

In the past few months I found myself juggling between different projects aimed at several different platforms (Windows 7, 8, RT and Phone). Some of those have different capabilities, so some shaders needed to be modified in order to work correctly. I know that premature optimization is bad, but in this specific situation I thought that addressing the problem sooner rather than later would be the right choice.
To address this problem I created a little tool that allows me to dynamically generate a set of shaders through a graph-like structure. This is nothing new, as a graph is usually the basis for this kind of application. I probably reinvented a lot of wheels, but since I couldn't use MS's shader designer (it only works with C++, I think) nor Unity's equivalent (as I have my own puny little engine), I decided to roll my own. I am writing here to get some feedback on my architecture and to find out whether there is something I overlooked.
Basically I have defined classes for most of the HLSL language. Then there are nodes such as constants, math operations and special function nodes. The latter are the most important ones, as they correspond to high-level functions such as Phong lighting, shadow algorithms and so on. Each of these function nodes exposes several switches that let me enable or disable specific features. For example, if I set a Phong node's "Shadows" switch to true, it will generate a different signature for the function than if it were set to false. Once the structure is complete, the graph is traversed and the actual shader code is generated line by line. From my understanding, dynamic shader linking works similarly, but I've not been able to find much information on the subject.
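The traversal idea can be sketched in miniature. This is not my tool's actual node API, just a hypothetical illustration of nodes emitting HLSL text recursively:

```csharp
using System.Globalization;

// Hypothetical sketch: each node knows how to emit its own fragment of HLSL,
// and emitting the root walks the expression tree depth-first.
public abstract class Node
{
    public abstract string Emit();
}

public class ScalarNode : Node
{
    public float Value;
    // InvariantCulture so the generated code always uses '.' as decimal separator.
    public override string Emit() => Value.ToString("0.0###", CultureInfo.InvariantCulture);
}

public class AddNode : Node
{
    public Node Left, Right;
    public override string Emit() => $"({Left.Emit()} + {Right.Emit()})";
}

public class AssignmentNode : Node
{
    public string Target;
    public Node Expression;
    public override string Emit() => $"{Target} = {Expression.Emit()};";
}
```

A real generator also tracks declarations, semantics and indentation, but the emit-by-recursion core is the same.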
Right now shaders can only be defined in code; in the future I could build a graphical editor. A classic Phong lighting pixel shader looks like this and this is the generated output. It is also possible to configure the amount of "verbosity". The interesting thing is that once the shader is compiled, it gets serialized to a custom format that contains other information. Variables and nodes that are part of the shader are decorated with engine references. If I add a reference to the camera position, for example, that variable tells the engine that it has to look for that value when initialising the shader. The same goes for the values needed to assemble constant buffers (like world/view/projection matrices).

Once the shader is serialised, all this metadata helps the engine automatically assign each shader variable or cbuffer the right values. Before, each shader class in my engine had huge chunks of code that fetched the needed values from somewhere else in the engine. Now all of that has been deleted and is taken care of automatically, as long as the shaders are loaded in this format.

Another neat feature is that within the tool I have built I can define different techniques, e.g. a regular Phong shader, one using a diffuse map, one using a shadow map. Each technique maps a different combination of vertex and pixel shaders. The decoration I mentioned earlier helps the tool generate a "TechniqueKey" for each shader, which is then used by the engine to fetch the right shader from the file on disk. For example, the PhongDiffusePS shader is decorated with attributes declaring its use of a DiffuseMap (among other things). When I enable the DiffuseMap feature in the actual application, the shader system checks whether that feature is supported by the current set of shaders assigned to the material. If a suitable technique is found, the system enables the relevant parameters. In this way it is also possible to check for different feature levels and so on.
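The TechniqueKey matching can be pictured as a bit-flag containment test. Again, ShaderFeatures and TechniqueKey here are hypothetical names for illustration, not my tool's real types:

```csharp
using System;

// Hypothetical sketch: each feature is one bit; a technique supports a
// request if its key contains every requested bit.
[Flags]
public enum ShaderFeatures
{
    None       = 0,
    DiffuseMap = 1 << 0,
    ShadowMap  = 1 << 1,
    NormalMap  = 1 << 2
}

public class TechniqueKey
{
    public ShaderFeatures Features;

    // True when all bits of 'requested' are present in this technique's key.
    public bool Supports(ShaderFeatures requested) => (Features & requested) == requested;
}
```

The same containment test extends naturally to feature levels: encode the shader model as a few bits and require the device's level to be at least the technique's.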

Probably something like this is overkill for a lot of small projects, and I reckon it is not as easy to fix something in the generated code of this tool as it is when making changes in the actual source code itself. But once it does finally work, the automatic configuration of shader variables is something that I like (at least compared to my previous implementation; I don't know how everyone else handles that). What I am asking is: how extensible is this approach (maybe it is too late to ask this kind of question)? Right now I have a handful of shaders defined in the system. If you had a look at the code, what kind of problems am I likely to run into when adding nodes to deal with geometry shaders and other advanced features?

Finally, if anyone could be interested in having a look at the tool I'm happy to share it on GitHub.

#5077315 Should game objects render themselves, or should an object manager render them?

Posted by AvengerDr on 13 July 2013 - 04:56 AM

My "engine" uses different object representations: objects are added to the world using a scene graph, but that graph is not used for rendering, as it would not be the most efficient way. Rather, after the scene is complete, a "SceneManager" examines the graph and computes the most efficient way to render it. As has been said, objects are grouped according to materials, geometry used, rendering order and other properties. This scene manager returns a list of "commands" that the rendering loop executes. Commands can be of various types, e.g. generate a shadow map, activate blending, render objects and so on.


Another thing that I've been doing is separating the object class from the geometry class. In my engine, the object represents the high-level properties of a mesh, such as its local position, rotation, etc. (local because the absolute values are obtained from the scene graph), whereas the geometry class contains the actual vertex/index buffers. There is only one geometry instance for each unique 3D object in the world.


This helps further improve rendering efficiency. After the objects have been grouped by material, I further group each of these sets by the geometry used. Then, for each material/geometry pair, I issue a "render command" to draw all the objects that use the same material and reference geometry. This way there is only one set-VB/IB command per group. It also helps with hardware instancing: if a material supports it, I just use the list of object instances to compute an instance buffer.
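The grouping step above can be sketched with LINQ. SceneObject, RenderCommand and SceneCompiler are hypothetical names for illustration, not my engine's actual classes:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: group scene objects by (material, geometry) and emit
// one render command per group, so state changes happen once per group.
public class SceneObject
{
    public string Material;
    public string Geometry;
}

public class RenderCommand
{
    public string Material;
    public string Geometry;
    public List<SceneObject> Instances; // candidates for one instanced draw
}

public static class SceneCompiler
{
    public static List<RenderCommand> Compile(IEnumerable<SceneObject> objects) =>
        objects.GroupBy(o => new { o.Material, o.Geometry })
               .Select(g => new RenderCommand
               {
                   Material  = g.Key.Material,
                   Geometry  = g.Key.Geometry,
                   Instances = g.ToList() // one SetVB/IB per group
               })
               .ToList();
}
```

Each resulting command binds its material and buffers once, then either draws every instance or builds an instance buffer from `Instances`.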

#5075673 DX11 - Instancing - The choice of method

Posted by AvengerDr on 06 July 2013 - 03:57 AM

In general you take a single geometry object (i.e. a vertex buffer and possibly an index buffer) and replicate it as needed. Where, and how many times, to replicate it is determined by the instance buffer: here you put, for example, the world matrix of each bullet in the world. In your shader you multiply your model's vertices by the particular instance's world matrix to determine where in the world that particular instance is. It's as if you were using the geometry you want to instance as a stamp; the instance buffer then contains the locations where you need to stamp :) (and other properties, like the colour, for example).


If the number of instances changes, one approach is to recompute all of it and re-bind it. If your geometry is mostly static, another approach is to create a "large enough" buffer: you can bind it even if it is not full, and when the need arises you simply add more entries to it, without having to recreate the existing ones. This also sets some sort of limit for your rendering engine, as it's like saying you can render up to X instances.
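The "large enough buffer" bookkeeping looks roughly like this on the CPU side. This is a hypothetical sketch (the GPU buffer recreation itself is not shown, and InstanceBuffer is an illustrative name, not a SlimDX type):

```csharp
using System;

// Hypothetical sketch: keep a CPU-side array with spare capacity and append
// instances to it; only when capacity is exceeded would the GPU buffer need
// to be recreated at the larger size.
public class InstanceBuffer<T>
{
    T[] items;
    public int Count { get; private set; }
    public int Capacity => items.Length;
    public bool NeedsRecreate { get; private set; } // GPU buffer must be rebuilt

    public InstanceBuffer(int capacity) { items = new T[capacity]; }

    public void Add(T instance)
    {
        if (Count == items.Length)
        {
            // Grow by doubling; this is the expensive path you size the
            // initial capacity to avoid.
            Array.Resize(ref items, items.Length * 2);
            NeedsRecreate = true;
        }
        items[Count++] = instance;
    }
}
```

When only the contents change (positions each frame) but the count stays under capacity, you just rewrite the mapped GPU buffer and draw `Count` instances.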

In the specific case of spaceships, though, I think you'd need to update them each frame. BUT! If you go for the hard-science approach, you don't need to worry about any of this at all: beam weapons are supposed to travel at the speed of light, so... problem solved!

#5026527 Real time + pause Vs Turn Based for a Space tactical game on smartphones

Posted by AvengerDr on 28 January 2013 - 03:31 PM

Well, casual gamers are obviously out of the question. Ideally anyone who's not going to be put off by the game's lack of awesome 3D graphics.


If I had the resources, I would totally go for something like Homeworld. As things stand now, the prospect of having a professional artist create the graphics is outside my reach at the moment. That's why I'd like to stick to a minimalistic approach, similar to Introversion's Defcon game. First, it's relatively easy to draw iconic symbols for ships and the like, second it supports the idea of the player being in some sort of "situation room" rather than in the fighter's cockpit.

#4861723 [SlimDX] Compatibility with NVIDIA 3D Vision?

Posted by AvengerDr on 14 September 2011 - 02:46 PM

I've been experimenting with 3D Vision myself. If you're going to use 3D Vision Automatic, you simply need to hit CTRL+T and put on your glasses. There's a way of enabling it automatically if you hook up some methods from NVAPI, but I've not yet attempted this, as currently there's no managed port and it only supports DX10.

Every DirectX app can theoretically be supported by 3D Vision Automatic. 3D Vision Automatic is not "true" stereoscopy: the driver itself takes care of duplicating the render calls from two different points of view. There's a hack that lets you control the stereoization process: the "NV_STEREO_IMAGE_SIGNATURE" method, which consists of combining the eye-specific images into a texture twice the width, writing a special value in the last row, and then presenting the result. This special value is picked up by the driver, which starts synchronizing the texture with the glasses. I've tested it with DX9 and DX10 and it definitely works.

#4860409 [SlimDX] [DX10/11] NV_STEREO_IMAGE_SIGNATURE and DirectX10/11

Posted by AvengerDr on 11 September 2011 - 12:42 PM

I finally got it to work!
Apparently, rendering the stereoized texture to a quad and then copying the result over to the backbuffer was the wrong approach. What needs to be done to emulate DX9's StretchRect is the following:

[source lang="csharp"]
MessagePump.Run(form, () =>
{
    device.ClearRenderTargetView(renderView, Color.Cyan);
    // Copy only the left-eye-sized region of the stereo texture to the back buffer
    ResourceRegion stereoSrcBox = new ResourceRegion
    {
        Front = 0, Back = 1,
        Top = 0, Bottom = size.Height,
        Left = 0, Right = size.Width
    };
    device.CopySubresourceRegion(stereoizedTexture, 0, stereoSrcBox, backBuffer, 0, 0, 0, 0);
    swapChain.Present(0, PresentFlags.None);
});
[/source]

The help states that ResourceRegion is "the source region", so I thought that if I specified 1920x1080 instead of 3840x1081 it would only grab the leftmost part of the image instead of the whole picture. I'm not entirely sure why it is working, but it definitely is: I tried erasing one part of the image, and if you block one eye you see a blank screen.

For those of you who are experimenting with nVidia's default stereo pictures: the eyes are swapped, so remember to also use the swap-eyes flag 0x00000001.
If anyone else needs some help, I'd be happy to provide it!

#4860291 [SlimDX] [DX10/11] NV_STEREO_IMAGE_SIGNATURE and DirectX10/11

Posted by AvengerDr on 11 September 2011 - 04:54 AM

Hi there,
a while ago I posted a thread asking for advice on how to control the stereoization process on the 3D Vision kit by nVidia. As some of you may be aware there is a low-level hack in which you render the left and right eye images manually, then you write a special value in the last row. Upon rendering this new scene, the nVidia driver picks the value up and activates the shutter glasses. I've been able to test it under DX9 so I know that it definitely can work but due to some methods not being available anymore in DX10/11 I've been unable to make it work.

The algorithm goes like this:
  • Render left eye image
  • Render right eye image
  • Create a texture able to contain them both PLUS an extra row (so the texture size would be 2 * width, height + 1)
  • Write this NV_STEREO_IMAGE_SIGNATURE value
  • Render this texture on the screen
My test code skips the first two steps, as I already have a stereo texture. It was formerly a .JPS file, specifically one of those included in the sample pictures that come with the nVidia 3D kit. Step number 5 uses a full screen quad and a shader to render the stereoized texture onto it through an ortho-projection matrix. The sample code I've seen for DX9 doesn't need this and simply calls the StretchRect(...) method to copy the texture back onto the backbuffer. So maybe it is for this reason that it's not working? Is there a similar method to accomplish this in DX10? I thought that rendering onto the backbuffer would theoretically be the same as copying (or StretchRecting) a texture onto it, but maybe it is not?

Here follows my code:
Stereoization procedure
[source lang="csharp"]
static Texture2D Make3D(Texture2D stereoTexture)
{
    // stereoTexture contains a stereo image with the left eye image on the left half
    // and the right eye image on the right half.
    // This staging texture will have an extra row to contain the stereo signature.
    Texture2DDescription stagingDesc = new Texture2DDescription()
    {
        ArraySize = 1,
        Width = 3840,
        Height = 1081,
        BindFlags = BindFlags.None,
        CpuAccessFlags = CpuAccessFlags.Write,
        Format = SlimDX.DXGI.Format.R8G8B8A8_UNorm,
        OptionFlags = ResourceOptionFlags.None,
        Usage = ResourceUsage.Staging,
        MipLevels = 1,
        SampleDescription = new SampleDescription(1, 0)
    };
    Texture2D staging = new Texture2D(device, stagingDesc);

    // Identify the source texture region to copy (all of it)
    ResourceRegion stereoSrcBox = new ResourceRegion
    {
        Front = 0, Back = 1, Top = 0, Bottom = 1080, Left = 0, Right = 3840
    };

    // Copy it to the staging texture
    device.CopySubresourceRegion(stereoTexture, 0, stereoSrcBox, staging, 0, 0, 0, 0);

    // Map the staging texture for writing
    DataRectangle box = staging.Map(0, MapMode.Write, SlimDX.Direct3D10.MapFlags.None);

    // Go to the last row
    box.Data.Seek(stereoTexture.Description.Width * stereoTexture.Description.Height * 4,
                  System.IO.SeekOrigin.Begin);

    // Write the NVSTEREO header
    box.Data.Write(data, 0, data.Length);
    staging.Unmap(0);

    // Create the final stereoized texture
    Texture2DDescription finalDesc = new Texture2DDescription()
    {
        ArraySize = 1,
        Width = 3840,
        Height = 1081,
        BindFlags = BindFlags.ShaderResource,
        CpuAccessFlags = CpuAccessFlags.Write,
        Format = SlimDX.DXGI.Format.R8G8B8A8_UNorm,
        OptionFlags = ResourceOptionFlags.None,
        Usage = ResourceUsage.Dynamic,
        MipLevels = 1,
        SampleDescription = new SampleDescription(1, 0)
    };

    // Copy the staging texture to a new texture usable as a shader resource
    Texture2D final = new Texture2D(device, finalDesc);
    device.CopyResource(staging, final);
    staging.Dispose();
    return final;
}
[/source]
[source lang="csharp"]
// The NVSTEREO header.
static byte[] data = new byte[]
{
    0x4e, 0x56, 0x33, 0x44, // NVSTEREO_IMAGE_SIGNATURE = 0x4433564e
    0x00, 0x0F, 0x00, 0x00, // Screen width * 2 = 1920 * 2 = 3840 = 0x00000F00
    0x38, 0x04, 0x00, 0x00, // Screen height = 1080 = 0x00000438
    0x20, 0x00, 0x00, 0x00, // dwBPP = 32 = 0x00000020
    0x02, 0x00, 0x00, 0x00  // dwFlags = SIH_SCALE_TO_FIT = 0x00000002
};
[/source]
[source lang="csharp"]
private static Device device;

[STAThread]
static void Main()
{
    // Device creation
    var form = new RenderForm("Stereo test") { ClientSize = new Size(1920, 1080) };
    var desc = new SwapChainDescription()
    {
        BufferCount = 1,
        ModeDescription = new ModeDescription(1920, 1080, new Rational(120, 1), Format.R8G8B8A8_UNorm),
        IsWindowed = true,
        OutputHandle = form.Handle,
        SampleDescription = new SampleDescription(1, 0),
        SwapEffect = SwapEffect.Discard,
        Usage = Usage.RenderTargetOutput
    };
    SwapChain swapChain;
    Device.CreateWithSwapChain(null, DriverType.Hardware, DeviceCreationFlags.Debug, desc,
                               out device, out swapChain);

    // Stops Alt+Enter from causing fullscreen screwiness.
    Factory factory = swapChain.GetParent<Factory>();
    factory.SetWindowAssociation(form.Handle, WindowAssociationFlags.IgnoreAll);

    Texture2D backBuffer = Resource.FromSwapChain<Texture2D>(swapChain, 0);
    RenderTargetView renderView = new RenderTargetView(device, backBuffer);
    ImageLoadInformation info = new ImageLoadInformation()
    {
        BindFlags = BindFlags.None,
        CpuAccessFlags = CpuAccessFlags.Read,
        FilterFlags = FilterFlags.None,
        Format = SlimDX.DXGI.Format.R8G8B8A8_UNorm,
        MipFilterFlags = FilterFlags.None,
        OptionFlags = ResourceOptionFlags.None,
        Usage = ResourceUsage.Staging,
        MipLevels = 1
    };

    // Make texture 3D
    Texture2D sourceTexture = Texture2D.FromFile(device, "medusa.jpg", info);
    Texture2D stereoizedTexture = Make3D(sourceTexture);
    ShaderResourceView srv = new ShaderResourceView(device, stereoizedTexture);

    // Create a quad that fills the whole screen
    ushort[] idx;
    TexturedVertex[] quad = CreateTexturedQuad(Vector3.Zero, 1920, 1080, out idx);

    // Fill vertex and index buffers
    DataStream stream = new DataStream(4 * 24, true, true);
    stream.WriteRange(quad);
    stream.Position = 0;
    Buffer vertices = new SlimDX.Direct3D10.Buffer(device, stream, new BufferDescription()
    {
        BindFlags = BindFlags.VertexBuffer,
        CpuAccessFlags = CpuAccessFlags.None,
        OptionFlags = ResourceOptionFlags.None,
        SizeInBytes = 4 * 24,
        Usage = ResourceUsage.Default
    });
    stream.Close();

    stream = new DataStream(6 * sizeof(ushort), true, true);
    stream.WriteRange(idx);
    stream.Position = 0;
    Buffer indices = new SlimDX.Direct3D10.Buffer(device, stream, new BufferDescription()
    {
        BindFlags = BindFlags.IndexBuffer,
        CpuAccessFlags = CpuAccessFlags.None,
        OptionFlags = ResourceOptionFlags.None,
        SizeInBytes = 6 * sizeof(ushort),
        Usage = ResourceUsage.Default
    });

    // Create world, view and (ortho) projection matrices
    QuaternionCam qCam = new QuaternionCam();

    // Load effect from file. It is a basic effect that renders a full screen quad
    // through an ortho projection matrix.
    Effect effect = Effect.FromFile(device, "Texture.fx", "fx_4_0", ShaderFlags.Debug, EffectFlags.None);
    EffectTechnique technique = effect.GetTechniqueByIndex(0);
    EffectPass pass = technique.GetPassByIndex(0);
    InputLayout layout = new InputLayout(device, pass.Description.Signature, new[]
    {
        new InputElement("POSITION", 0, Format.R32G32B32A32_Float, 0, 0),
        new InputElement("TEXCOORD", 0, Format.R32G32_Float, 16, 0)
    });
    effect.GetVariableByName("mWorld").AsMatrix().SetMatrix(
        Matrix.Translation(Layout.OrthographicTransform(Vector2.Zero, 90, new Size(1920, 1080))));
    effect.GetVariableByName("mView").AsMatrix().SetMatrix(qCam.View);
    effect.GetVariableByName("mProjection").AsMatrix().SetMatrix(qCam.OrthoProjection);
    effect.GetVariableByName("tDiffuse").AsResource().SetResource(srv);

    // Set render target and viewport
    device.OutputMerger.SetTargets(renderView);
    device.Rasterizer.SetViewports(new Viewport(0, 0, form.ClientSize.Width, form.ClientSize.Height, 0.0f, 1.0f));

    // Create solid rasterizer state
    RasterizerStateDescription rDesc = new RasterizerStateDescription()
    {
        CullMode = CullMode.None,
        IsDepthClipEnabled = true,
        FillMode = FillMode.Solid,
        IsAntialiasedLineEnabled = true,
        IsFrontCounterclockwise = true,
        IsMultisampleEnabled = true
    };
    RasterizerState rState = RasterizerState.FromDescription(device, rDesc);
    device.Rasterizer.State = rState;

    // Main loop
    MessagePump.Run(form, () =>
    {
        device.ClearRenderTargetView(renderView, Color.Cyan);
        device.InputAssembler.SetInputLayout(layout);
        device.InputAssembler.SetPrimitiveTopology(PrimitiveTopology.TriangleList);
        device.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(vertices, 24, 0));
        device.InputAssembler.SetIndexBuffer(indices, Format.R16_UInt, 0);

        for (int i = 0; i < technique.Description.PassCount; ++i)
        {
            // Render the full screen quad
            pass.Apply();
            device.DrawIndexed(6, 0, 0);
        }
        swapChain.Present(0, PresentFlags.None);
    });

    // Dispose resources
    vertices.Dispose();
    layout.Dispose();
    effect.Dispose();
    renderView.Dispose();
    backBuffer.Dispose();
    device.Dispose();
    swapChain.Dispose();
    rState.Dispose();
    stereoizedTexture.Dispose();
    sourceTexture.Dispose();
    indices.Dispose();
    srv.Dispose();
}
[/source]

Thanks in advance!

#4026326 SlimDX -- A Prototype MDX Replacement Library

Posted by AvengerDr on 05 August 2007 - 01:16 PM

Since MDX has received the final nail in its coffin, I'm considering switching to SlimDX. Did you do any benchmarks comparing the FPS of code written with MDX to equivalent code written with SlimDX?