Search the Community

Showing results for tags 'DX11' in content posted in Graphics and GPU Programming.



Found 1361 results

  1. DX11 FFT on GPU

    Hi, I'm currently trying to write a shader which should compute a fast Fourier transform of some data, manipulate the transformed data, do an inverse FFT, and then display the result as vertex offsets and color. I use Unity3D and HLSL as the shader language. One of the main problems is that the data should not be passed from the CPU to the GPU every frame if possible. My original plan was to use a vertex shader and do the FFT there, but I can't figure out how to store changing data between shader calls/passes. I found a technique called ping-ponging, which seems to be based on writing to and exchanging render targets, but I couldn't find an example for HLSL in a vertex shader yet. I found https://social.msdn.microsoft.com/Forums/en-US/c79a3701-d028-41d9-ad74-a2b3b3958383/how-to-render-to-multiple-render-targets-in-hlsl?forum=xnaframework which seems to use COLOR0 and COLOR1 as such render targets. Is it even possible to do such calculations on the GPU only (and in this shader stage, since I need the result of the calculation to modify the vertex offsets there)? I also saw compute shaders used in similar projects (ocean wave simulation); do they really copy data between the CPU and GPU every frame? How does this ping-ponging / render-target switching technique work in HLSL? Have you seen an example of its usage? Any answer would be helpful. Thank you, appswert
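    A minimal sketch of the ping-pong idea in plain D3D11 C++ may help as a reference point. This is an assumption-laden outline, not code from the thread: rtvA/rtvB, srvA/srvB and DrawFullScreenQuad are hypothetical, and the two textures are assumed to be created with both D3D11_BIND_RENDER_TARGET and D3D11_BIND_SHADER_RESOURCE.

        // Two textures alternate roles every pass: one is read, the other is written,
        // so the intermediate FFT data never leaves the GPU.
        ID3D11RenderTargetView*   rtv[2] = { rtvA, rtvB };
        ID3D11ShaderResourceView* srv[2] = { srvA, srvB };
        int src = 0, dst = 1;
        for (int pass = 0; pass < numPasses; ++pass)
        {
            context->OMSetRenderTargets(1, &rtv[dst], nullptr);   // write target for this pass
            context->PSSetShaderResources(0, 1, &srv[src]);       // read the previous pass's result
            DrawFullScreenQuad(context);                          // hypothetical helper: draws one quad
            ID3D11ShaderResourceView* nullSRV = nullptr;          // unbind before the roles swap;
            context->PSSetShaderResources(0, 1, &nullSRV);        // a texture can't be SRV and RTV at once
            std::swap(src, dst);
        }
        // The final result (srv[src]) can then be sampled in a vertex shader with SampleLevel
        // to produce the vertex offsets.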
  2. Hi, just a simple question about compute shaders (CS 5.0, DX11). Should the atomic operations (InterlockedAdd in my case) work without any issues on a RWByteAddressBuffer and be globally coherent? I've come back from the CUDA world and committed a fairly simple kernel that does some work; the pseudo-code is as follows (both kernels use the same RWByteAddressBuffer). The first kernel does some work and sets Result[0] = 0 (using Result.Store(0, 0)). I've checked with the debugger, and indeed the value stored at dword 0 is 0. Now my second kernel:

    RWByteAddressBuffer Result;

    [numthreads(8, 8, 8)]
    void main()
    {
        for (int i = 0; i < 5; i++)
        {
            uint4 v0 = DoSomeCalculations1();
            uint4 v1 = DoSomeCalculations2();
            uint4 v2 = DoSomeCalculations3();
            if (v0.w == 0 && v1.w == 0 && v2.w)
                continue;

            // increment the counter by 3, and get its previous value;
            // this should basically allocate space for 3 uint4 values in the buffer
            uint prev;
            Result.InterlockedAdd(0, 3, prev);

            // this fills the buffer with 3 uint4 values
            // (+1 is here as the first 16 bytes are occupied by the DrawInstancedIndirect data)
            Result.Store4((prev+0+1)*16, v0);
            Result.Store4((prev+1+1)*16, v1);
            Result.Store4((prev+2+1)*16, v2);
        }
    }

    I invoke it with Dispatch(4,4,4), and then I use DrawInstancedIndirect to draw the buffer, but occasionally there is a missed triangle here and there for a frame, as if the atomic counter does not work as expected. Do I need any additional synchronization there? I've tried AllMemoryBarrierWithGroupSync at the end of the kernel, but without effect. If I do not use the atomic counter and instead just output empty vertices (which turn into degenerate triangles), then everything is OK - as if I'm missing some form of synchronization, but I do not see such a thing in DX11. I've tested on both old and new NVIDIA hardware (680M and 1080; the behaviour is the same).
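    For readers following along, this is roughly how such a buffer is usually created on the C++ side so that it can hold the DrawInstancedIndirect arguments, be written through a raw UAV with atomics, and then drive the indirect draw. This is a hedged sketch with assumed names and sizes (maxEntries, buffer, uav), not code from the post:

        D3D11_BUFFER_DESC bd = {};
        bd.ByteWidth = 16 + maxEntries * 16;                    // 16 bytes of indirect args + uint4 payload
        bd.Usage     = D3D11_USAGE_DEFAULT;
        bd.BindFlags = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
        bd.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS |  // required for RWByteAddressBuffer views
                       D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS;        // required for DrawInstancedIndirect
        device->CreateBuffer(&bd, nullptr, &buffer);

        D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
        uavDesc.Format             = DXGI_FORMAT_R32_TYPELESS;       // raw views must use R32_TYPELESS
        uavDesc.ViewDimension      = D3D11_UAV_DIMENSION_BUFFER;
        uavDesc.Buffer.NumElements = bd.ByteWidth / 4;               // one element = one 32-bit word
        uavDesc.Buffer.Flags       = D3D11_BUFFER_UAV_FLAG_RAW;
        device->CreateUnorderedAccessView(buffer, &uavDesc, &uav);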
  3. Hello, I am, like many others before me, making a displacement-map tessellator. I want to render some terrain using a quad, a texture containing height data, and the geometry shader / tessellator. So far, I've managed to utilize the texture in the pixel shader (I return different colors depending on the height). I have also managed to tessellate my surface, i.e. subdivide my quad into lots of triangles. What doesn't work, however, is the sampling step in the domain shader. I want to offset the vertices using the heightmap. I tried calling the same function textureMap.Sample(textureSampler, texcoord) as in the pixel shader but got compile errors. Instead I am now using the SampleLevel function to sample mip 0 of the input texture. But none of this seems to be working - I don't get anything except [0, 0, 0, 0] from my sampler. Below is some code: the working pixel shader, the broken domain shader where I want to sample, and the instantiation of the sampler states on the CPU side. Been stuck on this for a while! Any help would be much appreciated!

    Texture2D textureMap : register(t0);
    SamplerState textureSampler : register(s0);

    //Pixel shader
    float4 PS(PS_IN input) : SV_TARGET
    {
        float4 textureColor = textureMap.Sample(textureSampler, input.texcoord);
        return textureColor;
    }

    //Domain shader
    GS_IN DS(HS_CONSTANT_DATA input, float3 uvwCoord : SV_DomainLocation, const OutputPatch<DS_IN, 3> patch)
    {
        GS_IN output;
        float2 texcoord = uvwCoord.x * patch[0].texcoord.xy + uvwCoord.y * patch[1].texcoord.xy + uvwCoord.z * patch[2].texcoord.xy;
        float4 textureColor = textureMap.SampleLevel(textureSampler, texcoord.xy, 0);
        //fill and return output....
    }

    //Sampler
    SharpDX.Direct3D11.SamplerStateDescription samplerDescription;
    samplerDescription = SharpDX.Direct3D11.SamplerStateDescription.Default();
    samplerDescription.Filter = SharpDX.Direct3D11.Filter.MinMagMipLinear;
    samplerDescription.AddressU = SharpDX.Direct3D11.TextureAddressMode.Wrap;
    samplerDescription.AddressV = SharpDX.Direct3D11.TextureAddressMode.Wrap;
    this.samplerStateTextures = new SharpDX.Direct3D11.SamplerState(d3dDevice, samplerDescription);

    d3dDeviceContext.PixelShader.SetSampler(0, samplerStateTextures);
    d3dDeviceContext.VertexShader.SetSampler(0, samplerStateTextures);
    d3dDeviceContext.HullShader.SetSampler(0, samplerStateTextures);
    d3dDeviceContext.DomainShader.SetSampler(0, samplerStateTextures);
    d3dDeviceContext.GeometryShader.SetSampler(0, samplerStateTextures);
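    One detail the sampler-only snippet above doesn't show is whether the texture's shader resource view is also bound to the domain-shader stage. SampleLevel reads the SRV bound to the stage that executes it, so the texture has to be set on the domain shader as well, not only on the pixel shader. In native D3D11 terms that check looks roughly like the following hedged sketch (textureSRV and samplerState stand in for the corresponding SharpDX objects, whose per-stage wrappers expose equivalent setters):

        // Bind both the texture view and the sampler to the domain-shader stage,
        // in addition to the pixel-shader stage.
        context->DSSetShaderResources(0, 1, &textureSRV);
        context->DSSetSamplers(0, 1, &samplerState);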
  4. I just finished up the first iteration of my sprite renderer and I'm sort of questioning its performance. Currently, I am trying to render 10K 64x64 textured sprites in an 800x600 window. These sprites all use the same texture, vertex shader, and pixel shader; there are basically no state changes. The sprite renderer itself is dynamic, using D3D11_MAP_WRITE_NO_OVERWRITE and then D3D11_MAP_WRITE_DISCARD when the vertex buffer is full. The buffer is large enough to hold all 10K sprites and execute them in a single draw call. Cutting the buffer size down to only being able to fit 1000 sprites before a draw call is executed does not seem to matter / improve performance. When I clock the time it takes to complete the render method for my sprite renderer (the only renderer that is running) I'm getting about 40ms. Aside from trying to adjust the size of the vertex buffer, I have tried using a 1x1 texture and making the window smaller (640x480) as a quick and dirty check to see if the GPU was the bottleneck, but I still get 40ms in both of those cases. I'm kind of at a loss. What are some of the ways I could figure out where my bottleneck is? I feel like only being able to render 10K sprites is really low, but I'm not sure. I'm not sure if I coded a poor renderer and there is a bottleneck somewhere, or if I'm being limited by my hardware.

    Just some other info:

    Dev PC specs:
    GPU: Intel HD Graphics 4600 / Nvidia GTX 850M (Nvidia is set to be the preferred GPU in the Nvidia control panel. Vsync is set to off)
    CPU: Intel Core i7-4710HQ @ 2.5GHz

    Renderer:

    //The renderer has a working depth buffer
    //Sprites have matrices that are precomputed. These pretransformed vertices are placed into the buffer
    Matrix4 model = sprite->getModelMatrix();
    verts[0].position = model * verts[0].position;
    verts[1].position = model * verts[1].position;
    verts[2].position = model * verts[2].position;
    verts[3].position = model * verts[3].position;
    verts[4].position = model * verts[4].position;
    verts[5].position = model * verts[5].position;

    //Vertex buffer is flagged for dynamic use
    vertexBuffer = BufferModule::createVertexBuffer(D3D11_USAGE_DYNAMIC, D3D11_CPU_ACCESS_WRITE, sizeof(SpriteVertex) * MAX_VERTEX_COUNT_FOR_BUFFER);

    //The vertex buffer is mapped when adding a sprite to the buffer
    //vertexBufferMapType could be D3D11_MAP_WRITE_NO_OVERWRITE or D3D11_MAP_WRITE_DISCARD depending on the data already in the vertex buffer
    D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(vertexBufferMapType);
    memcpy(((SpriteVertex*)resource.pData) + vertexCountInBuffer, verts, BYTES_PER_SPRITE);
    vertexBuffer->unmap();

    //The constant buffer used for the MVP matrix is updated once per draw call
    D3D11_MAPPED_SUBRESOURCE resource = mvpConstBuffer->map(D3D11_MAP_WRITE_DISCARD);
    memcpy(resource.pData, projectionMatrix.getData(), sizeof(Matrix4));
    mvpConstBuffer->unmap();

    Vertex / Pixel Shader:

    cbuffer mvpBuffer : register(b0)
    {
        matrix mvp;
    }

    struct VertexInput
    {
        float4 position : POSITION;
        float2 texCoords : TEXCOORD0;
        float4 color : COLOR;
    };

    struct PixelInput
    {
        float4 position : SV_POSITION;
        float2 texCoords : TEXCOORD0;
        float4 color : COLOR;
    };

    PixelInput VSMain(VertexInput input)
    {
        input.position.w = 1.0f;
        PixelInput output;
        output.position = mul(mvp, input.position);
        output.texCoords = input.texCoords;
        output.color = input.color;
        return output;
    }

    Texture2D shaderTexture;
    SamplerState samplerType;

    float4 PSMain(PixelInput input) : SV_TARGET
    {
        float4 textureColor = shaderTexture.Sample(samplerType, input.texCoords);
        return textureColor;
    }

    If any more info is needed, feel free to ask. I would really like to know how I can improve this, assuming I'm not hardware limited.
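    For context on where per-sprite cost usually hides in a design like the one above: the dynamic vertex buffer is mapped and unmapped once per sprite, so 10K sprites means 10K Map/Unmap pairs every frame before the draw is even issued. A common reorganization is to map once, write all sprites, unmap once, and then draw. A hedged sketch reusing the post's own names (visibleSprites and writeSpriteVertices are hypothetical):

        // Hypothetical batched fill: one Map/Unmap per frame instead of one per sprite.
        D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(D3D11_MAP_WRITE_DISCARD);
        SpriteVertex* dst = (SpriteVertex*)resource.pData;
        for (Sprite* sprite : visibleSprites)        // assumed container of sprites to draw
        {
            writeSpriteVertices(sprite, dst);        // assumed helper: writes 6 pretransformed vertices
            dst += 6;
        }
        vertexBuffer->unmap();
        // ...followed by a single draw call covering all sprites.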
  5. Hi everybody! I am currently trying to write my own GPU raytracer using DirectX 11 and a compute shader. Here is what I've tried so far: RayTracer.hlsl RayTracingHeader.hlsli But the result is not what I expected. For example, the sphere is located at (0,0,10) with radius 1, but this is the result when CamPos is 4.5, which I think is wrong. Also, for some reason, when I rotate the camera, the sphere expands. Could anyone give me some advice, please?
  6. I am currently working on the first iteration of my sprite renderer and I'm trying to draw 2 sprites. They both use the same texture and are placed into the same buffer, but unfortunately only the second sprite is shown on the screen. I assume I messed something up when I place them into the buffer and that I am overwriting the data of the first sprite. So how should I be mapping my buffer with an offset?

    /* Code that sets up the sprite vertices and etc */
    D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(vertexBufferMapType);
    memcpy(resource.pData, verts, sizeof(SpriteVertex) * VERTEX_PER_QUAD);
    vertexBuffer->unmap();
    vertexCount += VERTEX_PER_QUAD;

    I feel like I should be doing something like:

    /* Code that sets up the sprite vertices and etc */
    D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(vertexBufferMapType);
    //Place the sprite vertex data into pData using the current vertex count as the offset.
    //The expression resource.pData[vertexCount] is syntactically wrong though :( Not sure how it should look since pData is a void pointer.
    memcpy(resource.pData[vertexCount], verts, sizeof(SpriteVertex) * VERTEX_PER_QUAD);
    vertexBuffer->unmap();
    vertexCount += VERTEX_PER_QUAD;

    Also, speaking of offsets, can someone give an example of when the pOffsets param for the IASetVertexBuffers call would not be 0?
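    For what it's worth, the usual way to express that offset is to cast pData to the vertex type before indexing. A minimal sketch along the lines of the post's own wrapper (it assumes the buffer was created large enough and that the second sprite is mapped with D3D11_MAP_WRITE_NO_OVERWRITE so the first sprite's data isn't discarded):

        D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(vertexBufferMapType);
        // Cast the void* to the vertex type, then advance by the vertices already written.
        SpriteVertex* dst = (SpriteVertex*)resource.pData + vertexCount;
        memcpy(dst, verts, sizeof(SpriteVertex) * VERTEX_PER_QUAD);
        vertexBuffer->unmap();
        vertexCount += VERTEX_PER_QUAD;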
  7. While considering how to optimize my DirectX 11 graphics engine, I noticed that it maps and unmaps (locks and unlocks) the D3D11_MAPPED_SUBRESOURCE many times to write to different constant buffers. Some shaders have 10 or more constant buffers, for camera position, light direction, clip plane, texture translation, fog info, and many other things that need to be passed from the CPU to the GPU. I was wondering if all the mapping and unmapping might be the reason why my engine is running horribly slowly, and is there any way around this? What is the correct way to do it? (Refer to the LightShaderClass::SetShaderParameters() function, line 401 onward, to see all the mapping/unmapping.) https://github.com/mister51213/DirectX11Engine/blob/WaterShader/DirectX11Engine/LightShaderClass.cpp I feel like I might be doing something obviously wrong and wasteful that could be fixed with a simple reorganization, but I don't know enough about DX11 to know how. Any tips would be much appreciated, thanks.
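    One commonly suggested reorganization - offered here only as a hedged sketch, not a change verified against the linked repository - is to group the values that change at the same frequency into a single constant buffer, so each draw (or frame) needs just one Map/Unmap:

        // Hypothetical consolidated per-frame constants; the fields mirror the kinds
        // of values mentioned in the post (camera, light, fog, clip plane, ...).
        struct PerFrameConstants
        {
            DirectX::XMFLOAT4X4 viewProjection;
            DirectX::XMFLOAT4   cameraPosition;
            DirectX::XMFLOAT4   lightDirection;
            DirectX::XMFLOAT4   clipPlane;
            DirectX::XMFLOAT4   fogParams;
        };

        // One Map/Unmap per frame instead of one per value:
        D3D11_MAPPED_SUBRESOURCE mapped;
        context->Map(perFrameBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
        memcpy(mapped.pData, &frameConstants, sizeof(PerFrameConstants));
        context->Unmap(perFrameBuffer, 0);
        context->VSSetConstantBuffers(0, 1, &perFrameBuffer);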
  8. Hello! I can see that when there's a write to UAVs in a CS or PS and I bind a null ID3D11UnorderedAccessView to a used UAV slot, the GPU won't hang and the writes are silently dropped. I hope I'm not dreaming. With DX12, I can't seem to emulate this; I reckon it's impossible. The shader just reads the descriptor of the UAV (from a register/offset based on the root signature layout) and does an "image_store" at some offset from the base address. If it's unmapped - bang, we're dead. I tried zeroing out that GPU-visible UAV's range in the descriptor table, same result. Such an all-zero UAV descriptor doesn't seem very legit, so that's expected. Am I right? How does DX11 do it so that it survives this? Does it silently patch the shader, or what? Thanks, .P
  9. I'm reviewing a tutorial on using textures, and I see that the vertex shader has this input declaration where the position is a float4:

    struct VertexInputType
    {
        float4 position : POSITION;
        float2 tex : TEXCOORD0;
    };

    But when they go over uploading the data to the vertex buffer, they only use a float3 (Vector3) value for the position:

    // Load the vertex array with data.
    vertices[0].position = D3DXVECTOR3(-1.0f, -1.0f, 0.0f); // Bottom left.
    vertices[0].texture = D3DXVECTOR2(0.0f, 1.0f);
    vertices[1].position = D3DXVECTOR3(0.0f, 1.0f, 0.0f); // Top middle.
    vertices[1].texture = D3DXVECTOR2(0.5f, 0.0f);
    vertices[2].position = D3DXVECTOR3(1.0f, -1.0f, 0.0f); // Bottom right.
    vertices[2].texture = D3DXVECTOR2(1.0f, 1.0f);

    The input layout description also seems to match, using a float3 format:

    polygonLayout[0].SemanticName = "POSITION";
    polygonLayout[0].SemanticIndex = 0;
    polygonLayout[0].Format = DXGI_FORMAT_R32G32B32_FLOAT;
    polygonLayout[0].InputSlot = 0;
    polygonLayout[0].AlignedByteOffset = 0;
    polygonLayout[0].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    polygonLayout[0].InstanceDataStepRate = 0;

    So does this mean that shaders will automatically default "missing" values to 0 or something of the sort? If so, is this frowned upon?
  10. Hi guys, I'm following the Rastertek tutorial and making a reflection effect using render targets in DirectX 11. http://www.rastertek.com/dx11tut27.html I got it working, but as you can see, the reflected image seems to be drawing to a really small "texture" - it doesn't correspond to the size of the actual blue floor texture it is being drawn onto. If you go really close to it and look straight down, you can also see many copies of the reflected texture inside the blue floor texture, so it seems it's being tiled somehow. But I have no idea where to change this setting in the code, or what the cause of the sizing problem is. At line 108 of Graphics.cpp, _RenderTexture->Initialize(_D3D->GetDevice(), screenWidth, screenHeight); we give the screen width and height to the render texture that is later used as the target for the reflected image. However, I tried passing in different values for the width and height of this texture, and it will stretch or shrink the reflected image, but then its position on the reflected surface is all messed up. I need the size of that reflection target to be the same size as the actual blue floor texture, but the images to be scaled properly when they get reflected onto it. It doesn't seem to be doing the scaling right at the moment. If you could take a look, any help would be much appreciated. Thanks so much. https://github.com/mister51213/DirectX11Engine/blob/master/DirectX11Engine/Graphics.cpp
  11. A player of my game contacted me because the game crashes during start-up. After taking a look at the log file he sent me, I found that calling CreateSwapChain results in the exception shown below.

    HRESULT: [0x887A0001], Module: [SharpDX.DXGI], ApiCode: [DXGI_ERROR_INVALID_CALL/InvalidCall], Message: Unknown
    at SharpDX.Result.CheckError()
    at SharpDX.DXGI.Factory.CreateSwapChain(ComObject deviceRef, SwapChainDescription& descRef, SwapChain swapChainOut)
    at SharpDX.Direct3D11.Device.CreateWithSwapChain(Adapter adapter, DriverType driverType, DeviceCreationFlags flags, FeatureLevel[] featureLevels, SwapChainDescription swapChainDescription, Device& device, SwapChain& swapChain)
    at SharpDX.Direct3D11.Device.CreateWithSwapChain(DriverType driverType, DeviceCreationFlags flags, SwapChainDescription swapChainDescription, Device& device, SwapChain& swapChain)

    In order to investigate this player's problem, I created a test application that looks like this:

    class Program
    {
        static void Main(string[] args)
        {
            Helper.UserInfo userInfo = new Helper.UserInfo(true);
            Console.WriteLine("Checking adapters.");
            using (var factory = new SharpDX.DXGI.Factory1())
            {
                for (int i = 0; i < factory.GetAdapterCount(); i++)
                {
                    SharpDX.DXGI.Adapter adapter = factory.GetAdapter(i);
                    Console.WriteLine("\tAdapter {0}: {1}", i, adapter.Description.Description);
                    bool supportsLevel10_1 = SharpDX.Direct3D11.Device.IsSupportedFeatureLevel(adapter, SharpDX.Direct3D.FeatureLevel.Level_10_1);
                    Console.WriteLine("\t\tSupport for Level_10_1? {0}!", supportsLevel10_1);
                    Console.WriteLine("\t\tCreate refresh rate (60).");
                    var refreshRate = new SharpDX.DXGI.Rational(60, 1);
                    Console.WriteLine("\t\tCreate mode description.");
                    var modeDescription = new SharpDX.DXGI.ModeDescription(0, 0, refreshRate, SharpDX.DXGI.Format.R8G8B8A8_UNorm);
                    Console.WriteLine("\t\tCreate sample description.");
                    var sampleDescription = new SharpDX.DXGI.SampleDescription(1, 0);
                    Console.WriteLine("\t\tCreate swap chain description.");
                    var desc = new SharpDX.DXGI.SwapChainDescription()
                    {
                        // Number of back buffers to use on the SwapChain
                        BufferCount = 1,
                        ModeDescription = modeDescription,
                        // Do we want to use a windowed mode?
                        IsWindowed = true,
                        Flags = SharpDX.DXGI.SwapChainFlags.None,
                        OutputHandle = Process.GetCurrentProcess().MainWindowHandle,
                        // Count in 'SampleDescription' means the level of anti-aliasing (from 1 to usually 4)
                        SampleDescription = sampleDescription,
                        SwapEffect = SharpDX.DXGI.SwapEffect.Discard,
                        // DXGI_USAGE_RENDER_TARGET_OUTPUT: This value is used when you wish to draw graphics into the back buffer.
                        Usage = SharpDX.DXGI.Usage.RenderTargetOutput
                    };

                    try
                    {
                        Console.WriteLine("\t\tCreate device (Run 1).");
                        SharpDX.Direct3D11.Device device = new SharpDX.Direct3D11.Device(adapter, SharpDX.Direct3D11.DeviceCreationFlags.None, new SharpDX.Direct3D.FeatureLevel[] { SharpDX.Direct3D.FeatureLevel.Level_10_1 });
                        Console.WriteLine("\t\tCreate swap chain (Run 1).");
                        SharpDX.DXGI.SwapChain swapChain = new SharpDX.DXGI.SwapChain(factory, device, desc);
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine("EXCEPTION: {0}", e.Message);
                    }

                    try
                    {
                        Console.WriteLine("\t\tCreate device (Run 2).");
                        SharpDX.Direct3D11.Device device = new SharpDX.Direct3D11.Device(adapter, SharpDX.Direct3D11.DeviceCreationFlags.BgraSupport, new SharpDX.Direct3D.FeatureLevel[] { SharpDX.Direct3D.FeatureLevel.Level_10_1 });
                        Console.WriteLine("\t\tCreate swap chain (Run 2).");
                        SharpDX.DXGI.SwapChain swapChain = new SharpDX.DXGI.SwapChain(factory, device, desc);
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine("EXCEPTION: {0}", e.Message);
                    }

                    try
                    {
                        Console.WriteLine("\t\tCreate device (Run 3).");
                        SharpDX.Direct3D11.Device device = new SharpDX.Direct3D11.Device(adapter);
                        Console.WriteLine("\t\tCreate swap chain (Run 3).");
                        SharpDX.DXGI.SwapChain swapChain = new SharpDX.DXGI.SwapChain(factory, device, desc);
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine("EXCEPTION: {0}", e.Message);
                    }
                }
            }
            Console.WriteLine("FIN.");
            Console.ReadLine();
        }
    }

    In the beginning, I am collecting information about the computer (processor, GPU, .NET Framework version, etc.). The rest should explain itself. I sent him the application, and in all three cases creating the swap chain fails with the same exception. In this test program, I included all the solutions that worked for other users. For example, AlexandreMutel said in this forum thread that the device and swap chain need to share the same factory. I did that in my program, so using different factories is not a problem in my case. Laurent Couvidou said here:
    The player has Windows 7 with .NET Framework 4.6.1, which is good enough to run my test application or game, which use .NET Framework 4.5.2. The graphics card (Radeon HD 6700 Series) is also good enough to run the application. In my test application, I also checked whether Feature Level 10_1 is supported, which is the minimum requirement for my game. A refresh rate of 60 Hz should also be no problem. Therefore, I think the parameters are fine. The remaining calls are just three different ways to create a device and a swap chain for the adapter. All of them throw an exception when creating the swap chain. There are also Battlefield 3 players who had problems running BF3; as it turned out, BF3 had some issue with the Windows region settings. But I believe that's not a problem in my case. There were also compatibility issues in Fallout 4, but my game runs on many other Windows 7 PCs without any problems. Do you have any idea what's wrong? I already have a lot of players who play my game without any issues, so it's not a general problem. I really want to help this player, but I currently can't find a solution.
  12. Hey guys, I'm trying to work on adding transparent objects to my deferred-rendered scene. The only issue is the z-buffer. As far as I know, the standard way to handle this is copying the buffer. In OpenGL, I can just blit it. What's the alternative for DirectX? And are there any alternatives to copying the buffer? Thanks in advance!
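    If it helps, in D3D11 the depth buffer can be duplicated with a plain resource copy, which serves roughly the same purpose as an OpenGL blit here. A hedged sketch (it assumes a second depth texture created with the exact same D3D11_TEXTURE2D_DESC as the original, using a typeless format such as DXGI_FORMAT_R24G8_TYPELESS so it can also be read through an SRV):

        // Copy the opaque pass's depth into a second texture so it can be sampled
        // while the original stays bound for depth testing the transparent geometry.
        context->CopyResource(depthCopyTexture, depthTexture);
        // An SRV over depthCopyTexture (e.g. DXGI_FORMAT_R24_UNORM_X8_TYPELESS) can then
        // be bound to whichever shaders need the scene depth.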
  13. Hi, I'm reading about specular aliasing caused by mip maps. As far as I understand it, you need to compute the fetched normal's length and detect how much it has changed from unit length. I'm currently using BC5 normal maps, so I reconstruct z in the shader and therefore my normals are always normalized. Can I still somehow use that antialiasing, or is it not needed? Thanks.
  14. I want to change the sampling behaviour to SampleLevel(coord, ddx(coord.y).xx, ddy(coord.y).xx). I was just wondering if that's possible without explicit shader code, e.g. via some flags or similar?
  15. Hello, I want to improve the performance of my game (engine), and some of you helped me to make a GPU profiler. After creating the GPU profiler, I started to measure the time my GPU needs per frame, and I refined my GPU time measurements to find my bottleneck.

    Searching for the bottleneck
    Rendering a small scene in an idle state takes around 15.38 ms per frame. 13.54 ms (88.04%) are spent while rendering the scene, 1.57 ms (10.22%) are spent during the SwapChain.Present call (no VSync!), and the rest is spent on other tasks like rendering the UI. I further investigated the scene rendering, since it takes over 88% of my GPU frame time. When rendering my scene, most of the time (80.97%) is spent rendering my models. The rest is spent rendering the background/skybox, updating animation data, updating the pixel shader constant buffer, etc. It wasn't really surprising that most of the time is spent on my models, so I further refined my measurements to find the actual bottleneck. In my example scene, I have five animated NPCs. When rendering these NPCs, most actions are almost free: setting the proper shaders and input layout (0.11%), updating vertex shader constant buffers (0.32%), setting textures (0.24%) and setting vertex and index buffers (0.28%). However, the rest of the GPU time (99.05% !!) is spent in two function calls: DrawIndexed and DrawIndexedInstanced. I searched this forum and the web for other articles and threads about these functions, but I haven't found a lot of useful information. I use SharpDX and .NET Framework 4.5 to develop my game (engine). The developer of SharpDX said that "The method DrawIndexed in SharpDX is a direct call to DirectX" (Source). DirectX 11 is widely used and SharpDX is "only" a wrapper for DirectX functions, so I assume the problem is in my code.

    How I render my scene
    When rendering my scene, I render one model after another. Each model has one or more parts and one or more positions. For example, a human model has parts like head, hands, legs, torso, etc. and may be placed in different locations (on the couch, on a street, ...). For static elements like furniture, houses, etc. I use instancing, because the positions never change at run-time. Dynamic models like humans and monsters don't use instancing, because their positions change over time. When rendering a model, I use this work-flow (a profiling sketch follows below the post):
    - Set vertex and pixel shaders, if they need to be updated (e.g. PBR shaders, simple shader, depth info shaders, ...)
    - Set animation data as a constant buffer in the vertex shader, if the model is animated
    - Set the generic vertex shader constant buffer (world matrix, etc.)
    - Render all parts of the model. For each part:
      - Set diffuse, normal, specular and emissive texture shader views
      - Set the vertex buffer
      - Set the index buffer
      - Call DrawIndexedInstanced for instanced models and DrawIndexed for non-instanced models

    What's the problem
    After my GPU profiling, I know that over 99% of the rendering time for a single model is spent in the DrawIndexedInstanced and DrawIndexed function calls. But why do they take so long? Do I have to try to optimize my vertex or pixel shaders? I do not use other types of shaders at the moment. "Le Comte du Merde-fou" suggested in this post to merge regions of vertices into larger vertex buffers to reduce the number of Draw calls. While this makes sense to me, it does not explain why rendering my five (!) animated models takes that much GPU time.
    To make sure I wasn't analysing something incorrectly, I made sure not to use the D3D11_CREATE_DEVICE_DEBUG flag and to run as a Release build in Visual Studio, as suggested by Hodgman in this forum thread. My engine does its job: multi-texturing, animation, soft shadowing, instancing, etc. are all implemented, but I need to reduce the GPU load for performance reasons. Each frame takes less than 3ms of CPU time, by the way. So the problem is on the GPU side, I believe.
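    As background for interpreting numbers like those above: a D3D11 GPU profiler typically brackets a span of API calls between timestamp queries, so cheap state-setting calls measure as nearly free while the Draw* calls absorb the time the GPU actually spends executing vertex and pixel shading for the geometry they submit. A hedged sketch of that pattern (the query variables are hypothetical ID3D11Query pointers, not the poster's profiler):

        // Created once: one disjoint query plus two timestamp queries.
        D3D11_QUERY_DESC qd = {};
        qd.Query = D3D11_QUERY_TIMESTAMP_DISJOINT; device->CreateQuery(&qd, &disjointQuery);
        qd.Query = D3D11_QUERY_TIMESTAMP;          device->CreateQuery(&qd, &startQuery);
        device->CreateQuery(&qd, &endQuery);

        // Each frame: bracket the block being measured.
        context->Begin(disjointQuery);
        context->End(startQuery);                  // timestamp before the draws
        // ... DrawIndexed / DrawIndexedInstanced calls ...
        context->End(endQuery);                    // timestamp after the draws
        context->End(disjointQuery);

        // A few frames later: read the results back without stalling.
        UINT64 t0 = 0, t1 = 0;
        D3D11_QUERY_DATA_TIMESTAMP_DISJOINT dj = {};
        if (context->GetData(disjointQuery, &dj, sizeof(dj), 0) == S_OK && !dj.Disjoint &&
            context->GetData(startQuery, &t0, sizeof(t0), 0) == S_OK &&
            context->GetData(endQuery, &t1, sizeof(t1), 0) == S_OK)
        {
            double elapsedMs = double(t1 - t0) / double(dj.Frequency) * 1000.0;  // GPU time for the block
        }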
  16. I was wondering if someone could explain this to me. I'm working on using the Windows WIC APIs to load textures for DirectX 11. I see that sometimes the WIC pixel formats do not directly match a DXGI format used by DirectX, and that in cases like this the original WIC pixel format is converted into a WIC pixel format that does directly match a DXGI format. Doing this conversion is easy, but I do not understand the reasoning behind two of the conversions in Microsoft's guide on this topic: why should GUID_WICPixelFormat40bppCMYKAlpha be converted into GUID_WICPixelFormat64bppRGBA, and why should GUID_WICPixelFormat80bppCMYKAlpha also be converted into GUID_WICPixelFormat64bppRGBA? In one case I would think that GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat32bppRGBA and GUID_WICPixelFormat80bppCMYKAlpha would convert to GUID_WICPixelFormat64bppRGBA, because the black channel (K) values would get re-added / "swallowed" into the CMY channels. In the second case I would think that GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat64bppRGBA and GUID_WICPixelFormat80bppCMYKAlpha would convert to GUID_WICPixelFormat128bppRGBA, because the black channel (K) bits would get redistributed amongst the remaining four channels (CMYA), and those "new bits" added to those channels would fit in the GUID_WICPixelFormat64bppRGBA and GUID_WICPixelFormat128bppRGBA formats. But seeing as there is no GUID_WICPixelFormat128bppRGBA format, this second case is kind of null and void. I basically do not understand why Microsoft says GUID_WICPixelFormat40bppCMYKAlpha and GUID_WICPixelFormat80bppCMYKAlpha should both convert to GUID_WICPixelFormat64bppRGBA in the end.
  17. Hi, new here. I need some help. My fiancée and I like to play a mobile game online that runs in real time. She and I are always working, but when we have free time we like to play this game. We don't always have time throughout the day to queue buildings, troops, upgrades, etc. I was told to look into DLL injection and OpenGL/DirectX hooking. Is this true? Is this what I need to learn? How do I read the Android files, or modify the files, or get the in-game tags/variables for the game I want? Any assistance on this would be most appreciated. I've been everywhere, and it seems no one knows or is too lazy to help me out. It would be nice to have assistance for once. I don't know what I need to learn, so links to topics I should study in the comment section would be SO helpful. Anything to just get me started. Thanks, Dejay Hextrix
  18. In some situations, my game starts to "lag" on older computers. I wanted to search for bottlenecks and optimize my game by looking for flaws in the shaders and in the layer between the CPU and GPU. My first step was to measure the time my render function needs to do its tasks. Every second I wrote the accumulated times of each task into my console window. Each second it takes around:
    - 170ms to call the render functions for all models (including setting shader resources, updating constant buffers, drawing all indexed and non-indexed vertices, etc.)
    - 40ms to render the UI
    - 790ms to call SwapChain.Present
    - <1ms to do the rest (updating structures, etc.)
    In my swap chain description I set a frame rate of 60 Hz, if it's supported by the computer. It made sense to me that the Present function waits some time until it starts the next frame; however, I wanted to check whether this might be a problem for me. After a web search I found articles like this one. My drivers are up-to-date, so that's no issue. I installed Microsoft's PIX, but I was unable to use it: I could configure my game for x64, but PIX was not able to process DirectX 11. After getting only error messages, I installed NVIDIA's Nsight. After adjusting my game and installing all components, I couldn't get a proper result, because my game freezes after a few frames, and I haven't figured out why. There is no exception or error message, and other debug mechanisms like log messages and breakpoints tell me the game freezes at the end of the render function after a few frames. So I looked for another profiling tool and found Jeremy's GPUProfiler. However, the information returned by this tool is too basic to get in-depth knowledge about my performance issues. Can anyone recommend a GPU profiler or any other tool that might help me find bottlenecks in my game and/or that is able to indicate performance problems in my shaders? My custom graphics engine can handle subjects like multi-texturing, instancing, soft shadowing, animation, etc. However, I am pretty sure there are things I can optimize! I am using SharpDX to develop a game (engine) based on DirectX 11 with .NET Framework 4.5. My graphics card is from NVIDIA and my processor is made by Intel.
  19. Does the buffer number matter in ID3D11DeviceContext::PSSetConstantBuffers()? I added five or six constant buffers to my framework, and later realized I had set the buffer number parameter to either 0 or 1 in all of them - but they still all worked! I'm curious why that is, and should they be set up to correspond to the number of constant buffers? Similarly, inside the buffer structs used to pass info into the HLSL shader, I added padding inside the C++ struct to make a struct containing a float3 be 16 bytes, but in the declaration of the same struct inside the HLSL shader file, the padding value was missing - and it still worked! Do they need to be consistent or not? Thanks.

    struct CameraBufferType
    {
        XMFLOAT3 cameraPosition;
        float padding;
    };
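    For reference on what that first parameter does: it is the start slot, which maps to the register(b#) annotation in HLSL, not a count of buffers. A hedged illustration with hypothetical buffer names:

        // StartSlot 0 feeds whatever the shader declares as register(b0),
        // StartSlot 1 feeds register(b1), and so on.
        context->PSSetConstantBuffers(0, 1, &cameraBuffer);  // HLSL: cbuffer CameraBuffer : register(b0) { ... }
        context->PSSetConstantBuffers(1, 1, &lightBuffer);   // HLSL: cbuffer LightBuffer  : register(b1) { ... }
        // Binding two different buffers to the same slot just makes the later bind
        // overwrite the earlier one, which can easily go unnoticed if the shader
        // still happens to read plausible data.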
  20. SOLVED: I had written Dispatch(32, 24, 0) instead of Dispatch(32, 24, 1).
    I'm attempting to implement some basic post-processing in my "engine". I think I've understood the HLSL part of the compute shader and such, but I'm at a loss as to how to actually get/use its output for rendering to the screen. Assume I'm doing something to a UAV in my CS:

    RWTexture2D<float4> InputOutputMap : register(u0);

    I want that texture to essentially "be" the backbuffer. I'm pretty certain I'm doing something wrong when I create the views (what I think I'm doing is having the backbuffer be bound as a render target as well as a UAV and then using it in my CS):

    DXGI_SWAP_CHAIN_DESC scd;
    ZeroMemory(&scd, sizeof(DXGI_SWAP_CHAIN_DESC));
    scd.BufferCount = 1;
    scd.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    scd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT | DXGI_USAGE_SHADER_INPUT | DXGI_USAGE_UNORDERED_ACCESS;
    scd.OutputWindow = wndHandle;
    scd.SampleDesc.Count = 1;
    scd.Windowed = TRUE;

    HRESULT hr = D3D11CreateDeviceAndSwapChain(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, NULL, NULL, NULL, D3D11_SDK_VERSION, &scd, &gSwapChain, &gDevice, NULL, &gDeviceContext);

    // get the address of the back buffer
    ID3D11Texture2D* pBackBuffer = nullptr;
    gSwapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), (LPVOID*)&pBackBuffer);

    // use the back buffer address to create the render target
    gDevice->CreateRenderTargetView(pBackBuffer, NULL, &gBackbufferRTV);

    // set the render target as the back buffer
    CreateDepthStencilBuffer();
    gDeviceContext->OMSetRenderTargets(1, &gBackbufferRTV, depthStencilView);

    //UAV for compute shader
    D3D11_UNORDERED_ACCESS_VIEW_DESC uavd;
    ZeroMemory(&uavd, sizeof(uavd));
    uavd.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    uavd.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
    uavd.Texture2D.MipSlice = 1;
    gDevice->CreateUnorderedAccessView(pBackBuffer, &uavd, &gUAV);
    pBackBuffer->Release();

    After I render the scene, I dispatch like this:

    gDeviceContext->OMSetRenderTargets(0, NULL, NULL);
    m_vShaders["cs1"]->Bind();
    gDeviceContext->CSSetUnorderedAccessViews(0, 1, &gUAV, 0);
    gDeviceContext->Dispatch(32, 24, 0); //hard coded
    ID3D11UnorderedAccessView* nullview = { nullptr };
    gDeviceContext->CSSetUnorderedAccessViews(0, 1, &nullview, 0);
    gDeviceContext->OMSetRenderTargets(1, &gBackbufferRTV, depthStencilView);
    gSwapChain->Present(0, 0);

    Worth noting is that the scene is rendered as usual, but I don't get any results from the CS (a simple Gaussian blur). I'm sure it's something fairly basic I'm doing wrong; perhaps my understanding of render targets / views / what have you is just completely wrong and my approach makes no sense. If someone with more experience could point me in the right direction, I would really appreciate it! On a side note, I'd really like to learn more about this kind of stuff. I can really see the potential of the CS as well as rendering to textures and using them for whatever in the engine, so I would love it if you know some good resources I can read about this! Thank you <3
    P.S. I excluded the .hlsl since I can't imagine that being the issue, but if you think you need it to help me, just ask.
    P.P.S. As you can see, this is my first post; I do have another account, but I can't log in with it because gamedev.net just keeps asking me to accept terms and then logs me out when I do, over and over.
  21. I was wondering if anyone could explain the depth buffer and the depth stencil state comparison function to me, as I'm a little confused. So I have set up a depth stencil state where the DepthFunc is set to D3D11_COMPARISON_LESS, but what am I actually comparing here? What is actually written to the buffer - the pixel that should show up in front? I have these two quad faces, a Red Face and a Blue Face. The Blue Face is further away from the viewer, with a Z value of -100.0f, while the Red Face is close to the viewer, with a Z value of 0.0f. When DepthFunc is set to D3D11_COMPARISON_LESS, the Red Face shows up in front of the Blue Face, as it should based on the Z values. BUT if I change the DepthFunc to D3D11_COMPARISON_LESS_EQUAL, the Blue Face shows in front of the Red Face. That does not make sense to me; I would think that when the function is set to D3D11_COMPARISON_LESS_EQUAL, the Red Face would still show up in front of the Blue Face, as the Z value of the Red Face is still closer to the viewer. Am I thinking of this comparison function all wrong? Vertex data just in case:

    //Vertex data that makes up the 2 faces
    Vertex verts[] =
    {
        //Red face
        Vertex(Vector4(0.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(0.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(0.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),

        //Blue face
        Vertex(Vector4(0.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(0.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(0.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
    };
  22. Hi all, first time poster here, although I've been reading posts here for quite a while. This place has been invaluable for learning graphics programming - thanks for a great resource! Right now, I'm working on a graphics abstraction layer for .NET which supports D3D11, Vulkan, and OpenGL at the moment. I have implemented most of my planned features already, and things are working well. Some remaining features that I am planning are compute shaders and some flavor of read-write shader resources. At the moment, my shaders can just get simple read-only access to a uniform (or constant) buffer, a texture, or a sampler. Unfortunately, I'm having a tough time grasping the distinctions between all of the different kinds of read-write resources that are available. In D3D alone, there seem to be 5 or 6 different kinds of resources with similar but different characteristics. On top of that, I get the impression that some of them are more or less "obsoleted" by the newer kinds, and don't have much of a place in modern code. There seem to be a few pivots:
    - The data source/destination (buffer or texture)
    - Read-write or read-only
    - Structured or unstructured (?)
    - Ordered vs unordered (?)
    These are just my observations based on a lot of MSDN and OpenGL doc reading. For my library, I'm not interested in exposing every possibility to the user - just trying to find a good "middle ground" that can be represented cleanly across APIs and is good enough for common scenarios. Can anyone give a sort of "overview" of the different options, and perhaps compare/contrast the concepts between Direct3D, OpenGL, and Vulkan? I'd also be very interested in hearing how other folks have abstracted these concepts in their libraries.
  23. If I do a buffer update with MAP_NO_OVERWRITE or MAP_DISCARD, can I just write to the buffer after I have called Unmap() on it? It seems to work fine for me (Nvidia driver), but is it actually legal to do so? I have a graphics device wrapper and I don't want to expose Map/Unmap, but instead just have a function like

    void* AllocateFromRingBuffer(GPUBuffer* buffer, uint size, uint& offset);

    This function would just call Map on the buffer, then Unmap immediately, and then return the address of the buffer. It usually does a MAP_NO_OVERWRITE, but sometimes it is a WRITE_DISCARD (when the buffer wraps around). Previously I had been using it so that the function expected the data up front and would copy it to the buffer between Map/Unmap, but now I want to extend its functionality so that it just returns an address to write to.
  24. I'm trying to write a multitexturing shader in DirectX 11 - 3 textures work fine, but adding a 4th gets sampled as black! Could you please look at textureClass.cpp, line 79? I'm guessing its D3D11_TEXTURE2D_DESC settings are wrong, but I have no idea how to set them up right. I tried changing ArraySize from 1 to 4, but that does nothing. If that's not the issue, please look at LightShader_ps - maybe I'm doing something wrong there? Otherwise, no idea.

    // Setup the description of the texture.
    textureDesc.Height = height;
    textureDesc.Width = width;
    textureDesc.MipLevels = 0;
    textureDesc.ArraySize = 1;
    textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    textureDesc.SampleDesc.Count = 1;
    textureDesc.SampleDesc.Quality = 0;
    textureDesc.Usage = D3D11_USAGE_DEFAULT;
    textureDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
    textureDesc.CPUAccessFlags = 0;
    textureDesc.MiscFlags = D3D11_RESOURCE_MISC_GENERATE_MIPS;

    Please help, thanks. https://github.com/mister51213/DirectX11Engine/blob/master/DirectX11Engine/Texture.cpp
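    One common cause of a fourth texture sampling as black that's worth ruling out before changing the texture description is that the fourth shader resource view simply isn't bound to the slot the pixel shader expects. A hedged native-D3D11 sketch of that check (srv0..srv3 are hypothetical names for the four texture views):

        // Bind all four SRVs to consecutive slots t0..t3 in one call, matching the
        // Texture2D declarations (register(t0)..register(t3)) in the pixel shader.
        ID3D11ShaderResourceView* views[4] = { srv0, srv1, srv2, srv3 };
        deviceContext->PSSetShaderResources(0, 4, views);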
  25. Hello! I would like to introduce Diligent Engine, a project that I've recently been working on. Diligent Engine is a light-weight cross-platform abstraction layer between the application and the platform-specific graphics API. Its main goal is to take advantage of next-generation APIs such as Direct3D12 and Vulkan, but at the same time provide support for older platforms via Direct3D11, OpenGL and OpenGLES. Diligent Engine exposes a common front-end for all supported platforms and provides interoperability with the underlying native API. It also supports integration with Unity and is designed to be used as a graphics subsystem in a standalone game engine, a Unity native plugin, or any other 3D application. It is distributed under the Apache 2.0 license and is free to use. Full source code is available for download on GitHub. The engine contains a shader source code converter that allows shaders authored in HLSL to be translated to GLSL. The engine currently supports Direct3D11, Direct3D12, and OpenGL/GLES on Win32, Universal Windows and Android platforms.

    API Basics

    Initialization
    The engine can perform initialization of the API or attach to an already existing D3D11/D3D12 device or OpenGL/GLES context. For instance, the following code shows how the engine can be initialized in D3D12 mode:

    #include "RenderDeviceFactoryD3D12.h"
    using namespace Diligent;

    // ...
    GetEngineFactoryD3D12Type GetEngineFactoryD3D12 = nullptr;
    // Load the dll and import the GetEngineFactoryD3D12() function
    LoadGraphicsEngineD3D12(GetEngineFactoryD3D12);
    auto *pFactoryD3D12 = GetEngineFactoryD3D12();

    EngineD3D12Attribs EngD3D12Attribs;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[0] = 1024;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[1] = 32;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[2] = 16;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[3] = 16;
    EngD3D12Attribs.NumCommandsToFlushCmdList = 64;

    RefCntAutoPtr<IRenderDevice> pRenderDevice;
    RefCntAutoPtr<IDeviceContext> pImmediateContext;
    SwapChainDesc SwapChainDesc;
    RefCntAutoPtr<ISwapChain> pSwapChain;
    pFactoryD3D12->CreateDeviceAndContextsD3D12( EngD3D12Attribs, &pRenderDevice, &pImmediateContext, 0 );
    pFactoryD3D12->CreateSwapChainD3D12( pRenderDevice, pImmediateContext, SwapChainDesc, hWnd, &pSwapChain );

    Creating Resources
    Device resources are created by the render device. The two main resource types are buffers, which represent linear memory, and textures, which use memory layouts optimized for fast filtering. To create a buffer, you need to populate the BufferDesc structure and call IRenderDevice::CreateBuffer().
    The following code creates a uniform (constant) buffer:

    BufferDesc BuffDesc;
    BuffDesc.Name = "Uniform buffer";
    BuffDesc.BindFlags = BIND_UNIFORM_BUFFER;
    BuffDesc.Usage = USAGE_DYNAMIC;
    BuffDesc.uiSizeInBytes = sizeof(ShaderConstants);
    BuffDesc.CPUAccessFlags = CPU_ACCESS_WRITE;
    m_pDevice->CreateBuffer( BuffDesc, BufferData(), &m_pConstantBuffer );

    Similarly, to create a texture, populate the TextureDesc structure and call IRenderDevice::CreateTexture() as in the following example:

    TextureDesc TexDesc;
    TexDesc.Name = "My texture 2D";
    TexDesc.Type = TEXTURE_TYPE_2D;
    TexDesc.Width = 1024;
    TexDesc.Height = 1024;
    TexDesc.Format = TEX_FORMAT_RGBA8_UNORM;
    TexDesc.Usage = USAGE_DEFAULT;
    TexDesc.BindFlags = BIND_SHADER_RESOURCE | BIND_RENDER_TARGET | BIND_UNORDERED_ACCESS;
    TexDesc.Name = "Sample 2D Texture";
    m_pRenderDevice->CreateTexture( TexDesc, TextureData(), &m_pTestTex );

    Initializing Pipeline State
    Diligent Engine follows the Direct3D12 style of configuring the graphics/compute pipeline. One big Pipeline State Object (PSO) encompasses all required states (all shader stages, input layout description, depth-stencil, rasterizer and blend state descriptions, etc.).

    Creating Shaders
    To create a shader, populate the ShaderCreationAttribs structure. An important member is ShaderCreationAttribs::SourceLanguage. The following are valid values for this member:
    - SHADER_SOURCE_LANGUAGE_DEFAULT - The shader source format matches the underlying graphics API: HLSL for D3D11 or D3D12 mode, and GLSL for OpenGL and OpenGLES modes.
    - SHADER_SOURCE_LANGUAGE_HLSL - The shader source is in HLSL. For OpenGL and OpenGLES modes, the source code will be converted to GLSL. See the shader converter for details.
    - SHADER_SOURCE_LANGUAGE_GLSL - The shader source is in GLSL. There is currently no GLSL-to-HLSL converter.

    To allow grouping of resources based on the frequency of expected change, Diligent Engine introduces a classification of shader variables:
    - Static variables (SHADER_VARIABLE_TYPE_STATIC) are variables that are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera attributes or global light attributes constant buffers.
    - Mutable variables (SHADER_VARIABLE_TYPE_MUTABLE) define resources that are expected to change on a per-material frequency. Examples may include diffuse textures, normal maps, etc.
    - Dynamic variables (SHADER_VARIABLE_TYPE_DYNAMIC) are expected to change frequently and randomly.

    This post describes the resource binding model in Diligent Engine.
    The following is an example of shader initialization:

    ShaderCreationAttribs Attrs;
    Attrs.Desc.Name = "MyPixelShader";
    Attrs.FilePath = "MyShaderFile.fx";
    Attrs.SearchDirectories = "shaders;shaders\\inc;";
    Attrs.EntryPoint = "MyPixelShader";
    Attrs.Desc.ShaderType = SHADER_TYPE_PIXEL;
    Attrs.SourceLanguage = SHADER_SOURCE_LANGUAGE_HLSL;
    BasicShaderSourceStreamFactory BasicSSSFactory(Attrs.SearchDirectories);
    Attrs.pShaderSourceStreamFactory = &BasicSSSFactory;

    ShaderVariableDesc ShaderVars[] =
    {
        {"g_StaticTexture", SHADER_VARIABLE_TYPE_STATIC},
        {"g_MutableTexture", SHADER_VARIABLE_TYPE_MUTABLE},
        {"g_DynamicTexture", SHADER_VARIABLE_TYPE_DYNAMIC}
    };
    Attrs.Desc.VariableDesc = ShaderVars;
    Attrs.Desc.NumVariables = _countof(ShaderVars);
    Attrs.Desc.DefaultVariableType = SHADER_VARIABLE_TYPE_STATIC;

    StaticSamplerDesc StaticSampler;
    StaticSampler.Desc.MinFilter = FILTER_TYPE_LINEAR;
    StaticSampler.Desc.MagFilter = FILTER_TYPE_LINEAR;
    StaticSampler.Desc.MipFilter = FILTER_TYPE_LINEAR;
    StaticSampler.TextureName = "g_MutableTexture";
    Attrs.Desc.NumStaticSamplers = 1;
    Attrs.Desc.StaticSamplers = &StaticSampler;

    ShaderMacroHelper Macros;
    Macros.AddShaderMacro("USE_SHADOWS", 1);
    Macros.AddShaderMacro("NUM_SHADOW_SAMPLES", 4);
    Macros.Finalize();
    Attrs.Macros = Macros;

    RefCntAutoPtr<IShader> pShader;
    m_pDevice->CreateShader( Attrs, &pShader );

    Creating the Pipeline State Object
    To create a pipeline state object, define an instance of the PipelineStateDesc structure. The structure defines the pipeline specifics, such as whether the pipeline is a compute pipeline, the number and format of render targets, and the depth-stencil format:

    // This is a graphics pipeline
    PSODesc.IsComputePipeline = false;
    PSODesc.GraphicsPipeline.NumRenderTargets = 1;
    PSODesc.GraphicsPipeline.RTVFormats[0] = TEX_FORMAT_RGBA8_UNORM_SRGB;
    PSODesc.GraphicsPipeline.DSVFormat = TEX_FORMAT_D32_FLOAT;

    The structure also defines the depth-stencil, rasterizer, blend state, input layout and other parameters. For instance, the rasterizer state can be defined as in the code snippet below:

    // Init rasterizer state
    RasterizerStateDesc &RasterizerDesc = PSODesc.GraphicsPipeline.RasterizerDesc;
    RasterizerDesc.FillMode = FILL_MODE_SOLID;
    RasterizerDesc.CullMode = CULL_MODE_NONE;
    RasterizerDesc.FrontCounterClockwise = True;
    RasterizerDesc.ScissorEnable = True;
    //RSDesc.MultisampleEnable = false; // do not allow msaa (fonts would be degraded)
    RasterizerDesc.AntialiasedLineEnable = False;

    When all fields are populated, call IRenderDevice::CreatePipelineState() to create the PSO:

    m_pDev->CreatePipelineState(PSODesc, &m_pPSO);

    Binding Shader Resources
    Shader resource binding in Diligent Engine is based on grouping variables into 3 different groups (static, mutable and dynamic). Static variables are variables that are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera attributes or global light attributes constant buffers.
    They are bound directly to the shader object:

    PixelShader->GetShaderVariable( "g_tex2DShadowMap" )->Set( pShadowMapSRV );

    Mutable and dynamic variables are bound via a new object called a Shader Resource Binding (SRB), which is created by the pipeline state:

    m_pPSO->CreateShaderResourceBinding(&m_pSRB);

    Dynamic and mutable resources are then bound through the SRB object:

    m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "tex2DDiffuse")->Set(pDiffuseTexSRV);
    m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "cbRandomAttribs")->Set(pRandomAttrsCB);

    The difference between mutable and dynamic resources is that mutable ones can only be set once for every instance of a shader resource binding, while dynamic resources can be set multiple times. It is important to properly set the variable type, as this may affect performance. Static variables are generally the most efficient, followed by mutable; dynamic variables are the most expensive from a performance point of view. This post explains shader resource binding in more detail.

    Setting the Pipeline State and Invoking Draw Command
    Before any draw command can be invoked, all required vertex and index buffers as well as the pipeline state should be bound to the device context:

    // Clear render target
    const float zero[4] = {0, 0, 0, 0};
    m_pContext->ClearRenderTarget(nullptr, zero);

    // Set vertex and index buffers
    IBuffer *buffer[] = {m_pVertexBuffer};
    Uint32 offsets[] = {0};
    Uint32 strides[] = {sizeof(MyVertex)};
    m_pContext->SetVertexBuffers(0, 1, buffer, strides, offsets, SET_VERTEX_BUFFERS_FLAG_RESET);
    m_pContext->SetIndexBuffer(m_pIndexBuffer, 0);
    m_pContext->SetPipelineState(m_pPSO);

    Also, all shader resources must be committed to the device context:

    m_pContext->CommitShaderResources(m_pSRB, COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES);

    When all required states and resources are bound, IDeviceContext::Draw() can be used to execute a draw command, or IDeviceContext::DispatchCompute() can be used to execute a compute command. Note that for a draw command the graphics pipeline must be bound, and for a dispatch command the compute pipeline must be bound. Draw() takes a DrawAttribs structure as an argument. The structure members define all attributes required to perform the command (primitive topology, number of vertices or indices, whether the draw call is indexed or not, instanced or not, indirect or not, etc.). For example:

    DrawAttribs attrs;
    attrs.IsIndexed = true;
    attrs.IndexType = VT_UINT16;
    attrs.NumIndices = 36;
    attrs.Topology = PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
    pContext->Draw(attrs);

    Build Instructions
    Please visit this page for detailed build instructions.

    Samples
    The engine contains two graphics samples that demonstrate how the API can be used. The AntTweakBar sample demonstrates how to use the AntTweakBar library to create a simple user interface; it can also be thought of as Diligent Engine's "Hello World" example. The Atmospheric scattering sample is a more advanced one. It demonstrates how Diligent Engine can be used to implement various rendering tasks: loading textures from files, using complex shaders, rendering to textures, using compute shaders and unordered access views, etc. The engine also includes an Asteroids performance benchmark based on this demo developed by Intel. It renders 50,000 unique textured asteroids and lets you compare the performance of the D3D11 and D3D12 implementations. Every asteroid is a combination of one of 1000 unique meshes and one of 10 unique textures.
    Integration with Unity
    Diligent Engine supports integration with Unity through the Unity low-level native plugin interface. The engine relies on Native API Interoperability to attach to the graphics API initialized by Unity. After the Diligent Engine device and context are created, they can be used as usual to create resources and issue rendering commands. GhostCubePlugin shows an example of how Diligent Engine can be used to render a ghost cube that is only visible as a reflection in a mirror.