Search the Community

Showing results for tags 'DX11' in content posted in Graphics and GPU Programming.
Found 1348 results

  1. Hi, new here. I need some help. My fiancée and I like to play a mobile game online that runs in real time. We are both always working, but when we have free time we like to play. We don't always have time throughout the day to queue buildings, troops, upgrades, etc. I was told to look into DLL injection and OpenGL/DirectX hooking. Is that right? Is this what I need to learn? How do I read or modify the Android files, or get at the in-game tags/variables for the game I want? Any assistance would be most appreciated. I've been everywhere and it seems no one knows, or is too lazy to help me out. It would be nice to have some assistance for once. I don't know what I need to learn, so links to topics I should study posted in the comments would be very helpful. Anything to get me started. Thanks, Dejay Hextrix
  2. Hello, I want to improve the performance of my game (engine), and some of you helped me build a GPU profiler. After creating the GPU profiler, I started to measure the time my GPU needs per frame and refined the measurements to find my bottleneck.

Searching for the bottleneck: Rendering a small scene in an idle state takes around 15.38 ms per frame. 13.54 ms (88.04%) are spent rendering the scene, 1.57 ms (10.22%) are spent in the SwapChain.Present call (no VSync!), and the rest is spent on other tasks like rendering the UI. I investigated the scene rendering further, since it takes over 88% of my GPU frame time. When rendering my scene, most of the time (80.97%) is spent rendering my models; the rest goes to rendering the background/skybox, updating animation data, updating the pixel shader constant buffer, etc. It wasn't really surprising that most of the time goes to my models, so I refined my measurements further to find the actual bottleneck.

In my example scene, I have five animated NPCs. When rendering these NPCs, most actions are almost free: setting the proper shaders and the input layout (0.11%), updating vertex shader constant buffers (0.32%), setting textures (0.24%), and setting vertex and index buffers (0.28%). However, the rest of the GPU time (99.05%!) is spent in two function calls: DrawIndexed and DrawIndexedInstanced. I searched this forum and the web for other articles and threads about these functions, but I haven't found much useful information. I use SharpDX and .NET Framework 4.5 to develop my game (engine). The developer of SharpDX said that "The method DrawIndexed in SharpDX is a direct call to DirectX" (Source). Since DirectX 11 is widely used and SharpDX is "only" a wrapper around DirectX functions, I assume the problem is in my code.

How I render my scene: I render one model after another. Each model has one or more parts and one or more positions. For example, a human model has parts like head, hands, legs, torso, etc., and may be placed in different locations (on the couch, on a street, ...). For static elements like furniture, houses, etc. I use instancing, because the positions never change at run-time. Dynamic models like humans and monsters don't use instancing, because their positions change over time. When rendering a model, I use this workflow:
  • Set vertex and pixel shaders, if they need to be updated (e.g. PBR shaders, simple shader, depth info shaders, ...)
  • Set animation data as a constant buffer in the vertex shader, if the model is animated
  • Set the generic vertex shader constant buffer (world matrix, etc.)
  • Render all parts of the model. For each part:
    • Set diffuse, normal, specular and emissive texture shader views
    • Set the vertex buffer
    • Set the index buffer
    • Call DrawIndexedInstanced for instanced models and DrawIndexed for the rest

What's the problem: After my GPU profiling, I know that over 99% of the rendering time for a single model is spent in the DrawIndexedInstanced and DrawIndexed calls. But why do they take so long? Do I have to try to optimize my vertex or pixel shaders? I don't use other types of shaders at the moment. "Le Comte du Merde-fou" suggested in this post merging regions of vertices into larger vertex buffers to reduce the number of draw calls. While this makes sense to me, it does not explain why rendering my five (!) animated models takes that much GPU time.

To make sure I wasn't analysing the wrong thing, I made sure not to use the D3D11_CREATE_DEVICE_DEBUG flag and to run a Release build in Visual Studio, as suggested by Hodgman in this forum thread. My engine does its job: multi-texturing, animation, soft shadowing, instancing, etc. are all implemented, but I need to reduce the GPU load for performance reasons. Each frame takes less than 3 ms of CPU time, by the way, so I believe the problem is on the GPU side.
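For reference, the draw-call merging suggested above usually takes the shape of instancing: per-instance world matrices live in a second vertex buffer, and a single DrawIndexedInstanced replaces one DrawIndexed per object. A minimal C++ sketch (generic D3D11, not the poster's SharpDX engine; all names are hypothetical):

    #include <d3d11.h>
    #include <DirectXMath.h>

    // Draw N copies of one mesh with a single instanced call. Assumes the
    // input layout declares the world matrix as four per-instance float4
    // elements (D3D11_INPUT_PER_INSTANCE_DATA).
    void DrawMeshInstanced(ID3D11DeviceContext* ctx,
                           ID3D11Buffer* vb, UINT vbStride,
                           ID3D11Buffer* instanceVB, // one XMFLOAT4X4 per instance
                           ID3D11Buffer* ib, UINT indexCount, UINT instanceCount)
    {
        ID3D11Buffer* vbs[2] = { vb, instanceVB };
        UINT strides[2] = { vbStride, sizeof(DirectX::XMFLOAT4X4) };
        UINT offsets[2] = { 0, 0 };
        ctx->IASetVertexBuffers(0, 2, vbs, strides, offsets);
        ctx->IASetIndexBuffer(ib, DXGI_FORMAT_R16_UINT, 0);
        // One submission covers all instances instead of one DrawIndexed each.
        ctx->DrawIndexedInstanced(indexCount, instanceCount, 0, 0, 0);
    }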
  3. I want to change the sampling behaviour to SampleLevel(coord, ddx(coord.y).xx, ddy(coord.y).xx). I was just wondering whether that's possible without explicit shader code, e.g. with some flags or the like?
  4. In some situations, my game starts to "lag" on older computers. I wanted to find the bottlenecks and optimize my game by searching for flaws in the shaders and in the layer between CPU and GPU. My first step was to measure the time my render function needs for its tasks. Every second I wrote the accumulated times of each task to my console window. Per second, it takes around:
  • 170 ms to call the render functions for all models (including setting shader resources, updating constant buffers, drawing all indexed and non-indexed vertices, etc.)
  • 40 ms to render the UI
  • 790 ms to call SwapChain.Present
  • <1 ms to do the rest (updating structures, etc.)
In my swap chain description I set a frame rate of 60 Hz, if it's supported by the computer. It made sense to me that the Present function waits some time before it starts the next frame. However, I wanted to check whether this might be a problem for me. After a web search I found articles like this one. My drivers are up-to-date, so that's no issue. I installed Microsoft's PIX, but I was unable to use it: I could configure my game for x64, but PIX is not able to process DirectX 11. After getting only error messages, I installed NVIDIA's Nsight. After adjusting my game and installing all components, I couldn't get a proper result, because my game freezes after a few frames. I haven't figured out why; there is no exception or error message, and other debug mechanisms like log messages and break points tell me the game freezes at the end of the render function after a few frames. So I looked for another profiling tool and found Jeremy's GPUProfiler. However, the information returned by this tool is too basic to get in-depth knowledge about my performance issues. Can anyone recommend a GPU profiler, or any other tool, that might help me find bottlenecks in my game and/or that is able to indicate performance problems in my shaders? My custom graphics engine can handle things like multi-texturing, instancing, soft shadowing, animation, etc. However, I am pretty sure there are things I can optimize! I am using SharpDX to develop a game (engine) based on DirectX 11 with .NET Framework 4.5. My graphics card is from NVIDIA and my processor is made by Intel.
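In case it helps, a home-grown D3D11 GPU timer is fairly simple to build from timestamp queries and avoids external tools entirely. A minimal sketch in plain C++ (the poster uses SharpDX, but the API surface is the same; a real profiler would poll the results a frame or two later instead of spinning):

    #include <d3d11.h>

    // Measure the GPU time spent in one draw call with timestamp queries.
    void MeasureDraw(ID3D11Device* dev, ID3D11DeviceContext* ctx, UINT indexCount)
    {
        ID3D11Query *disjoint, *tsBegin, *tsEnd;
        D3D11_QUERY_DESC qd = {};
        qd.Query = D3D11_QUERY_TIMESTAMP_DISJOINT;
        dev->CreateQuery(&qd, &disjoint);
        qd.Query = D3D11_QUERY_TIMESTAMP;
        dev->CreateQuery(&qd, &tsBegin);
        dev->CreateQuery(&qd, &tsEnd);

        ctx->Begin(disjoint);               // brackets the timed region
        ctx->End(tsBegin);                  // timestamp queries only use End()
        ctx->DrawIndexed(indexCount, 0, 0); // the work being measured
        ctx->End(tsEnd);
        ctx->End(disjoint);

        D3D11_QUERY_DATA_TIMESTAMP_DISJOINT dj;
        UINT64 t0, t1;
        while (ctx->GetData(disjoint, &dj, sizeof(dj), 0) != S_OK) {}
        while (ctx->GetData(tsBegin,  &t0, sizeof(t0), 0) != S_OK) {}
        while (ctx->GetData(tsEnd,    &t1, sizeof(t1), 0) != S_OK) {}

        if (!dj.Disjoint)
        {
            double ms = double(t1 - t0) / double(dj.Frequency) * 1000.0;
            // ms = GPU time for the draw, in milliseconds
        }
        tsEnd->Release(); tsBegin->Release(); disjoint->Release();
    }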
  5. I was wondering if someone could explain this to me. I'm working on using the Windows WIC APIs to load textures for DirectX 11. I see that some WIC pixel formats do not directly match a DXGI format used in DirectX; in those cases, the original WIC pixel format is converted into a WIC pixel format that does directly match a DXGI format. Doing the conversion is easy, but I do not understand the reasoning behind two of the conversions in Microsoft's guide on this topic: why should GUID_WICPixelFormat40bppCMYKAlpha be converted into GUID_WICPixelFormat64bppRGBA, and why should GUID_WICPixelFormat80bppCMYKAlpha also be converted into GUID_WICPixelFormat64bppRGBA?

In one case I would think that GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat32bppRGBA and GUID_WICPixelFormat80bppCMYKAlpha to GUID_WICPixelFormat64bppRGBA, because the black channel (K) values would get folded / "swallowed" into the CMY channels. In the second case I would think that GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat64bppRGBA and GUID_WICPixelFormat80bppCMYKAlpha to GUID_WICPixelFormat128bppRGBA, because the black channel (K) bits would get redistributed amongst the remaining four channels (CMYA), and those "new bits" would fit in the GUID_WICPixelFormat64bppRGBA and GUID_WICPixelFormat128bppRGBA formats. But seeing as there is no GUID_WICPixelFormat128bppRGBA format, that case is null and void anyway. I basically do not understand why Microsoft says GUID_WICPixelFormat40bppCMYKAlpha and GUID_WICPixelFormat80bppCMYKAlpha should both convert to GUID_WICPixelFormat64bppRGBA in the end.
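For context, the mechanical side of the conversion the guide describes is a single call; whatever the rationale for the target format, the code looks roughly like this sketch (error handling trimmed; 'frame' is assumed to be an IWICBitmapSource obtained from a decoder):

    #include <wincodec.h>

    // Convert a decoded WIC frame (e.g. 40bpp or 80bpp CMYK+Alpha) to
    // 64bppRGBA, which maps to DXGI_FORMAT_R16G16B16A16_UNORM.
    IWICBitmapSource* ConvertTo64bppRGBA(IWICBitmapSource* frame)
    {
        IWICBitmapSource* converted = nullptr;
        HRESULT hr = WICConvertBitmapSource(GUID_WICPixelFormat64bppRGBA,
                                            frame, &converted);
        return SUCCEEDED(hr) ? converted : nullptr;
    }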
  6. Does the buffer slot number matter in ID3D11DeviceContext::PSSetConstantBuffers()? I added five or six constant buffers to my framework, and later realized I had set the start-slot parameter to either 0 or 1 in all of them - but they still all worked! Curious why that is; should the slots be set up to correspond to the number of constant buffers? Similarly, inside the buffer structs used to pass info into the HLSL shader, I added padding inside the C++ struct so that a struct containing a float3 is 16 bytes, but the declaration of the same struct inside the HLSL shader file was missing the padding value - and it still worked! Do they need to be consistent or not? Thanks.

    struct CameraBufferType
    {
        XMFLOAT3 cameraPosition;
        float padding;
    };
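For what it's worth, the first parameter of PSSetConstantBuffers is the start slot, and it pairs with the register(b#) annotation on the HLSL side; if the shader declares no explicit registers, the compiler assigns slots itself (visible through shader reflection), which is one way mismatched-looking code can still appear to work. A small sketch of the correspondence (names hypothetical):

    #include <d3d11.h>

    // Bind 'cameraBuffer' to pixel-shader constant buffer slot 1. The slot
    // must line up with the cbuffer's register in HLSL.
    void BindCameraBuffer(ID3D11DeviceContext* ctx, ID3D11Buffer* cameraBuffer)
    {
        ctx->PSSetConstantBuffers(1 /*StartSlot*/, 1, &cameraBuffer);
    }

    // Matching HLSL side, for reference:
    //   cbuffer CameraBuffer : register(b1)
    //   {
    //       float3 cameraPosition; // HLSL rounds cbuffers up to a 16-byte
    //       float  padding;        // multiple anyway, so explicit padding in
    //   };                         // the C++ struct keeps the sizes honest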
  7. Hi guys, could anyone experienced with DX11 look at my graphics.cpp class? I got fonts rendering correctly with the painter's algorithm - painting over the other 3D stuff each frame. However, whenever I turn the camera left or right, the fonts get squashed narrower and narrower, then disappear completely. It seems like the fix must be a very small change - untying their rendering from the camera direction - but I just can't figure out how to do it under all this rendering complexity. Any tips would be helpful, thanks. https://github.com/mister51213/DirectX11Engine/blob/master/DirectX11Engine/Graphics.cpp
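Without having dug through the linked repo: the usual way to keep text glued to the screen is to draw the UI pass with an identity view matrix and an orthographic projection instead of the scene camera's matrices. A sketch using DirectXMath (names hypothetical):

    #include <DirectXMath.h>
    using namespace DirectX;

    // Matrices for a screen-space UI pass: text drawn with these will not
    // skew or vanish when the 3D camera rotates.
    void GetUIMatrices(float screenW, float screenH, XMMATRIX& view, XMMATRIX& proj)
    {
        view = XMMatrixIdentity();                      // no camera rotation
        proj = XMMatrixOrthographicLH(screenW, screenH, // pixel-aligned
                                      0.1f, 100.0f);    // near/far for the UI
    }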
  8. SOLVED: I had written Dispatch(32, 24, 0) instead of Dispatch(32, 24, 1).

I'm attempting to implement some basic post-processing in my "engine". I think I've understood the HLSL side of the compute shader, but I'm at a loss as to how to actually get/use its output for rendering to the screen. Assume I'm doing something to a UAV in my CS:

    RWTexture2D<float4> InputOutputMap : register(u0);

I want that texture to essentially "be" the backbuffer. I'm pretty certain I'm doing something wrong when I create the views (what I think I'm doing is having the backbuffer bound as render target as well as UAV, and then using it in my CS):

    DXGI_SWAP_CHAIN_DESC scd;
    ZeroMemory(&scd, sizeof(DXGI_SWAP_CHAIN_DESC));
    scd.BufferCount = 1;
    scd.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    scd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT | DXGI_USAGE_SHADER_INPUT | DXGI_USAGE_UNORDERED_ACCESS;
    scd.OutputWindow = wndHandle;
    scd.SampleDesc.Count = 1;
    scd.Windowed = TRUE;

    HRESULT hr = D3D11CreateDeviceAndSwapChain(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, NULL, NULL, NULL,
        D3D11_SDK_VERSION, &scd, &gSwapChain, &gDevice, NULL, &gDeviceContext);

    // get the address of the back buffer
    ID3D11Texture2D* pBackBuffer = nullptr;
    gSwapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), (LPVOID*)&pBackBuffer);

    // use the back buffer address to create the render target
    gDevice->CreateRenderTargetView(pBackBuffer, NULL, &gBackbufferRTV);

    // set the render target as the back buffer
    CreateDepthStencilBuffer();
    gDeviceContext->OMSetRenderTargets(1, &gBackbufferRTV, depthStencilView);

    // UAV for compute shader
    D3D11_UNORDERED_ACCESS_VIEW_DESC uavd;
    ZeroMemory(&uavd, sizeof(uavd));
    uavd.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    uavd.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
    uavd.Texture2D.MipSlice = 1;
    gDevice->CreateUnorderedAccessView(pBackBuffer, &uavd, &gUAV);
    pBackBuffer->Release();

After I render the scene, I dispatch like this:

    gDeviceContext->OMSetRenderTargets(0, NULL, NULL);
    m_vShaders["cs1"]->Bind();
    gDeviceContext->CSSetUnorderedAccessViews(0, 1, &gUAV, 0);
    gDeviceContext->Dispatch(32, 24, 0); // hard coded
    ID3D11UnorderedAccessView* nullview = { nullptr };
    gDeviceContext->CSSetUnorderedAccessViews(0, 1, &nullview, 0);
    gDeviceContext->OMSetRenderTargets(1, &gBackbufferRTV, depthStencilView);
    gSwapChain->Present(0, 0);

Worth noting: the scene is rendered as usual, but I don't get any results from the CS (a simple Gaussian blur). I'm sure it's something fairly basic I'm doing wrong; perhaps my understanding of render targets / views / what have you is just completely wrong and my approach makes no sense. If someone with more experience could point me in the right direction, I would really appreciate it! On a side note, I'd really like to learn more about this kind of stuff. I can really see the potential of the CS, as well as of rendering to textures and using them for whatever in the engine, so I would love it if you know some good resources I could read about this! Thank you <3

P.S. I excluded the .hlsl since I can't imagine that being the issue, but if you think you need it to help me, just ask.
P.P.S. As you can see this is my first post; I do have another account, but I can't log in with it because gamedev.net just keeps asking me to accept the terms and then logs me out when I do, over and over.
  9. I was wondering if anyone could explain the depth buffer and the depth stencil state comparison function to me, as I'm a little confused.

So I have set up a depth stencil state where the DepthFunc is set to D3D11_COMPARISON_LESS, but what am I actually comparing here? What is actually written to the buffer - the pixel that should show up in front? I have these two quad faces, a red face and a blue face. The blue face is further away from the viewer, with a Z value of -100.0f, while the red face is close to the viewer, with a Z value of 0.0f. When DepthFunc is set to D3D11_COMPARISON_LESS, the red face shows up in front of the blue face, as it should based on the Z values. BUT if I change the DepthFunc to D3D11_COMPARISON_LESS_EQUAL, the blue face shows in front of the red face. This does not make sense to me; I would think that with D3D11_COMPARISON_LESS_EQUAL the red face would still show up in front of the blue face, as its Z value is still closer to the viewer. Am I thinking of this comparison function all wrong?

Vertex data, just in case:

    // Vertex data that makes up the 2 faces
    Vertex verts[] = {
        // Red face
        Vertex(Vector4(0.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(0.0f, 0.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(0.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),
        Vertex(Vector4(100.0f, 100.0f, 0.0f), Color(1.0f, 0.0f, 0.0f)),

        // Blue face
        Vertex(Vector4(0.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(0.0f, 0.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(0.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
        Vertex(Vector4(100.0f, 100.0f, -100.0f), Color(0.0f, 0.0f, 1.0f)),
    };
  10. Hi all, first-time poster here, although I've been reading posts here for quite a while. This place has been invaluable for learning graphics programming - thanks for a great resource! Right now, I'm working on a graphics abstraction layer for .NET which supports D3D11, Vulkan, and OpenGL at the moment. I have implemented most of my planned features already, and things are working well. Some remaining features I am planning are compute shaders and some flavor of read-write shader resources. At the moment, my shaders can just get simple read-only access to a uniform (or constant) buffer, a texture, or a sampler. Unfortunately, I'm having a tough time grasping the distinctions between all of the different kinds of read-write resources that are available. In D3D alone, there seem to be five or six different kinds of resources with similar but different characteristics. On top of that, I get the impression that some of them are more or less "obsoleted" by the newer kinds and don't have much of a place in modern code. There seem to be a few pivots:
  • The data source/destination (buffer or texture)
  • Read-write or read-only
  • Structured or unstructured (?)
  • Ordered vs. unordered (?)
These are just my observations based on a lot of MSDN and OpenGL doc reading. For my library, I'm not interested in exposing every possibility to the user - just trying to find a good "middle ground" that can be represented cleanly across APIs and is good enough for common scenarios. Can anyone give a sort of overview of the different options, and perhaps compare/contrast the concepts between Direct3D, OpenGL, and Vulkan? I'd also be very interested in hearing how other folks have abstracted these concepts in their libraries.
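To make the first two pivots concrete on the D3D11 side: the same structured buffer can be exposed read-only through a shader resource view (HLSL StructuredBuffer<T>, t register) or read-write through an unordered access view (RWStructuredBuffer<T>, u register). A minimal sketch, not taken from the poster's library:

    #include <d3d11.h>

    // Create a structured buffer plus a read-only SRV and a read-write UAV.
    HRESULT CreateRWStructured(ID3D11Device* dev, UINT elemSize, UINT elemCount,
                               ID3D11Buffer** buf,
                               ID3D11ShaderResourceView** srv,
                               ID3D11UnorderedAccessView** uav)
    {
        D3D11_BUFFER_DESC bd = {};
        bd.ByteWidth           = elemSize * elemCount;
        bd.Usage               = D3D11_USAGE_DEFAULT;
        bd.BindFlags           = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
        bd.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
        bd.StructureByteStride = elemSize;
        HRESULT hr = dev->CreateBuffer(&bd, nullptr, buf);
        if (FAILED(hr)) return hr;
        hr = dev->CreateShaderResourceView(*buf, nullptr, srv);    // read-only view
        if (FAILED(hr)) return hr;
        return dev->CreateUnorderedAccessView(*buf, nullptr, uav); // read-write view
    }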
  11. If I do a buffer update with MAP_NO_OVERWRITE or MAP_DISCARD, can I just write to the buffer after I have called Unmap() on it? It seems to work fine for me (NVIDIA driver), but is it actually legal to do so? I have a graphics device wrapper and I don't want to expose Map/Unmap, but instead have a function like:

    void* AllocateFromRingBuffer(GPUBuffer* buffer, uint size, uint& offset);

This function would just call Map on the buffer, then Unmap immediately, and then return the address of the buffer. It usually does a MAP_NO_OVERWRITE, but sometimes it is a WRITE_DISCARD (when the buffer wraps around). Previously, the function expected the data up front and would copy it into the buffer between Map and Unmap, but now I want to extend its functionality so that it just returns an address to write to.
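For comparison, a variant that stays within documented behavior keeps the mapping open until the caller has written its data: hand out the pointer from Map and defer the Unmap until just before the buffer is used for drawing. A sketch with hypothetical bookkeeping:

    #include <d3d11.h>

    // Ring-buffer allocation that defers Unmap instead of writing after it.
    // The caller fills the returned pointer, and Unmap(buffer, 0) must be
    // called before the buffer is bound for rendering.
    void* AllocateFromRingBuffer(ID3D11DeviceContext* ctx, ID3D11Buffer* buffer,
                                 UINT capacity, UINT size, UINT& head, UINT& offset)
    {
        D3D11_MAP mapType = D3D11_MAP_WRITE_NO_OVERWRITE;
        if (head + size > capacity)   // wrap around: orphan the old contents
        {
            head = 0;
            mapType = D3D11_MAP_WRITE_DISCARD;
        }
        D3D11_MAPPED_SUBRESOURCE mapped = {};
        if (FAILED(ctx->Map(buffer, 0, mapType, 0, &mapped)))
            return nullptr;
        offset = head;
        head += size;
        return static_cast<char*>(mapped.pData) + offset;
    }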
  12. Trying to write a multitexturing shader in DirectX 11 - three textures work fine, but the fourth gets sampled as black! Could you please look at textureClass.cpp, line 79? I'm guessing its D3D11_TEXTURE2D_DESC settings are wrong, but I have no idea how to set it up right. I tried changing ArraySize from 1 to 4, but that does nothing. If that's not the issue, please look at LightShader_ps - maybe I'm doing something wrong there? Otherwise, no idea.

    // Setup the description of the texture.
    textureDesc.Height = height;
    textureDesc.Width = width;
    textureDesc.MipLevels = 0;
    textureDesc.ArraySize = 1;
    textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    textureDesc.SampleDesc.Count = 1;
    textureDesc.SampleDesc.Quality = 0;
    textureDesc.Usage = D3D11_USAGE_DEFAULT;
    textureDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET;
    textureDesc.CPUAccessFlags = 0;
    textureDesc.MiscFlags = D3D11_RESOURCE_MISC_GENERATE_MIPS;

Please help, thanks. https://github.com/mister51213/DirectX11Engine/blob/master/DirectX11Engine/Texture.cpp
  13. Can someone help out with this? The code builds, but I get "Exception thrown: Read Access Violation" and it says my index buffer was a nullptr. My code and a screenshot of the error are attached below. Any help is greatly appreciated.

    //-------------------------------
    // Header files
    //-------------------------------
    #include "manager.h"
    #include "renderer.h"
    #include "dome.h"
    #include "camera.h"

    //-------------------------------
    // Constructors
    //-------------------------------
    CDome::CDome()
    {
        m_pIndxBuff = nullptr;
        m_pVtxBuff = nullptr;
        m_HorizontalGrid = NULL;
        m_VerticalGrid = NULL;

        // Set the world position, scale and rotation
        m_Scale = D3DXVECTOR3(1.0f, 1.0f, 1.0f);
        m_Pos = D3DXVECTOR3(0.0f, 0.0f, 0.0f);
        //m_Rotate = 0.0f;
    }

    CDome::CDome(int HorizontalGrid, int VerticalGrid, float Length)
    {
        m_pIndxBuff = nullptr;
        m_pVtxBuff = nullptr;
        m_HorizontalGrid = HorizontalGrid;
        m_VerticalGrid = VerticalGrid;

        // Set the world position, scale and rotation
        m_Scale = D3DXVECTOR3(1.0f, 1.0f, 1.0f);
        m_Pos = D3DXVECTOR3(0.0f, 0.0f, 0.0f);
        m_Length = Length;
    }

    CDome::CDome(int HorizontalGrid, int VerticalGrid, float Length, D3DXVECTOR3 Pos)
    {
        m_pIndxBuff = nullptr;
        m_pVtxBuff = nullptr;
        m_HorizontalGrid = HorizontalGrid;
        m_VerticalGrid = VerticalGrid;

        // Set the world position, scale and rotation
        m_Scale = D3DXVECTOR3(1.0f, 1.0f, 1.0f);
        m_Pos = Pos;
        m_Length = Length;
    }

    //-------------------------------
    // Destructor
    //-------------------------------
    CDome::~CDome()
    {
    }

    //-------------------------------
    // Initialization
    //-------------------------------
    void CDome::Init(void)
    {
        LPDIRECT3DDEVICE9 pDevice;
        pDevice = CManager::GetRenderer()->GetDevice();

        m_VtxNum = (m_HorizontalGrid + 1) * (m_VerticalGrid + 1);
        m_IndxNum = (m_HorizontalGrid * 2 + 2) * m_VerticalGrid + (m_VerticalGrid - 1) * 2;

        // Create the texture
        if (FAILED(D3DXCreateTextureFromFile(pDevice, "data/TEXTURE/dome.jpg", &m_pTexture)))
        {
            MessageBox(NULL, "Couldn't read Texture file destination", "Error Loading Texture", MB_OK | MB_ICONHAND);
        }

        // Create the vertex buffer (sized for the vertex count)
        if (FAILED(pDevice->CreateVertexBuffer(sizeof(VERTEX_3D) * m_VtxNum, D3DUSAGE_WRITEONLY, FVF_VERTEX_3D, D3DPOOL_MANAGED, &m_pVtxBuff, NULL)))
        {
            MessageBox(NULL, "Error making VertexBuffer", "Error", MB_OK);
        }

        // Create the index buffer
        if (FAILED(pDevice->CreateIndexBuffer(sizeof(VERTEX_3D) * m_IndxNum, D3DUSAGE_WRITEONLY, D3DFMT_INDEX16, D3DPOOL_MANAGED, &m_pIndxBuff, NULL)))
        {
            MessageBox(NULL, "Error making IndexBuffer", "Error", MB_OK);
        }

        VERTEX_3D *pVtx;  // pointer to the locked vertex data
        WORD *pIndx;      // pointer to the locked index data

        // Lock the vertex buffer to get a pointer to its data
        m_pVtxBuff->Lock(0, 0, (void**)&pVtx, 0);
        // Lock the index buffer to get a pointer to its data
        m_pIndxBuff->Lock(0, 0, (void**)&pIndx, 0);

        for (int i = 0; i < (m_VerticalGrid + 1); i++)
        {
            for (int j = 0; j < (m_HorizontalGrid + 1); j++)
            {
                pVtx[0].pos = D3DXVECTOR3(
                    m_Length * sinf(i * (D3DX_PI * 0.5f / ((int)m_VerticalGrid - 1))) * sinf(j * (D3DX_PI * 2 / ((int)m_HorizontalGrid - 1))),
                    m_Length * cosf(i * (D3DX_PI * 0.5f / ((int)m_VerticalGrid - 1))),
                    m_Length * sinf(i * (D3DX_PI * 0.5f / ((int)m_VerticalGrid - 1))) * cosf(j * (D3DX_PI * 2 / ((int)m_HorizontalGrid - 1))));

                D3DXVECTOR3 tempNormalize;
                D3DXVec3Normalize(&tempNormalize, &pVtx[0].pos);
                pVtx[0].normal = -tempNormalize;
                pVtx[0].color = D3DXCOLOR(255, 255, 255, 255);
                pVtx[0].tex = D3DXVECTOR2((float)j / (m_HorizontalGrid - 1), (float)i / (m_VerticalGrid - 1));
                pVtx++;
            }
        }

        for (int i = 0; i < m_VerticalGrid; i++)
        {
            if (i != 0)
            {
                pIndx[0] = ((m_HorizontalGrid + 1) * (i + 1));
                pIndx++;
            }
            for (int j = 0; j < (m_HorizontalGrid + 1); j++)
            {
                pIndx[0] = ((m_HorizontalGrid + 1) * (i + 1)) + j;
                pIndx[1] = ((m_HorizontalGrid + 1) * i) + j;
                pIndx += 2;
            }
            if (i + 1 != m_VerticalGrid)
            {
                pIndx[0] = pIndx[-1];
                pIndx++;
            }
        }

        // Unlock the index buffer
        m_pIndxBuff->Unlock();
        // Unlock the vertex buffer
        m_pVtxBuff->Unlock();
    }

    //-------------------------------
    // Cleanup
    //-------------------------------
    void CDome::Uninit(void)
    {
        // Release the vertex buffer
        SAFE_RELEASE(m_pVtxBuff);
        // Release the index buffer
        SAFE_RELEASE(m_pIndxBuff);
        Release();
    }

    //-------------------------------
    // Update
    //-------------------------------
    void CDome::Update(void)
    {
        m_Pos = CManager::GetCamera()->GetCameraPosEye();
    }

    //-------------------------------
    // Draw
    //-------------------------------
    void CDome::Draw(void)
    {
        LPDIRECT3DDEVICE9 pDevice;
        pDevice = CManager::GetRenderer()->GetDevice();

        D3DXMATRIX mtxWorld;
        D3DXMATRIX mtxPos;
        D3DXMATRIX mtxScale;
        D3DXMATRIX mtxRotation;

        // World identity matrix
        D3DXMatrixIdentity(&mtxWorld);

        // 3D scaling matrix
        D3DXMatrixScaling(&mtxScale, m_Scale.x, m_Scale.y, m_Scale.z);
        D3DXMatrixMultiply(&mtxWorld, &mtxWorld, &mtxScale);

        // 3D translation matrix
        D3DXMatrixTranslation(&mtxPos, m_Pos.x, m_Pos.y + 70.0f, m_Pos.z);
        D3DXMatrixMultiply(&mtxWorld, &mtxWorld, &mtxPos);

        // Apply the world transform
        pDevice->SetTransform(D3DTS_WORLD, &mtxWorld);

        // Set the vertex buffer as the data stream
        pDevice->SetStreamSource(0, m_pVtxBuff, 0, sizeof(VERTEX_3D));

        // Set the vertex format
        pDevice->SetFVF(FVF_VERTEX_3D);

        // Set the texture
        pDevice->SetTexture(0, m_pTexture);

        // Set the indices
        pDevice->SetIndices(m_pIndxBuff);

        // Turn lighting off so the vertex colors are visible
        pDevice->SetRenderState(D3DRS_LIGHTING, FALSE);

        // Draw the polygons
        pDevice->DrawIndexedPrimitive(D3DPT_TRIANGLESTRIP, 0, 0, m_VtxNum, 0, m_IndxNum - 2);

        // Turn lighting back on
        pDevice->SetRenderState(D3DRS_LIGHTING, TRUE);
    }

    //-------------------------------
    // Create MeshDome
    //-------------------------------
    CDome *CDome::Create(int HorizontalGrid, int VerticalGrid, float Length)
    {
        CDome *pMeshDome;
        pMeshDome = new CDome(HorizontalGrid, VerticalGrid, Length);
        pMeshDome->Init();
        return pMeshDome;
    }

    CDome *CDome::Create(int HorizontalGrid, int VerticalGrid, float Length, D3DXVECTOR3 Pos)
    {
        CDome *pMeshDome;
        pMeshDome = new CDome(HorizontalGrid, VerticalGrid, Length, Pos);
        pMeshDome->Init();
        return pMeshDome;
    }
  14. I have to learn DirectX for a course I am studying, and I felt this book would be great to learn from: https://www.amazon.co.uk/Introduction-3D-Game-Programming-Directx/dp/1936420228. The trouble is the examples, which are all offered here: http://www.d3dcoder.net/d3d11.htm. They do not work for me. This is a known issue - there is a link on the examples page saying how to fix it - but I'm having difficulty doing so. This is the page with the solution: http://www.d3dcoder.net/Data/Book4/d3d11Win10.htm. The problem happens because the book was released before Windows 10, so the examples need slight fixes in order to work at all. I just can't get these examples working. Would anyone be able to help me get them running? To be clear, I am running Windows 10, which is why the examples show this undesired behaviour. On top of this, if anyone has any suggestions for how I can learn DirectX 11, I would be most grateful. Thanks very much; I really would like to get the examples from the book I mentioned working. I look forward to reading any replies this thread receives. GameDevCoder.

PS - If anyone has noticed, I asked about this about a year ago as well, but that was when I was just dabbling in it. Now I actually need to produce some things with DirectX, so I have to get my head around this. At the time I sort of understood the responses to my thread, but I was never completely sure what was happening with these troublesome examples, so I am really just trying to get to the bottom of it now. If anyone can help me get these examples working, hopefully I can learn DirectX 11 from them.

*SOLUTION* - I was able to get the examples running thanks to the gamedev.net community. Great work, guys. I'm so pleased that I can now learn from this book with the examples running. https://www.gamedev.net/forums/topic/693437-i-need-to-learn-directx-the-examples-for-introduction-to-3d-programming-with-directx-11-by-frank-d-luna-does-not-work-can-anyone-help-me/?do=findComment&comment=5363013
  15. Hello! I would like to introduce Diligent Engine, a project that I've been working on recently. Diligent Engine is a light-weight, cross-platform abstraction layer between the application and the platform-specific graphics API. Its main goal is to take advantage of next-generation APIs such as Direct3D 12 and Vulkan, while at the same time providing support for older platforms via Direct3D 11, OpenGL and OpenGL ES. Diligent Engine exposes a common front end for all supported platforms and provides interoperability with the underlying native API. It also supports integration with Unity and is designed to be used as the graphics subsystem in a standalone game engine, a Unity native plugin, or any other 3D application. It is distributed under the Apache 2.0 license and is free to use. Full source code is available for download on GitHub. The engine contains a shader source code converter that allows shaders authored in HLSL to be translated to GLSL. The engine currently supports Direct3D 11, Direct3D 12, and OpenGL/GLES on Win32, Universal Windows and Android platforms.

API Basics

Initialization

The engine can perform initialization of the API itself or attach to an already existing D3D11/D3D12 device or OpenGL/GLES context. For instance, the following code shows how the engine can be initialized in D3D12 mode:

    #include "RenderDeviceFactoryD3D12.h"
    using namespace Diligent;

    // ...
    GetEngineFactoryD3D12Type GetEngineFactoryD3D12 = nullptr;
    // Load the dll and import the GetEngineFactoryD3D12() function
    LoadGraphicsEngineD3D12(GetEngineFactoryD3D12);
    auto *pFactoryD3D12 = GetEngineFactoryD3D12();

    EngineD3D12Attribs EngD3D12Attribs;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[0] = 1024;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[1] = 32;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[2] = 16;
    EngD3D12Attribs.CPUDescriptorHeapAllocationSize[3] = 16;
    EngD3D12Attribs.NumCommandsToFlushCmdList = 64;

    RefCntAutoPtr<IRenderDevice> pRenderDevice;
    RefCntAutoPtr<IDeviceContext> pImmediateContext;
    SwapChainDesc SwapChainDesc;
    RefCntAutoPtr<ISwapChain> pSwapChain;
    pFactoryD3D12->CreateDeviceAndContextsD3D12(EngD3D12Attribs, &pRenderDevice, &pImmediateContext, 0);
    pFactoryD3D12->CreateSwapChainD3D12(pRenderDevice, pImmediateContext, SwapChainDesc, hWnd, &pSwapChain);

Creating Resources

Device resources are created by the render device. The two main resource types are buffers, which represent linear memory, and textures, which use memory layouts optimized for fast filtering. To create a buffer, populate a BufferDesc structure and call IRenderDevice::CreateBuffer(). The following code creates a uniform (constant) buffer:

    BufferDesc BuffDesc;
    BuffDesc.Name = "Uniform buffer";
    BuffDesc.BindFlags = BIND_UNIFORM_BUFFER;
    BuffDesc.Usage = USAGE_DYNAMIC;
    BuffDesc.uiSizeInBytes = sizeof(ShaderConstants);
    BuffDesc.CPUAccessFlags = CPU_ACCESS_WRITE;
    m_pDevice->CreateBuffer(BuffDesc, BufferData(), &m_pConstantBuffer);

Similarly, to create a texture, populate a TextureDesc structure and call IRenderDevice::CreateTexture(), as in the following example:

    TextureDesc TexDesc;
    TexDesc.Name = "My texture 2D";
    TexDesc.Type = TEXTURE_TYPE_2D;
    TexDesc.Width = 1024;
    TexDesc.Height = 1024;
    TexDesc.Format = TEX_FORMAT_RGBA8_UNORM;
    TexDesc.Usage = USAGE_DEFAULT;
    TexDesc.BindFlags = BIND_SHADER_RESOURCE | BIND_RENDER_TARGET | BIND_UNORDERED_ACCESS;
    m_pRenderDevice->CreateTexture(TexDesc, TextureData(), &m_pTestTex);

Initializing Pipeline State

Diligent Engine follows the Direct3D 12 style of configuring the graphics/compute pipeline: one big Pipeline State Object (PSO) encompasses all required states (all shader stages, input layout description, depth-stencil, rasterizer and blend state descriptions, etc.).

Creating Shaders

To create a shader, populate a ShaderCreationAttribs structure. An important member is ShaderCreationAttribs::SourceLanguage. The following are valid values for this member:
  • SHADER_SOURCE_LANGUAGE_DEFAULT - The shader source format matches the underlying graphics API: HLSL for D3D11 or D3D12 mode, and GLSL for OpenGL and OpenGL ES modes.
  • SHADER_SOURCE_LANGUAGE_HLSL - The shader source is in HLSL. For OpenGL and OpenGL ES modes, the source code will be converted to GLSL. See the shader converter for details.
  • SHADER_SOURCE_LANGUAGE_GLSL - The shader source is in GLSL. There is currently no GLSL-to-HLSL converter.

To allow grouping of resources based on the expected frequency of change, Diligent Engine introduces a classification of shader variables:
  • Static variables (SHADER_VARIABLE_TYPE_STATIC) are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera attributes or global light attributes constant buffers.
  • Mutable variables (SHADER_VARIABLE_TYPE_MUTABLE) define resources that are expected to change on a per-material frequency. Examples may include diffuse textures, normal maps, etc.
  • Dynamic variables (SHADER_VARIABLE_TYPE_DYNAMIC) are expected to change frequently and randomly.

This post describes the resource binding model in Diligent Engine. The following is an example of shader initialization:

    ShaderCreationAttribs Attrs;
    Attrs.Desc.Name = "MyPixelShader";
    Attrs.FilePath = "MyShaderFile.fx";
    Attrs.SearchDirectories = "shaders;shaders\\inc;";
    Attrs.EntryPoint = "MyPixelShader";
    Attrs.Desc.ShaderType = SHADER_TYPE_PIXEL;
    Attrs.SourceLanguage = SHADER_SOURCE_LANGUAGE_HLSL;

    BasicShaderSourceStreamFactory BasicSSSFactory(Attrs.SearchDirectories);
    Attrs.pShaderSourceStreamFactory = &BasicSSSFactory;

    ShaderVariableDesc ShaderVars[] =
    {
        {"g_StaticTexture",  SHADER_VARIABLE_TYPE_STATIC},
        {"g_MutableTexture", SHADER_VARIABLE_TYPE_MUTABLE},
        {"g_DynamicTexture", SHADER_VARIABLE_TYPE_DYNAMIC}
    };
    Attrs.Desc.VariableDesc = ShaderVars;
    Attrs.Desc.NumVariables = _countof(ShaderVars);
    Attrs.Desc.DefaultVariableType = SHADER_VARIABLE_TYPE_STATIC;

    StaticSamplerDesc StaticSampler;
    StaticSampler.Desc.MinFilter = FILTER_TYPE_LINEAR;
    StaticSampler.Desc.MagFilter = FILTER_TYPE_LINEAR;
    StaticSampler.Desc.MipFilter = FILTER_TYPE_LINEAR;
    StaticSampler.TextureName = "g_MutableTexture";
    Attrs.Desc.NumStaticSamplers = 1;
    Attrs.Desc.StaticSamplers = &StaticSampler;

    ShaderMacroHelper Macros;
    Macros.AddShaderMacro("USE_SHADOWS", 1);
    Macros.AddShaderMacro("NUM_SHADOW_SAMPLES", 4);
    Macros.Finalize();
    Attrs.Macros = Macros;

    RefCntAutoPtr<IShader> pShader;
    m_pDevice->CreateShader(Attrs, &pShader);

Creating the Pipeline State Object

To create a pipeline state object, define an instance of the PipelineStateDesc structure. The structure defines the pipeline specifics, such as whether the pipeline is a compute pipeline, and the number and formats of the render targets as well as the depth-stencil format:

    // This is a graphics pipeline
    PSODesc.IsComputePipeline = false;
    PSODesc.GraphicsPipeline.NumRenderTargets = 1;
    PSODesc.GraphicsPipeline.RTVFormats[0] = TEX_FORMAT_RGBA8_UNORM_SRGB;
    PSODesc.GraphicsPipeline.DSVFormat = TEX_FORMAT_D32_FLOAT;

The structure also defines the depth-stencil, rasterizer, blend state, input layout and other parameters. For instance, the rasterizer state can be defined as in the code snippet below:

    // Init rasterizer state
    RasterizerStateDesc &RasterizerDesc = PSODesc.GraphicsPipeline.RasterizerDesc;
    RasterizerDesc.FillMode = FILL_MODE_SOLID;
    RasterizerDesc.CullMode = CULL_MODE_NONE;
    RasterizerDesc.FrontCounterClockwise = True;
    RasterizerDesc.ScissorEnable = True;
    //RasterizerDesc.MultisampleEnable = false; // do not allow msaa (fonts would be degraded)
    RasterizerDesc.AntialiasedLineEnable = False;

When all fields are populated, call IRenderDevice::CreatePipelineState() to create the PSO:

    m_pDev->CreatePipelineState(PSODesc, &m_pPSO);

Binding Shader Resources

Shader resource binding in Diligent Engine is based on grouping variables into the three groups above (static, mutable and dynamic). Static variables are bound directly to the shader object:

    PixelShader->GetShaderVariable("g_tex2DShadowMap")->Set(pShadowMapSRV);

Mutable and dynamic variables are bound via a new object called a Shader Resource Binding (SRB), which is created by the pipeline state:

    m_pPSO->CreateShaderResourceBinding(&m_pSRB);

and are then bound through the SRB object:

    m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "tex2DDiffuse")->Set(pDiffuseTexSRV);
    m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "cbRandomAttribs")->Set(pRandomAttrsCB);

The difference between mutable and dynamic resources is that mutable ones can only be set once for every instance of a shader resource binding, while dynamic resources can be set multiple times. It is important to set the variable type properly, as it may affect performance: static variables are generally the most efficient, followed by mutable; dynamic variables are the most expensive from a performance point of view. This post explains shader resource binding in more detail.

Setting the Pipeline State and Invoking a Draw Command

Before any draw command can be invoked, all required vertex and index buffers, as well as the pipeline state, should be bound to the device context:

    // Clear the render target
    const float zero[4] = {0, 0, 0, 0};
    m_pContext->ClearRenderTarget(nullptr, zero);

    // Set vertex and index buffers
    IBuffer *buffer[] = {m_pVertexBuffer};
    Uint32 offsets[] = {0};
    Uint32 strides[] = {sizeof(MyVertex)};
    m_pContext->SetVertexBuffers(0, 1, buffer, strides, offsets, SET_VERTEX_BUFFERS_FLAG_RESET);
    m_pContext->SetIndexBuffer(m_pIndexBuffer, 0);
    m_pContext->SetPipelineState(m_pPSO);

Also, all shader resources must be committed to the device context:

    m_pContext->CommitShaderResources(m_pSRB, COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES);

When all required states and resources are bound, IDeviceContext::Draw() can be used to execute a draw command, and IDeviceContext::DispatchCompute() a compute command. Note that for a draw command the graphics pipeline must be bound, and for a dispatch command the compute pipeline must be bound. Draw() takes a DrawAttribs structure as an argument; its members define all attributes required to perform the command (primitive topology, number of vertices or indices, whether the draw call is indexed, instanced, indirect, etc.). For example:

    DrawAttribs attrs;
    attrs.IsIndexed = true;
    attrs.IndexType = VT_UINT16;
    attrs.NumIndices = 36;
    attrs.Topology = PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
    pContext->Draw(attrs);

Build Instructions

Please visit this page for detailed build instructions.

Samples

The engine contains two graphics samples that demonstrate how the API can be used. The AntTweakBar sample demonstrates how to use the AntTweakBar library to create a simple user interface; it can also be thought of as Diligent Engine's "Hello World" example. The Atmospheric Scattering sample is more advanced: it demonstrates how Diligent Engine can be used to implement various rendering tasks - loading textures from files, using complex shaders, rendering to textures, using compute shaders and unordered access views, etc. The engine also includes an Asteroids performance benchmark based on this demo developed by Intel. It renders 50,000 unique textured asteroids and lets you compare the performance of the D3D11 and D3D12 implementations. Every asteroid is a combination of one of 1000 unique meshes and one of 10 unique textures.

Integration with Unity

Diligent Engine supports integration with Unity through the Unity low-level native plugin interface. The engine relies on Native API Interoperability to attach to the graphics API initialized by Unity. After the Diligent Engine device and context are created, they can be used as usual to create resources and issue rendering commands. GhostCubePlugin shows an example of how Diligent Engine can be used to render a ghost cube that is only visible as a reflection in a mirror.
  16. Hello. Until now I have been using structured buffers in my vertex shader to calculate the morph offsets of my animated characters, and it works fine - but so far I have only read from this kind of buffer (I use four of them). Now I have other things in mind, where I would have to use a read-write buffer that I can write to. But I can't get my head around how to synchronize write accesses: when I read a value from the buffer at an address that corresponds to, e.g., a pixel coordinate, and want to add a value to it, another thread could have read the same value and will overwrite the value I wrote. How is this typically done?
  17. Hey, does anyone know a good plugin for debugging HLSL shaders? I have an HLSL tool installed, and it makes the text change color, but it won't show me any of the errors, so I have no way of spotting mistakes in the shaders I write other than my own two eyes. It doesn't do the red squiggly underline for some reason. Is this a settings issue, or do I need another plugin? Thanks.
  18. Hi, there's a great tutorial on frustum culling, but it's impossible to compile because it uses old DirectX 11 types (D3DXPLANE instead of XMVECTOR, etc.). Can someone please help me update this one class - frustumClass - to the new DirectX 11 types (XMMATRIX, XMVECTOR, etc.)? http://www.rastertek.com/dx11tut16.html

Furthermore, can anyone please explain how he gets the minimum Z distance from the projection matrix by dividing one element by another? He leaves no explanation for this math, and it's extremely frustrating.

    // Calculate the minimum Z distance in the frustum.
    zMinimum = -projectionMatrix._43 / projectionMatrix._33;
    r = screenDepth / (screenDepth - zMinimum);
    projectionMatrix._33 = r;
    projectionMatrix._43 = -r * zMinimum;

Also, I'm not sure how to use an XMVECTOR instead of the old plane class that he uses, and I'm confused as to how all the m12, m13, etc. correspond to the elements of an XMVECTOR. I thought you're not supposed to even access the elements of an XMVECTOR directly! So incredibly frustrating. Please help, thanks.
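For what it's worth, that formula falls out of the standard left-handed D3D perspective matrix, where _33 = zf / (zf - zn) and _43 = -zn * zf / (zf - zn), with zn and zf the near and far plane distances. Dividing:

    -_43 / _33 = (zn * zf / (zf - zn)) / (zf / (zf - zn)) = zn

so zMinimum is simply the near plane distance recovered from the matrix. The next two lines then rebuild _33 and _43 with the far plane replaced by screenDepth: r = screenDepth / (screenDepth - zMinimum) is the new zf / (zf - zn).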
  19. Hi, as part of my terrain project, I'm trying to render ocean water. I have a nice FFT compute shader implementation which outputs a nice 512x512 heightmap (it can also output a gradient map, but I disabled it as there are issues with it). The FFT code is from the NVIDIA FFT ocean sample for DX11. Now, here is the setup: I have two different methods that render the water grid, both using the same FFT heightmap SRV (SRVs are members of a dedicated resource manager class), and both rendering the FFT heightmap in exactly the same way. Although the grids are different, I eventually made the FFT map tile in a way where the scales are almost 1:1. The rendering itself is pretty much straightforward (using the DX11 tessellation pipeline):
  1. In the domain shader - sample the heightmap in order to displace the vertices.
  2. In the pixel shader - finite differences to get the normals: sample the heightmap 4 times and calculate the normals as usual.
Now here is the weird thing:
Method 1 - Normals look good after the finite-difference operation. Unfortunately, I can't use this method, as it has some other issues.
Method 2 - Normals come out distorted in a way that I can't explain. More than that, if in the domain shader I give up the displacement on the horizontal axes (XZ) and leave only the vertical displacement on the Y axis, the normals are fine. With full displacement (XZ included), it feels like the normals aren't compensating for the XZ movement of the displacement. I tried to play with anything I could think of, but the normals look bad no matter what, and I really don't want to give up the XZ displacement: with vertical displacement only, the FFT looks kind of crippled. I also tried ddx_fine and ddy_fine, and it seems the normals were more accurate (i.e. taking the XZ movement into account), but the quality was very low, so not usable. Still, the fact that the hardware derivative functions showed the XZ movement more accurately does give me hope that there is a better way to do it (?). So, is there a better way to calculate the normals more accurately?

Here is the difference (screenshots):
Method 1 normals - nice and crisp.
Method 2 normals - distorted.
Also, here is the Method 2 displacement in wireframe, looking good as can be seen.

I'm also attaching the relevant DS and PS code that does the displacement and normals in method 2 (the method 1 code is the same, just with some more stuff like Perlin noise blended in the distance; the FFT-related parts are exactly the same):

DS displacement code:

    // bilerp the position
    float3 worldPos = Bilerp(terrainQuad[0].vPosition, terrainQuad[1].vPosition,
                             terrainQuad[2].vPosition, terrainQuad[3].vPosition, UV);
    float3 displacement = 0;
    displacement = SampleHeightForVS(gFFTHeightMap, Sampler16Aniso, worldPos.xz);
    // Flip Z back because the tex coordinates use a flipped Z; without this the FFT looks kind of upside down
    displacement.z *= -1;
    worldPos += displacement * FFT_DS_SCALE_FACTOR;
    return worldPos;

PS finite differences:

    float3 CalcNormalForOceanHeightMap(float2 uv)
    {
        float2 one_texel = float2(1.0f / 512.0f, 1.0f / 512.0f);

        float2 leftTex;
        float2 rightTex;
        float2 bottomTex;
        float2 topTex;

        float leftY;
        float rightY;
        float bottomY;
        float topY;

        float normFactor = 1.0 / 512.0;

        leftTex = uv + float2(-one_texel.x, 0.0f);
        rightTex = uv + float2(one_texel.x, 0.0f);
        bottomTex = uv + float2(0.0f, one_texel.y);
        topTex = uv + float2(0.0f, -one_texel.y);

        leftY = gFFTHeightMap.SampleLevel(Sampler16Aniso, leftTex, 0).z * normFactor;
        rightY = gFFTHeightMap.SampleLevel(Sampler16Aniso, rightTex, 0).z * normFactor;
        bottomY = gFFTHeightMap.SampleLevel(Sampler16Aniso, bottomTex, 0).z * normFactor;
        topY = gFFTHeightMap.SampleLevel(Sampler16Aniso, topTex, 0).z * normFactor;

        float3 normal;
        normal.x = (leftY - rightY);
        normal.z = (bottomY - topY);
        normal.y = 1.0f / 64.0;

        return normalize(normal);
    }

Any help would be welcome, thanks!
  20. Hello, I want to use:
  • the fxc effect compiler to compile my .fx files (techniques, passes, shaders) to bytecode - no problem;
  • SharpDX ShaderReflection to get the attributes of every used shader, to generate constant buffer structures and other things.
To use ShaderReflection, I have to feed it the shader bytecode of every single shader I want to work with. I managed to iterate through the effect for its techniques, and through the techniques for the passes and their descriptions, but I struggled a lot and cannot find the connection to get the shader bytecode of the shaders that are used by an EffectPass. Here is the answer from AlexandreMutel:

"And even then, you can still use the FX file format - the difference is you'd have an offline compiler tool that parses the format for techniques/passes in order to acquire the necessary shader profiles + entry points. So with the profile/entry point in hand, you can run the FX file through the D3DCompiler to get a ShaderByteCode object for each shader, then use that to create a reflection object to query all your meta data. Then write out the reflected meta data to a file, that gets consumed by your application at runtime - which would be your own implementation of Effects11 (or something completely different; either way, you use the meta data to automatically set up your constant buffers, bind resources, and manage the shader pipeline by directly using the Direct3D11 shader interfaces)."

How do I get the profile/entry point at hand? Please give me some help.
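If the C++ Effects11 runtime is an option (the SharpDX path should mirror it), each pass can hand you its shader variable, and the effect shader description carries the compiled bytecode, which D3DReflect accepts directly. A sketch, offered as an assumption rather than tested code:

    #include <d3dx11effect.h>   // Effects11
    #include <d3dcompiler.h>    // D3DReflect

    // Pull the vertex shader bytecode out of a pass and reflect it.
    HRESULT ReflectPassVS(ID3DX11EffectPass* pass, ID3D11ShaderReflection** reflection)
    {
        D3DX11_PASS_SHADER_DESC passVS = {};
        HRESULT hr = pass->GetVertexShaderDesc(&passVS);
        if (FAILED(hr)) return hr;

        D3DX11_EFFECT_SHADER_DESC vsDesc = {};
        hr = passVS.pShaderVariable->GetShaderDesc(passVS.ShaderIndex, &vsDesc);
        if (FAILED(hr)) return hr;

        // vsDesc.pBytecode / vsDesc.BytecodeLength hold the compiled blob
        return D3DReflect(vsDesc.pBytecode, vsDesc.BytecodeLength,
                          IID_ID3D11ShaderReflection,
                          reinterpret_cast<void**>(reflection));
    }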
  21. In countless CUDA-related sources I've found that, when operating within a warp, one might skip syncthreads because all instructions are synchronous within a single warp. I followed that advice and applied it in DirectCompute (I use an NV GPU). I wrote this code, which does nothing but a good old prefix sum over 64 elements (64 is the size of my block):

    groupshared float errs1_shared[64];
    groupshared float errs2_shared[64];
    groupshared float errs4_shared[64];
    groupshared float errs8_shared[64];
    groupshared float errs16_shared[64];
    groupshared float errs32_shared[64];
    groupshared float errs64_shared[64];

    void CalculateErrs(uint threadIdx)
    {
        if (threadIdx < 32) errs2_shared[threadIdx] = errs1_shared[2*threadIdx] + errs1_shared[2*threadIdx + 1];
        if (threadIdx < 16) errs4_shared[threadIdx] = errs2_shared[2*threadIdx] + errs2_shared[2*threadIdx + 1];
        if (threadIdx < 8)  errs8_shared[threadIdx] = errs4_shared[2*threadIdx] + errs4_shared[2*threadIdx + 1];
        if (threadIdx < 4)  errs16_shared[threadIdx] = errs8_shared[2*threadIdx] + errs8_shared[2*threadIdx + 1];
        if (threadIdx < 2)  errs32_shared[threadIdx] = errs16_shared[2*threadIdx] + errs16_shared[2*threadIdx + 1];
        if (threadIdx < 1)  errs64_shared[threadIdx] = errs32_shared[2*threadIdx] + errs32_shared[2*threadIdx + 1];
    }

This works flawlessly. I noticed that I have bank conflicts in here, so I changed the code to this:

    void CalculateErrs(uint threadIdx)
    {
        if (threadIdx < 32) errs2_shared[threadIdx] = errs1_shared[threadIdx] + errs1_shared[threadIdx + 32];
        if (threadIdx < 16) errs4_shared[threadIdx] = errs2_shared[threadIdx] + errs2_shared[threadIdx + 16];
        if (threadIdx < 8)  errs8_shared[threadIdx] = errs4_shared[threadIdx] + errs4_shared[threadIdx + 8];
        if (threadIdx < 4)  errs16_shared[threadIdx] = errs8_shared[threadIdx] + errs8_shared[threadIdx + 4];
        if (threadIdx < 2)  errs32_shared[threadIdx] = errs16_shared[threadIdx] + errs16_shared[threadIdx + 2];
        if (threadIdx < 1)  errs64_shared[threadIdx] = errs32_shared[threadIdx] + errs32_shared[threadIdx + 1];
    }

And to my surprise, this one causes race conditions. Is it because I should not rely on that functionality (automatic sync within a warp) when working with DirectCompute instead of CUDA? It hurts my performance by a measurable margin: with bank conflicts (the first version) I am still around 15-20% faster than with the second version, which is conflict-free but needs a GroupMemoryBarrierWithGroupSync between each assignment.
  22. So I have a potentially fairly simple question: when rendering various things, whether 2D or 3D, when do they actually get placed or blitted onto the bound render target? Is it at the moment you call Present, or at the moment of the Draw call?
  23. As part of a video project I'm working on, I have to pass an ID3D11Texture2D decoded by CUDA from one D3D11 device to another, which handles rendering. I managed to achieve the goal, but it looks like I'm leaking textures. The workflow looks as follows:

Sending side (decoder):

    ID3D11Texture2D* pD3D11OutTexture;
    if (!createOutputTexture(pD3D11OutTexture))
        return false;

    IDXGIResource1* pRsrc = nullptr;
    pD3D11OutTexture->QueryInterface(__uuidof(IDXGIResource1), reinterpret_cast<void**>(&pRsrc));
    auto hr = pRsrc->CreateSharedHandle(
        nullptr,
        DXGI_SHARED_RESOURCE_READ | DXGI_SHARED_RESOURCE_WRITE,
        nullptr,
        &frameData->shared_handle);
    pRsrc->Release();

Receiving side (renderer):

    ID3D11Texture2D* pTex = nullptr;
    hres = m_pD3D11Device->OpenSharedResource1(
        frameData->shared_handle,
        __uuidof(ID3D11Texture2D),
        reinterpret_cast<void**>(&pTex));
    DrawFrame(pTex);
    pTex->Release();
    CloseHandle(frameData->shared_handle);

I'm somewhat puzzled by the inner workings of this workflow, namely:
  • What happens when I create a shared handle? Does this allow me to release the texture?
  • What happens when I call OpenSharedResource1? Does it create a separate texture - that is, do I have to release both textures after rendering?
Appreciate your help!
  24. I converted some code to use tiled resources and noticed a few issues that I was wondering if anyone else knows anything about. First of all, the tiled resource is working and visually everything is correct, but:
  1. It takes a long time to create the tiled resource. The call to CreateTexture2D to create the tiles (D3D11_RESOURCE_MISC_TILED) often doesn't return for a few seconds. There aren't any warnings or messages from the debug layer either, so this seems very strange. My resource is a Texture2D array with 453 entries, where each 2D slice is 2048x2048. I tried reducing the size to 256x256 instead, and it still takes at least 2 seconds to complete the call.
  2. I set the lowest mip level (128x128) in each slice to a unique mapping, and all of the rest of the tiles to a single default tile. I'd expect performance to be fairly good, since aside from the lowest mips there is only about 64 KB (a single tile) of texture memory being accessed, but it destroys my FPS, going from 80 to 20. Visually it is correct; the tiles show as expected. This is on a tier 1 device, and apparently mips smaller than the tile size (128x128 in my case) do not work with texture arrays, so in the shader I check the distance and only use the tiled resource near the camera. This helps, but performance is still rubbish. The GPU is an AMD 285X.
  25. Hello, can someone tell me the easiest way to draw a string with D3D11? I can't use either FW1FontWrapper or ImGui, so I need an alternative. The only thing I want is to draw text at a screen location, using the Tahoma font (if possible, with an outline style). Thanks in advance.