Community Reputation

928 Good

About galop1n

  1. DX11

    With D3D11, they are always serialized by default by the driver. You can use vendor APIs to disable that behavior locally if you know what you are doing. For NVIDIA: NvAPI_D3D11_BeginUAVOverlap. For AMD: IAmdDxExtUAVOverlap::BeginUAVOverlap.
  2. There are many versions of d3dcompiler_47.dll, a very dumb idea… If your shader is simply compiled by Visual Studio as an HLSL source file, the fxc and DLL you use are probably tied to the Windows SDK that is set up in your project.   Not saying that getting the latest one would solve this, but you may still be running an outdated compiler :)
  3. You can use the header DirectXPackedVector.h, part of DirectXMath, to do the conversions. Sixteen-bit floats on the GPU are well documented, so you can also write your own conversion. The layout is 1 sign bit, 5 exponent bits and 10 mantissa bits.
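For anyone who wants to write the conversion by hand, here is a minimal sketch of the 1/5/10 layout described above. The function names `half_to_float` and `float_to_half` are made up for this example and are not part of DirectXMath (the library route would be `XMConvertHalfToFloat` / `XMConvertFloatToHalf`). It assumes a 32-bit IEEE 754 `float` and rounds to nearest even when encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decode a 16-bit float: 1 sign bit, 5 exponent bits, 10 mantissa bits.
inline float half_to_float(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits;
    if (exponent == 0x1Fu) {            // infinity or NaN
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {         // normal value: rebias 15 -> 127
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    } else if (mantissa == 0) {         // signed zero
        bits = sign;
    } else {                            // subnormal half: renormalise
        std::uint32_t shift = 0;
        while ((mantissa & 0x400u) == 0) { mantissa <<= 1; ++shift; }
        bits = sign | ((113u - shift) << 23) | ((mantissa & 0x3FFu) << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Encode a 32-bit float as a half, rounding to nearest even.
// Values too large for a half become infinity.
inline std::uint16_t float_to_half(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    const std::uint32_t mantissa = bits & 0x7FFFFFu;

    if (exponent == 0xFFu)              // infinity or NaN
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mantissa ? 0x200u : 0u));
    if (exponent >= 143u)               // magnitude >= 2^16: overflow
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (exponent < 102u)                // magnitude < 2^-25: rounds to zero
        return static_cast<std::uint16_t>(sign);

    std::uint32_t shift, full, half;
    if (exponent >= 113u) {             // fits a normal half
        shift = 13u;
        full = mantissa;
        half = ((exponent - 112u) << 10) | (mantissa >> 13);
    } else {                            // becomes a subnormal half
        shift = 126u - exponent;
        full = mantissa | 0x800000u;    // restore the implicit leading 1
        half = full >> shift;
    }
    assert(shift >= 13u && shift <= 24u);
    const std::uint32_t remainder = full & ((1u << shift) - 1u);
    const std::uint32_t halfway = 1u << (shift - 1u);
    if (remainder > halfway || (remainder == halfway && (half & 1u)))
        ++half;                         // a carry into the exponent is intended
    return static_cast<std::uint16_t>(sign | half);
}
```

Every finite half value should survive a round trip through `half_to_float` and back through `float_to_half` unchanged, which makes for an easy sanity test.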
  4. Something to note: the UAV way opens up optimisations, as you can produce several mips in one pass, reducing load bandwidth for example.
  5. You have to unbind from UAV to SRV because the driver may (have to) flush caches and set barriers to make sure your dispatch happens after everything from the draws is done and available to it.
  6.  No, IIRC one of the MS DX12 videos on youtube mentions explicitly that if you want to do dynamic indexing with instancing you need to do something 'special'.  I don't remember what exactly it is, but I remember them saying that if the index to the texture changes within a single 'drawcall' then you have to do something for it to work correctly.   Ok so yes, this is the NonUniformResourceIndex :)
  7. I think you are confusing it with SV_InstanceID, which is a system-value semantic. The bindless intrinsic is used like this:

         Texture2D<float4> diffuse_[] : register(t1);
         uint someIndex = foo(); // coming from something like an interpolator or whatever
         Texture2D diffuse = diffuse_[NonUniformResourceIndex(someIndex)];
         // then using diffuse

     Without the intrinsic, which is more like a tag, some hardware (AMD obviously) will give bogus results once pixels are gathered into waves: a texture descriptor is loaded into scalar registers, so if someIndex is divergent, the lanes of the wave end up sharing a descriptor that is only right for some of them.

     The NonUniformResourceIndex will inform the compiler and driver of that, and the driver will generate a loop with clever masking to process one group of threads per value of someIndex. In simple cases the overhead may not be significant, but if you multiply the divergent indices, it will for sure bloat your shader :)
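To make the "loop with clever masking" a bit more concrete, here is a CPU-side simulation of one wave. It is only a sketch of the idea, not what any driver actually emits: `waterfall_iterations` is a name invented for this example, and real hardware does this in GPU ISA with an execution mask rather than a `std::vector<bool>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulates the per-wave loop for a divergent resource index.
// lane_index[i] is the descriptor index requested by lane i.
// Returns the scalar (uniform) index used by each loop iteration, in order.
inline std::vector<std::uint32_t> waterfall_iterations(
        const std::vector<std::uint32_t>& lane_index) {
    std::vector<bool> done(lane_index.size(), false);
    std::vector<std::uint32_t> iterations;

    for (std::size_t first = 0; first < lane_index.size(); ++first) {
        if (done[first]) continue;  // this lane was handled by an earlier pass

        // Take the index of the first still-active lane as the uniform value,
        // the one descriptor that gets loaded into scalar registers.
        const std::uint32_t uniform = lane_index[first];
        iterations.push_back(uniform);

        // Masking: only lanes that asked for this exact index run this pass.
        for (std::size_t lane = first; lane < lane_index.size(); ++lane) {
            if (!done[lane] && lane_index[lane] == uniform) done[lane] = true;
        }
    }
    assert(iterations.size() <= lane_index.size());
    return iterations;
}
```

A wave where every lane uses the same index costs a single iteration, while a wave with N distinct indices runs the body N times, which is where the shader bloat and slowdown come from.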
  8. Excluding tier 1 resource binding is an easy call; only a few old-generation Intel integrated GPUs are tier 1 :) Yes, tier 2 can dynamically index textures and buffers, a technique known as bindless.   The non-uniform intrinsic is useful on AMD and does nothing on NVIDIA, BUT you do not want to rely on it, as it can generate very ugly and inefficient shaders; on AMD it is better to send draws in a way that lets you "uniformize" the index. Shader model 6 will also help in that regard, with an explicit way to read the first lane.
  9. Yes, I was unclear about the amount of frame latency in my message; the actual flip happens when the dashed bar disappears. You also said I have only 2 buffers, but I create my swap chain with 3, and it is visible (A0, A1, A2, which I believe refer to each buffer). But most importantly, what you said also raises a concern. We are very particular about latency for the game I work on at my company, and this extra frame of latency in D3D12 without a real exclusive fullscreen will be an issue. Is this something that is planned to be improved in the next 18 months?
  10. The line you removed was just replacing the allocated pointer with an anonymous local array. You definitely do not understand pointers, memory lifetime and ownership. I would advise you to use standard containers like vector, to have a zero-new policy (make_unique is fine), and to use no raw pointers; use ComPtr for DX objects. You will save time and performance, simplify the design, reduce complexity, and save baby seals  :)
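As a rough illustration of that policy, here is a small sketch. The `Mesh` type and the helper names are hypothetical and not taken from the original thread. On Windows, the same idea applies to D3D objects through `Microsoft::WRL::ComPtr`, left out here so the snippet stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical mesh type, for illustration only.
struct Mesh {
    std::vector<float> positions;   // the vector owns and frees its storage
    std::vector<unsigned> indices;
};

// Returns an owning pointer; no new/delete appears in user code.
inline std::unique_ptr<Mesh> make_quad() {
    auto mesh = std::make_unique<Mesh>();
    mesh->positions = {0.f, 0.f, 1.f, 0.f, 0.f, 1.f, 1.f, 1.f};
    mesh->indices = {0, 1, 2, 2, 1, 3};
    return mesh;                    // ownership moves to the caller
}

// Non-owning access: take a reference, do not keep the address around.
inline std::size_t triangle_count(const Mesh& mesh) {
    assert(mesh.indices.size() % 3 == 0);  // triangle list: 3 indices each
    return mesh.indices.size() / 3;
}
```

The data lives as long as the `unique_ptr` returned by `make_quad`, so there is no pointer left dangling at a local array that went out of scope.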
  11. As for GPUView, this is a capture of my sandbox with a waitable swap chain (triple buffered) and max latency at one. You can see that the frame starts on the CPU right after a vblank and is fully processed before the next one, then just sits there waiting for the next vblank to be displayed: [sharedmedia=core:attachments:35851]     And this is with max latency at two; you can observe the extra frame of delay:   [sharedmedia=core:attachments:35852]     For completeness, here is a GPUView capture with a non-waitable swap chain. You can see that many frames are now queued, and if you look closely at the highlight, it now takes many frames (4) before a queued frame reaches execution and presentation: [sharedmedia=core:attachments:35863]
  12. Hello Jesse,   I have a little question for you regarding the waitable swap chain. It used to be windowed only, but at some point a Windows 10 update made it work for fullscreen too. Could you find out the exact version that made the change?
  13. You can use GPUView to observe queuing in the driver. SetMaximumFrameLatency only works with a waitable swap chain; if you had used the debug layer, it would have warned you. Usually, fullscreen is better than windowed for flipping. Sometimes rendering is not the problem; keep an eye on when you update inputs, for example.
  14. GPUs are beasts made to deal with latency, and you will not reach peak bandwidth unless you write a very specific edge case just for it, one that is not applicable in practice. The truth is that the little difference here will never show up, because some other part of the graphics pipeline will take longer and you will never observe a scenario where the GPU is waiting for an index read.   This is also the reason why these days we use triangle lists and not triangle strips, unless you need to save memory on a low-memory platform (phone or 3DS?). We can do a better job with triangle lists than with strips with regard to the post-transform cache. There is no pre-transform vertex cache these days either, for that matter.
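If you want to see how an index ordering interacts with the post-transform cache, a tiny simulation is enough. The sketch below assumes a plain FIFO cache, which is a simplification: real hardware differs between vendors and generations. The function name `count_vertex_shader_runs` is invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Counts how many vertices get shaded for a triangle-list index buffer,
// assuming a FIFO post-transform cache holding `cache_size` vertices.
inline std::size_t count_vertex_shader_runs(
        const std::vector<std::uint32_t>& indices, std::size_t cache_size) {
    assert(indices.size() % 3 == 0);  // triangle list: 3 indices per triangle
    std::deque<std::uint32_t> cache;
    std::size_t misses = 0;
    for (std::uint32_t index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end())
            continue;                 // hit: the transformed vertex is reused
        ++misses;                     // miss: the vertex shader runs again
        cache.push_back(index);
        if (cache.size() > cache_size) cache.pop_front();  // evict the oldest
    }
    return misses;
}
```

Dividing the result by the triangle count gives the average cache miss ratio for that ordering, which is the number an index optimiser tries to push down by reordering the triangles of a list.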
  15. This is all convention, row major or column major. You are right that mul(vec, mat) == mul(transpose(mat), vec). The difference is how the shader does the math, either with dot products or with muls and mads. It used to be better to do the dot products, hence the transpose. With our fat GPUs that are scalar and not SIMD, it matters less. But my advice: you have thousands of optimisations to do before that transpose becomes a problem, unless you hit a very degenerate case :)
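The identity is easy to check on the CPU. This is a small sketch with made-up helper names (`mul_vec_mat`, `mul_mat_vec`, `transpose`) standing in for the HLSL intrinsics; the matrix is stored as `m[row][col]`, which is an assumption of the example and not a statement about HLSL packing rules.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // m[row][col]

inline Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            t[c][r] = m[r][c];
    return t;
}

// Row vector times matrix, written the "mul and mads" way:
// out = v.x * row0, then out += v.y * row1, and so on.
inline Vec4 mul_vec_mat(const Vec4& v, const Mat4& m) {
    Vec4 out{};
    for (std::size_t c = 0; c < 4; ++c) out[c] = v[0] * m[0][c];        // mul
    for (std::size_t r = 1; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c) out[c] += v[r] * m[r][c];   // mad
    return out;
}

// Matrix times column vector, written as four dot products:
// out[r] = dot(row r of m, v).
inline Vec4 mul_mat_vec(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        out[r] = m[r][0] * v[0] + m[r][1] * v[1] + m[r][2] * v[2] + m[r][3] * v[3];
    return out;
}

// Sanity check, meant for inputs whose arithmetic is exact (small integers).
inline void check_conventions_agree(const Vec4& v, const Mat4& m) {
    assert(mul_vec_mat(v, m) == mul_mat_vec(transpose(m), v));
}
```

Both functions compute the same sums; only the way the work is grouped changes, which is the part that used to matter on vector hardware.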