Seabolt

Vulkan
What are your opinions on DX12/Vulkan/Mantle?

121 posts in this topic


On the flip side, I am a bit concerned about sync issues. Sync between the CPU and GPU (or even the GPU with itself) can lead to some really awful, hard-to-track-down bugs. It's bad because you might think that you're doing it right, but then you make a small tweak to a shader and suddenly you have artifacts. It's hard enough dealing with that for one hardware configuration, so it's a little scary to imagine what could happen for PC games that have to run on everything. Hopefully there will be some good debugging/validation functionality available for tracking this down; otherwise we will probably end up with drivers automatically inserting sync points to prevent corruption (and/or removing unnecessary syncs for better performance). Either way, beginners are probably in for a rough time.

Don't worry, a variety of shipping professional games will somehow make a complete mess of it in the final build too.

Edited by Promit

On the flip side, I am a bit concerned about sync issues. Sync between the CPU and GPU (or even the GPU with itself) can lead to some really awful, hard-to-track-down bugs. It's bad because you might think that you're doing it right, but then you make a small tweak to a shader and suddenly you have artifacts. It's hard enough dealing with that for one hardware configuration, so it's a little scary to imagine what could happen for PC games that have to run on everything. Hopefully there will be some good debugging/validation functionality available for tracking this down; otherwise we will probably end up with drivers automatically inserting sync points to prevent corruption (and/or removing unnecessary syncs for better performance). Either way, beginners are probably in for a rough time.

 

New debugging tools are coming: https://channel9.msdn.com/Events/GDC/GDC-2015/Solve-the-Tough-Graphics-Problems-with-your-Game-Using-DirectX-Tools


Slightly off-topic, but I'm starting a new engine before Vulkan or D3D12 are released. Any pointers on how I can prepare my rendering pipeline architecture so that when they are released, I can use them efficiently? I'm planning to start with D3D11 and OpenGL 4.5.


If you sign up for the DX12 EAP you can access the source code of the UE4 DX12 implementation.

 

It isn't 'signing up'. It's applying. You have to be approved (I've yet to be approved, sadly).


 

It isn't 'signing up'. It's applying. You have to be approved (I've yet to be approved, sadly).

Try asking for access again; it worked for me. Anyway, I have to admit that the approval process could be improved a lot.

Edited by Alessio1989

 

 

Try asking for access again; it worked for me. Anyway, I have to admit that the approval process could be improved a lot.

I have no idea what you mean by "try asking for access again".


I've refrained from replying to this for a few days while I've been letting the information that's recently come out, and the implications of it, bounce around my head for a bit, but feel roundabout ready to do so now.

 

I'm really looking forward to programming in this style.

 

I'm aware and accept that there's going to be a substantial upfront investment required, but I think the payoff is going to be worth it.

 

I think a lot of code is going to get much cleaner as a result of all this.  A lot of really gross batching and state management/filtering code is just going to go away.  Things are going to get a lot simpler: once we tackle the challenge of managing (and being responsible for) GPU resources at a lower level, which I think is something we're largely going to write once and then reuse across multiple projects, programming graphics is going to start being fun again.

 

I think it's going to start becoming a little like the old days of OpenGL; not quite at the level where you could just issue a glBegin/glEnd pair and start experimenting and seeing what kind of cool stuff you could do, but it will become a lot easier to just drop in new code without having to fret excessively about draw call counts, batching, state management, driver overhead, and "is this effect slow because it's slow, or is it slow because I've hit a slow path in the driver and I need to go back and rearchitect?"  That's really going to open up a lot of possibilities for people to start going nuts.

 

I think that the people who are going to have the hardest time of it are those who have the heaviest investment in what's become a traditional API usage over the past few years: lots of batching and instancing, in other words.  I have one project, using D3D11, that I think I would probably have to rewrite from scratch (I probably won't bother).  On the other hand, I have another, using a FrankenGL version, that I think will come over quite a bit more cleanly.  That's going to be quite cool and fun to do.

 

So unless I've got things badly wrong about all of this, I'm really stoked about the prospects.


I will not go into explicit details (the detailed information should still be under NDA), but the second feature level looks tailor-made for one particular piece of hardware (guess which!). Moreover, FL 12.1 does not require some really interesting features (a greater conservative rasterization tier, volume tiled resources, and even resource binding tier 3) that you would expect to be mandatory in future hardware. In substance, FL 12.1 really breaks the concept of a feature level in my view, which was a sort of "barrier" that defined new capabilities for upcoming hardware.

So you have feature level 12.0 for mainstream hardware, older feature levels for old/low-end hardware, and 12.1 for "a certain particular hardware" and most foreseeable future hardware. How is this a problem? Clearly, if 12.1 is so similar to 12.0, 12.0 is the main target and you won't be writing much special case code for 12.1.

Edited by LorenzoGatti

 

 

 

I have no idea what you mean by "try asking for access again".

Try filling out the form a second time: http://aka.ms/dxeap

 

 


So you have feature level 12.0 for mainstream hardware, older feature levels for old/low-end hardware, and 12.1 for "a certain particular hardware" and most foreseeable future hardware. How is this a problem? Clearly, if 12.1 is so similar to 12.0, 12.0 is the main target and you won't be writing much special case code for 12.1.

 

 

It's not "a problem" per se; I'm just saying I expected to see a feature level for future hardware with more interesting and radical requirements than FL 12.1 has (e.g. mandatory support for 3D tiled resources, a higher tier of conservative rasterization, standard swizzle, tier 3 resource binding... and, what the hell, even PS stencil ref is still optional). FL 12.0 and 12.1 are nearly identical except for ROVs (probably the most valuable requirement of FL 12.1) and conservative rasterization tier 1 (which is useless for anything but anti-aliasing).

I'm not saying anything else. With D3D12 you can still target every feature level you want (even 10Level9) and query for every single new hardware feature (e.g. you can use ROVs on an FL 11.0 GPU if they are supported by the hardware/driver).
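For illustration, querying individual caps instead of relying on the feature level alone looks roughly like this on D3D12 (a minimal sketch; the struct and fields are from D3D12_FEATURE_DATA_D3D12_OPTIONS):

```cpp
// Sketch: ask the device for the optional features directly.
D3D12_FEATURE_DATA_D3D12_OPTIONS opts = {};
if (SUCCEEDED(device->CheckFeatureSupport(
        D3D12_FEATURE_D3D12_OPTIONS, &opts, sizeof(opts))))
{
    bool rovs    = opts.ROVsSupported;                    // usable even on FL 11.0 GPUs
    auto crTier  = opts.ConservativeRasterizationTier;    // tier 0 = unsupported
    auto binding = opts.ResourceBindingTier;              // tier 1/2/3
    auto tiled   = opts.TiledResourcesTier;               // volume tiled = tier 3
    // Enable ROV-based techniques etc. only when the caps are actually there.
}
```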

Edited by Alessio1989

 

 

 

 

Try filling out the form a second time: http://aka.ms/dxeap

 

I've submitted the form at least three times. At this point, I've given up.


- Memory residency management. The presenters were talking along the lines of the developers being responsible for loading/unloading graphics resources from VRAM to System Memory whenever the loads are getting too high. This should be an edge case but it's still an entirely new engine feature.

Yeah it's going to be interesting to see what solutions different engines end up using here.
The simplest thing I can think of is to maintain a Set<Resource*> alongside every command buffer. Whenever you bind a resource, add it to the set. When submitting the command buffer, you can first use that set to notify Windows of the VRAM regions that are required to be resident.

The fail case there is when that residency request is too big... As you're building the command buffer, you'd have to keep track of an estimate of the VRAM residency requirement, and if it gets too big, finish the current command buffer and start a new one.
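To make that concrete, here's a minimal sketch of the idea; all the types and helpers (Resource, CommandBuffer, MakeResident, SplitCommandBuffer, ...) are hypothetical stand-ins rather than any specific API:

```cpp
#include <set>

struct CommandBufferCtx {
    CommandBuffer cb;
    std::set<Resource*> referenced;  // every resource bound while recording
    size_t residencyEstimate = 0;    // running estimate of required VRAM
};

void BindResource(CommandBufferCtx& ctx, Resource* r) {
    if (ctx.referenced.insert(r).second)        // first time we've seen it
        ctx.residencyEstimate += r->sizeInBytes;
    RecordBind(ctx.cb, r);
    if (ctx.residencyEstimate > kResidencyBudget)
        SplitCommandBuffer(ctx);                // finish this one, start fresh
}

void Submit(CommandBufferCtx& ctx) {
    MakeResident(ctx.referenced);  // tell the OS what must be in VRAM first
    SubmitToQueue(ctx.cb);
}
```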


- Secondary threads for resource loading/shader compilation. This is actually a really good thing that I'm excited for, but it does mean I need to change my render thread to start issuing and maintaining new jobs. It's necessary, and for the greater good, but another task nonetheless.

If you're using D3D11, you can start working on it now.
If you're on GL, you can start doing it for buffers/textures via context resource sharing... But it's potentially a lot of GL-specific code that you're not going to need in your new engine.
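For the D3D11 route, a sketch of what a loader thread can do: resource creation through ID3D11Device is free-threaded, so worker threads can create textures directly, and only ID3D11DeviceContext use must stay serialized on the render thread (TextureJob and PublishToRenderThread here are hypothetical):

```cpp
void LoaderThread(ID3D11Device* device, const TextureJob& job)
{
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width            = job.width;
    desc.Height           = job.height;
    desc.MipLevels        = 1;
    desc.ArraySize        = 1;
    desc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.Usage            = D3D11_USAGE_IMMUTABLE;
    desc.BindFlags        = D3D11_BIND_SHADER_RESOURCE;

    D3D11_SUBRESOURCE_DATA init = { job.pixels, job.rowPitch, 0 };
    ID3D11Texture2D* tex = nullptr;
    device->CreateTexture2D(&desc, &init, &tex);  // safe off the render thread
    PublishToRenderThread(job.handle, tex);       // hand ownership back
}
```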

- Root Signatures/Shader Constant management
Again really exciting stuff, but seems like a huge potential for issues, not to mention the engine now has to be acutely aware of how frequently the constants are changed and then map them appropriately.

Yeah if you can give frequency hints in your shader code, it might make your life easier.

When compiling a shader, I imagine you'd first try to fit all of its parameters into the root, and then fall back to other strategies if they don't fit.

The simplest strategy is putting everything required for your shader into a single big descriptor set, and having the root just contain the link to that set. I imagine a lot of people might start with something like that to begin with.

I don't have an update-frequency hinting feature, but my shader system does already group texture/buffer bindings together into "ResourceLists".
e.g. A DX11 shader might have material data in slots t0/t1/t2 and a shadowmap in t3. In the shader code, I declare a ResourceList containing the 3 material textures, and a 2nd ResourceList containing the shadowmap.
The user can't bind individual resources to my shader, they can only bind entire ResourceLists.
I imagine that on D3D12, these ResourceLists can actually just be DescriptorSets, and the root can just point out to them.
So, not describing frequency, but at least describing which bindings are updated together.
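Assuming that mapping, a sketch of how the two ResourceLists from the t0-t3 example might become root-signature descriptor tables in D3D12 (register assignments taken from the example above; the rest is illustrative, not Hodgman's actual code):

```cpp
// Two descriptor tables: one for the material textures (t0-t2), one for the
// shadowmap (t3) -- mirroring the two ResourceLists described above.
D3D12_DESCRIPTOR_RANGE materialRange = {};
materialRange.RangeType          = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
materialRange.NumDescriptors     = 3;
materialRange.BaseShaderRegister = 0;   // t0..t2

D3D12_DESCRIPTOR_RANGE shadowRange = {};
shadowRange.RangeType          = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
shadowRange.NumDescriptors     = 1;
shadowRange.BaseShaderRegister = 3;     // t3

D3D12_ROOT_PARAMETER params[2] = {};
params[0].ParameterType   = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
params[0].DescriptorTable = { 1, &materialRange };
params[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;
params[1].ParameterType   = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
params[1].DescriptorTable = { 1, &shadowRange };
params[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;

// Binding a whole "ResourceList" is then a single call per table, e.g.:
// cmdList->SetGraphicsRootDescriptorTable(0, materialSetGpuHandle);
```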

I'll also be adding in architecture for Compute Shaders for the first time, so I'm worried that I might be biting off too much at once.

Yeah, I haven't done a robust compute wrapper before either. I'm doing the same stateless-job kind of thing that I've already done for graphics so far.
With the next generation of APIs, there are a few extra hassles with compute -- after a dispatch, you almost always have to submit a barrier, so that the next draw/dispatch call will stall until the preceding compute shader is actually complete.

The same goes for passes that render to a render target, actually. E.g. in a post-processing chain (where each draw reads the result of the previous one), you need barriers after each draw to transition from RT to texture, which has the side effect of inserting those necessary stalls.
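In D3D12 terms, those two cases look roughly like this (a sketch; computeOutput and prevPassTarget are hypothetical resources):

```cpp
// After a Dispatch(): a UAV barrier makes the next draw/dispatch wait until
// the compute shader's writes to computeOutput are complete.
D3D12_RESOURCE_BARRIER uav = {};
uav.Type          = D3D12_RESOURCE_BARRIER_TYPE_UAV;
uav.UAV.pResource = computeOutput;
cmdList->ResourceBarrier(1, &uav);

// Between post-processing passes: transition the previous render target so
// the next draw can sample it as a texture (this is where the stall lives).
D3D12_RESOURCE_BARRIER toSrv = {};
toSrv.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
toSrv.Transition.pResource   = prevPassTarget;
toSrv.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
toSrv.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
toSrv.Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
cmdList->ResourceBarrier(1, &toSrv);
```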

I think a lot of code is going to get much cleaner as a result of all this. A lot of really gross batching and state management/filtering code is just going to go away.

For simple ports, you might be able to leverage that ugly code :D
In the D3D12 preview from last year, they mentioned that when porting 3DMark, they replaced their traditional state-caching code with a PSO/bundle cache, and still got more than a 2x performance boost over DX11.
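A minimal sketch of such a PSO cache, assuming a hashable StateKey struct that packs the relevant shader/raster/blend/depth state (all names here are hypothetical):

```cpp
#include <unordered_map>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// StateKey is hypothetical: a hashable POD describing the full pipeline state.
std::unordered_map<StateKey, ComPtr<ID3D12PipelineState>> g_psoCache;

ID3D12PipelineState* GetOrCreatePso(ID3D12Device* device, const StateKey& key,
                                    const D3D12_GRAPHICS_PIPELINE_STATE_DESC& desc)
{
    auto it = g_psoCache.find(key);
    if (it != g_psoCache.end())
        return it->second.Get();                 // hit: no driver work at draw time

    ComPtr<ID3D12PipelineState> pso;
    device->CreateGraphicsPipelineState(&desc, IID_PPV_ARGS(&pso)); // expensive; ideally done at load time
    return g_psoCache.emplace(key, std::move(pso)).first->second.Get();
}
```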

I think that the people who are going to have the hardest time of it are those who have the heaviest investment in what's become a traditional API usage over the past few years: lots of batching and instancing, in other words.

Stuff that's designed for traditional batching will probably be very well suited to the new "bundle" API.

I am a bit concerned about sync issues. Sync between CPU and GPU (or even the GPU with itself) can lead to some really awful, hard-to-track down bugs. It's bad because you might think that you're doing it right, but then you make a small tweak to a shader and suddenly you have artifacts.

Here's hoping the debuggers are able to detect sync errors. The whole "transition" concept, which is a bit more abstracted than the reality, should help debuggers here. Even if the debugger can just put its hands up and say "you did *something* non-deterministic in that frame", then at least we'll know our app is busted.

 One to store per-draw data 
Do you use some form of indexing into the UBO to fetch the data? I'm currently batching UBO updates (say, fitting as many transforms, lights, or materials as I can into one glBufferSubData call), then doing a glUniform1i with an index and indexing into the UBO to fetch the correct transform. This has the obvious limitation that I need one draw call per object drawn in order to update the index uniform in between, but honestly I'm not sure how else I could do it. AFAIK it's also how it's done in an NVIDIA presentation about batching updates.

 

The good thing is that I can usually do batches of 100 to 200 in one buffer update call; the bad thing is that I have an equivalent number of draw and glUniform1i calls. Bear in mind that I'm using OpenGL 3.3 here, so no multi-draw-indirect stuff :D
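For reference, the pattern being described is roughly this (GL 3.3; buffer and mesh names hypothetical):

```cpp
// Upload a whole batch of transforms in one call...
glBindBuffer(GL_UNIFORM_BUFFER, transformsUbo);
glBufferSubData(GL_UNIFORM_BUFFER, 0,
                batchCount * sizeof(Transform), transforms);

// ...then one glUniform1i + one draw per object to select the array element.
for (int i = 0; i < batchCount; ++i)
{
    glUniform1i(transformIndexLoc, i);  // shader indexes the UBO array with this
    glDrawElements(GL_TRIANGLES, meshes[i].indexCount, GL_UNSIGNED_INT,
                   (const void*)(meshes[i].indexOffset * sizeof(GLuint)));
}
```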

 

And BTW, marking Promit's post as "Popular" is the understatement of the year (I never saw that badge before!). The thing got all the retweets and 300 comments on Reddit. You could sell Promit as an internet traffic attractor if the site is low on cash :P


I use the baseInstance parameter of glDraw*BaseVertexBaseInstance. gl_InstanceID will still be zero-based, but you can use an instanced vertex element to overcome this problem (or use an extension that exposes an extra GLSL variable with the value of baseInstance).
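A sketch of the instanced-vertex-element trick (GL 4.2+; the attribute location and buffer names are hypothetical):

```cpp
// A VBO holding 0,1,2,...N, read as a per-instance integer attribute.
glBindBuffer(GL_ARRAY_BUFFER, drawIdVbo);
glVertexAttribIPointer(kDrawIdAttrib, 1, GL_INT, 0, nullptr);
glVertexAttribDivisor(kDrawIdAttrib, 1);   // advance once per instance
glEnableVertexAttribArray(kDrawIdAttrib);

// baseInstance offsets where the per-instance attribute starts reading, so
// even a one-instance draw gets a unique ID -- no glUniform1i in between.
glDrawElementsInstancedBaseVertexBaseInstance(
    GL_TRIANGLES, mesh.indexCount, GL_UNSIGNED_INT,
    (const void*)(mesh.indexOffset * sizeof(GLuint)),
    1 /* instance count */, mesh.baseVertex, drawIndex /* baseInstance */);
```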


 

 

You should already be doing that on modern D3D11/GL.

 

That's true, and I'm ashamed to say I stuck too closely to the DX9 port of my engine, where I didn't have nearly as much register space and needed to swap things around on a per-draw basis at times.

 

Scrapping all of that now though, and moving forward with DX11 and OGL 4.x, then porting to DX12 and Vulkan when they're more public.

You guys have assuaged most of my fears about the ports though :)

 


Edit: Said something stupid, sorry about that :)

Edited by AlexPol

 

- Root Signatures/Shader Constant management
Again really exciting stuff, but seems like a huge potential for issues, not to mention the engine now has to be acutely aware of how frequently the constants are changed and then map them appropriately.

You should already be doing that on modern D3D11/GL.
In Ogre 2.1 we use 4 buffer slots:

  1. One for per-pass data
  2. One to store all materials (up to 273 materials per buffer due to the 64kb per const buffer restriction)
  3. One to store per-draw data
  4. One tbuffer to store per-draw data (similar to 3, but a tbuffer can hold more data, which is handy where the 64kb restriction would bite)

We rarely rebind any of those slots; not even the per-draw parameters.

The only times we need to rebind buffers are when:

  1. We've exceeded the size of one of the per-draw buffers (so we bind a new, empty buffer)
  2. We are in a different pass (we need another per-pass buffer)
  3. We have more than 273 materials overall, the previous draw referenced material #0, and the current one references material #280 (so we need to switch material buffers)
  4. We change to a shader that doesn't use these bindings (very rare).

Point 2 happens very infrequently. Points 3 and 4 can be minimized by sorting by state in a RenderQueue. Point 1 happens very infrequently too, and if you're on GCN the 64kb limit gets upgraded to a 2GB limit, which means you wouldn't need to switch at all (and it also solves point 3 entirely).

The entire set of bindings doesn't really change often, and this property can already be exploited using DX11 and GL4. DX12/Vulkan just make the interface thinner; that's all.

 

 

How are you implementing your constant buffers? From what you've written as your #3b, it sounds like you're packing multiple materials'/objects' constants into a single large constant buffer, and perhaps indexing into it in your draws? IIRC, that's supported only in D3D11.1+, as there are no *SSetConstantBuffers1 functions (the ones that take offsets) until then.

Otherwise, if you aren't using constant buffers with offsets, how are you avoiding having to set things like object transforms per draw? If you are, how are you handling targets below D3D11.1?

Edited by Ameise
use the baseInstance parameter of glDraw*BaseVertexBaseInstance. gl_InstanceID will still be zero-based, but you can use an instanced vertex element to overcome this problem (or use an extension that exposes an extra GLSL variable with the value of baseInstance)

And what if you're drawing two different meshes? I.e., not instancing a single mesh.

 

How are you implementing your constant buffers? From what you've written as your #3b, it sounds like you're packing multiple materials'/objects' constants into a single large constant buffer, and perhaps indexing into it in your draws? IIRC, that's supported only in D3D11.1+, as there are no *SSetConstantBuffers1 functions (the ones that take offsets) until then.

I have no idea about D3D11, but it's probably not even necessary. Just update the entire buffer in one call. The buffer is defined as an array of structs; index into it to fetch the one that corresponds to the current thing being drawn.
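In GL terms, the idea is roughly this (names and struct layout hypothetical):

```cpp
// CPU-side mirror of the UBO: an array of per-draw structs (std140-friendly).
struct DrawData { float world[16]; float tint[4]; };
DrawData drawData[kMaxDraws];
// ... fill drawData[i] for every object in the batch ...

// One update call for the whole batch; the shader declares the same array
// inside its uniform block and picks an element by per-draw index.
glBindBuffer(GL_UNIFORM_BUFFER, perDrawUbo);
glBufferSubData(GL_UNIFORM_BUFFER, 0, drawCount * sizeof(DrawData), drawData);
```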

Edited by TheChubu
And what if you're drawing two different meshes? I.e., not instancing a single mesh.

 

1 is a valid value for the instance count.


1 is a valid value for the instance count.
Of course, but the idea is to batch up data inside the constant/uniform buffers and use the instance ID for indexing. There's no sense doing it if you can only index one thing (i.e., you end up with what I'm doing: one glDraw and glUniform1i call per mesh drawn).

 

Of course, but the idea is to batch up data inside the constant/uniform buffers and use the instance ID for indexing. There's no sense doing it if you can only index one thing (i.e., you end up with what I'm doing: one glDraw and glUniform1i call per mesh drawn).

The ID comes from the instance data, if I understand correctly, not from gl_InstanceID. The ID is different for two different instances, and a different mesh is a different instance.

Think of it as two buffers: one is an instance buffer that contains only IDs, the other is the vertex buffer.

The first draw call would use 10 instances from the instance buffer, starting at baseInstance 0.
A second draw call would use 1 instance from the instance buffer, starting at baseInstance 10.

So if you fill the instance buffer with IDs in ascending order, every instance will see a different ID.
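Putting numbers to that description (GL 4.2+; the mesh data is hypothetical, and the instance buffer holds 0,1,2,... as in the sketch earlier in the thread):

```cpp
// Mesh A: 10 instances whose per-instance IDs start at 0.
glDrawElementsInstancedBaseVertexBaseInstance(
    GL_TRIANGLES, meshA.indexCount, GL_UNSIGNED_INT,
    (const void*)(meshA.indexOffset * sizeof(GLuint)),
    10, meshA.baseVertex, 0 /* baseInstance */);

// Mesh B: a single "instance" whose ID is 10 -- a different mesh, with no
// uniform updates between the two draws.
glDrawElementsInstancedBaseVertexBaseInstance(
    GL_TRIANGLES, meshB.indexCount, GL_UNSIGNED_INT,
    (const void*)(meshB.indexOffset * sizeof(GLuint)),
    1, meshB.baseVertex, 10 /* baseInstance */);
```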


