
#5261214 How hard is it to live from games as an indie developer?

Posted by phantom on 09 November 2015 - 03:20 PM

heh, make money in the mobile market... heh... good one!

#5253730 Dealing with D3DX____ functions being deprecated.

Posted by phantom on 23 September 2015 - 04:41 PM

Thanks, great explanation. I was totally not aware of that sample page. In view of all that I'm going to go straight to 12; I've got a feeling that the industry will transition to that much faster than other builds considering the advantages. Thanks, cheers.

I would just like to ask this question; why are you doing what you are doing?

If it is for a project that you plan to finish and release then I would transition to 11 and work with that for a while. D3D11 isn't going away, and while it might have performance issues for the AAA guys, if you are building your own stuff it might be worth sticking with D3D11.

If your goal is to get a job in the industry then D3D12 makes more sense, and you might as well skip 11 as the APIs aren't remotely compatible.

One of the key things about going from 10 or 11 to 12 is that it will probably require a rethink and a rebuild of many things to get the best from it; trying to treat D3D12 the same as D3D10, data- and work-wise, isn't going to net you a great deal. Heck, I would make the argument that unless you are multi-threading your whole rendering system the whole way through, you should stick to 11 - you get pretty well tuned drivers which will spread the workload.

D3D12 will shine when you can go wide and when you understand how the hardware works (and thus why things are as they are) - it is more hands on.

That's not to discourage you from moving across of course, just think about why you are doing it - new and shiny isn't the best for every project and you might be served well enough with D3D11 where things are tuned, stable and much better documented.

#5253374 "Xbox One actually has two graphical queues"

Posted by phantom on 21 September 2015 - 06:40 PM

Close, but the layering is a bit more complicated than that.

An ACE is a higher-level manager which splits work out to the compute units; internally each CU can schedule and control up to 40 wavefronts of work (4 x 10 'program counters', if you will), dispatching instructions and switching between work as required. The details are covered in AMD presentations, but basically from each group of 10 program counters it can dispatch up to 4 instructions to the SIMD, scalar, vector memory, scalar memory and program flow control units, which is the 'hyper threading' part.

(Each CU can handle 40 wavefronts of work, each of which consists of 64 threads; multiply up by the CU count and you get the amount of 'in flight' work the GPU can handle - for example, a 40-CU part could have 40 x 40 x 64 = 102,400 threads in flight.)

The ACE, which is feeding the CU, handles work generation and dispatch, along with work dependency tracking - from a CPU point of view it is more like the kernel scheduler, working out what needs to be dispatched to each core (although instead of just assigning work it's more like a case of "I need these resources, can anyone handle it?" for the work, with the ability to suspend work (and, iirc, pull the state back) when more important work is required to run on a CU).

The number of ACEs varies across hardware; at least 2, currently a maximum of 8.

#5252076 [D3D12] Enabling the depth test in d3d12

Posted by phantom on 13 September 2015 - 02:53 PM

It isn't that simple, in fact that abstraction is a bit of a lie ;)

In D3D12, for something like that, you need a Pipeline State Object set up which contains details on everything needed for a draw operation. These docs show how it is set up.
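
For illustration, a minimal sketch of just the depth-related PSO fields (the struct and enum names come from the D3D12 headers; 'device' and 'pipelineState' are assumed to exist, and the other PSO fields still need filling in):

```cpp
#include <d3d12.h>

// Sketch: only the depth-related parts of the PSO are shown; root signature,
// shaders, input layout, render target formats etc. still need to be filled in.
D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {};
psoDesc.DepthStencilState.DepthEnable    = TRUE;
psoDesc.DepthStencilState.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ALL;
psoDesc.DepthStencilState.DepthFunc      = D3D12_COMPARISON_FUNC_LESS;
psoDesc.DSVFormat = DXGI_FORMAT_D32_FLOAT;   // must match the depth buffer's format
device->CreateGraphicsPipelineState(&psoDesc, IID_PPV_ARGS(&pipelineState));
```

You also need a depth buffer created, its DSV bound alongside the render target via OMSetRenderTargets, and a ClearDepthStencilView each frame.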

#5251948 [D3D12] Barriers and Fences

Posted by phantom on 12 September 2015 - 05:20 PM

Resource barriers - add commands to transition a resource (or resources) from one state to another (such as a render target to a texture), preventing further command execution until the GPU has finished doing any work needed to convert the resources as requested.
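
A rough sketch of what that looks like in code ('cmdList' and 'renderTarget' are placeholders assumed to exist):

```cpp
// Transition a render target so it can be read as a texture in a pixel shader;
// later commands touching the resource won't run until the transition is done.
D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Transition.pResource   = renderTarget;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
cmdList->ResourceBarrier(1, &barrier);
```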

Fences - a marker in a command stream. Allows you to know when the GPU, or CPU, has finished doing some work so they can be synchronised.

Example: a GPU command queue/list contains a fence which can be set by the CPU - this prevents the GPU from executing more commands until the CPU signals it is done. So, for example, if you have a command which reads from a buffer on the GPU and the CPU needs to fill that buffer, you'd insert a fence into the GPU commands which tells it to wait until the CPU has signalled that the copy has completed.

Going the other way, you can use a fence the GPU has set to know how far the GPU has got in executing commands. A good example of this is adding a fence command at the end of the frame, so the CPU knows when the GPU is done with the last frame's worth of data/buffers.
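
A sketch of that end-of-frame pattern ('queue', 'fence', 'fenceEvent' and 'fenceValue' are assumed to have been created during setup):

```cpp
// GPU side: set the fence to this frame's value once all queued work is done.
const UINT64 thisFrame = ++fenceValue;
queue->Signal(fence, thisFrame);

// CPU side: if the GPU hasn't got there yet, sleep until it has.
if (fence->GetCompletedValue() < thisFrame)
{
    fence->SetEventOnCompletion(thisFrame, fenceEvent);
    WaitForSingleObject(fenceEvent, INFINITE);
}
```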

#5250157 What is latest version of DXSDK

Posted by phantom on 01 September 2015 - 01:12 PM

The DirectX SDK is now part of the Windows SDK - if you have that installed (which should be the case with VS2015) then you have the latest version.

#5247595 [D3D12] SetDescriptorHeaps

Posted by phantom on 19 August 2015 - 03:17 AM

Conceptually, that makes sense to me. The confusing part is that Set*RootDescriptorTable already takes a GPU descriptor handle, which defines the GPU pointer to the heap (is that correct?). Is there not enough information in the D3D12_GPU_DESCRIPTOR_HANDLE to identify the heap? I suppose I could see it as a way to simplify the backend by requiring the user to specify the exact set of buffers instead of gathering a list from descriptor handles (which would be more expensive). Secondly, can I provide the heaps in arbitrary order? Do they have to match up somehow with the way I've organized the root parameters?

I suspect it is done that way to let the driver decide what to do.
Given a heap base address and two pointers into that heap you can store the latter as offsets into the former, potentially taking up less register space vs a full blown reference. A look at the GCN docs would probably give a good reason for this, at least for that bit of hardware.

As for the order; seems not to matter.
I only did a simple test on this (D3D12DynamicIndexing example; swapped order of heaps @ Ln93 in FrameResource.cpp), but it worked fine so I'm willing to assume this holds true of all resources... until I find a situation which breaks it ;)
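
For reference, the kind of call order being discussed (heap names and root parameter indices are placeholders; both heaps go into one SetDescriptorHeaps call before any root descriptor table that points into them):

```cpp
// Heaps must be set on the command list before the root descriptor tables;
// the order within the array didn't seem to matter in the test above.
ID3D12DescriptorHeap* heaps[] = { cbvSrvHeap, samplerHeap };
cmdList->SetDescriptorHeaps(2, heaps);
cmdList->SetGraphicsRootDescriptorTable(0, cbvSrvHeap->GetGPUDescriptorHandleForHeapStart());
cmdList->SetGraphicsRootDescriptorTable(1, samplerHeap->GetGPUDescriptorHandleForHeapStart());
```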

#5247353 Vulkan is Next-Gen OpenGL

Posted by phantom on 18 August 2015 - 04:28 AM

This might be of some interest to people, though I've only just started looking at it myself: Siggraph's "An Overview of Next Gen Graphics APIs".

I bring it up because it mentions Vulkan, not that I've got to that slide deck yet.

#5247128 DX12 SkyBox

Posted by phantom on 17 August 2015 - 10:26 AM

There is a better way to do things in general; don't do the sky first.

You just end up wasting fill rate on overdrawn pixels which will never be seen.

#5247071 Vulkan is Next-Gen OpenGL

Posted by phantom on 17 August 2015 - 03:12 AM

1) You are making assumptions about Vulkan driver availability - I don't recall seeing anything stating you'll be able to use Vulkan back to Win7
1a) There are also hardware assumptions being made. No one has said what hardware Vulkan will cover on the PC. I suspect the same as DX12 is most likely. So you'll probably need to keep your existing DX11/OpenGL renderer around to cover older hardware.

2) You assume Win10 doesn't have a decent uptake rate. Before release it was sitting at 2.3% of the Windows market, 0.6% behind 32bit XP, from data provided by the Steam Hardware Survey. As I said this was before release, so only testers.

In short: DX12 and Vulkan will likely fill the same gap, hardware-wise, so you'll still need the old renderer. That tends to be DX11.

The other factor forgotten here is that DX12 also covers the Xbox One so the API hits two targets.
Vulkan will get you Win and Linux, but the former will likely be covered by DX11/12 due to 12 hitting first so it'll be down to any Linux support people want to throw out.

By not coming out first Vulkan has lost mindshare, interest and technical relevance. Much like later GL versions, which after years of lagging behind finally produced functionality DX11 didn't have, any extra functionality is unlikely to be adopted due to the bedding-in of DX12-focused systems.

Of course there might well not be any real difference in ability, everyone is targeting the same hardware after all and the APIs reflect that hardware pretty well, so aside from platform specific extensions/features I'm not sure what Vulkan could offer that DX12 wouldn't already cover.

All of which, of course, is speculation; months on from GDC we simply don't know anything beyond the API looking very much like DX12's.

#5246915 DX12 Multithreaded Rendering Architecture

Posted by phantom on 16 August 2015 - 09:29 AM

1) Basically, yes. You'll want to build everything up front (command lists, update any buffers from the CPU side/set up copy operations) before pushing it to the GPU to execute, allowing the GPU to chew on that data while you set up the next things for it to render. You can overlap things of course, so it doesn't have to be a case of [generate everything][push everything to gpu]; you could execute draw commands as things finish up. So you could generate, say, all the shadow map command lists, then push those to the GPU before generating the colour passes. (In fact you could dedicate one 'task' to pushing while at the same time starting to generate the colour pass lists.) A rough sketch of this record-then-submit pattern follows below.

2) Yes and no.
Generally it's accepted to mean the first bit: that draw calls are generated across multiple threads and queued as work by a single thread (or task) to ensure correct ordering.
That said if you could keep your dependencies in order then there is nothing stopping you queuing work from multiple threads, although I'd have to check the thread safety of the various command queues to see what locks/protection you might need.

However, your 'render to various textures' thing brings up a second part: the GPU is itself highly threaded, so even if you have one thread pushing execute commands the GPU can have multiple commands in flight at once (dependencies allowing); regardless of what method you use to queue work to the device, it can be doing multiple things at the same time.
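
To make point 1 concrete, a rough sketch of the record-on-workers, submit-from-one-place pattern (names and the pass-recording functions are made up for illustration):

```cpp
#include <d3d12.h>
#include <thread>

// Assumed user code: fill the given (already reset) command lists with draws.
void RecordShadowPass(ID3D12GraphicsCommandList* cl);
void RecordColourPass(ID3D12GraphicsCommandList* cl);

void BuildAndSubmitFrame(ID3D12CommandQueue* queue,
                         ID3D12GraphicsCommandList* shadowList,
                         ID3D12GraphicsCommandList* colourList)
{
    // Record the two passes in parallel on worker threads.
    std::thread shadow([&] { RecordShadowPass(shadowList); shadowList->Close(); });
    std::thread colour([&] { RecordColourPass(colourList); colourList->Close(); });
    shadow.join();
    colour.join();

    // Submit from a single point so the ordering stays well defined.
    ID3D12CommandList* lists[] = { shadowList, colourList };
    queue->ExecuteCommandLists(2, lists);
}
```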

#5245788 D3D 12: Using fence objects

Posted by phantom on 11 August 2015 - 12:09 PM

1. There is no direct link between the two. Links are 'created' simply by calling 'signal' after 'execute command list' on a queue.

2. 'From the GPU side' simply means doing it in the GPU's time frame. So calling 'signal' will insert a command into the command stream to have the GPU execute which sets the signal to a value. This would happen inside the command processor of the GPU.

3. The command queue wait stalls the GPU's command processor until the fence is signalled. The Win32 API function stalls the CPU until the fence is signalled.

From a practical standpoint:
- CommandQueue::Wait() causes the GPU's command processor to wait for the fence to be signalled. Let's say you have a command list which is running a compute shader and a graphics command list which is going to do some graphics commands that depend on the output of that compute work. You can submit both lists to separate queues and have the graphics queue wait on the fence from the compute queue before executing the graphics commands. Without this the two could execute at the same time if the GPU in question has separate graphics and compute queue hardware (Maxwell 2 and GCN are both examples of this).

A second example would be doing a texture upload via a copy queue; you'd want to make sure the copy was complete before allowing any work which depended on it to reference it, so again you'd put a 'signal' in the copy queue and 'wait' on it in the graphics queue.
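
A sketch of the compute-then-graphics case ('computeQueue', 'graphicsQueue', 'fence' and the command lists are assumed to exist; the fence value of 1 is just for illustration):

```cpp
// Compute queue: run the compute work, then set the fence from the GPU side.
ID3D12CommandList* computeLists[] = { computeList };
computeQueue->ExecuteCommandLists(1, computeLists);
computeQueue->Signal(fence, 1);

// Graphics queue: GPU-side wait - the command processor stalls here, the CPU doesn't.
graphicsQueue->Wait(fence, 1);
ID3D12CommandList* graphicsLists[] = { graphicsList };
graphicsQueue->ExecuteCommandLists(1, graphicsLists);
```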

- Win32 wait would be used when you want to cause a CPU thread to sleep until the GPU has done some work. A simple example of this is waiting for all of a scene to be drawn before submitting the next batch of work.

A good example of all of this is the ExecuteIndirect example in the DirectX Graphics GitHub examples; https://github.com/Microsoft/DirectX-Graphics-Samples

#5245456 Backface culling in geometry shader?

Posted by phantom on 10 August 2015 - 07:37 AM

What are you trying to do which makes you think you need to use the GS?
Generally there is likely to be a way of doing it which will perform better.

#5245447 Backface culling in geometry shader?

Posted by phantom on 10 August 2015 - 07:06 AM

You don't improve performance by adding an extra stage, and certainly not the geometry shader stage, which is a known performance sinkhole.

Turn on back-face culling via the API and let the hardware do its thing.
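
For example, assuming D3D11 here (the thread doesn't say which API), culling is just rasterizer state rather than shader work:

```cpp
#include <d3d11.h>

// Sketch: let the rasterizer drop back-facing triangles instead of using a GS.
// 'device' and 'context' are assumed to exist.
D3D11_RASTERIZER_DESC rsDesc = {};
rsDesc.FillMode              = D3D11_FILL_SOLID;
rsDesc.CullMode              = D3D11_CULL_BACK;   // discard back-facing triangles
rsDesc.FrontCounterClockwise = FALSE;             // clockwise winding counts as front-facing
rsDesc.DepthClipEnable       = TRUE;

ID3D11RasterizerState* rsState = nullptr;
device->CreateRasterizerState(&rsDesc, &rsState);
context->RSSetState(rsState);
```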

#5245208 [D3D12] Multidraw, Resource binding

Posted by phantom on 09 August 2015 - 03:21 AM

HOWEVER, as you aren't wandering off into user-mode-driver town every time you do a draw call, because you are recording them into a client-side buffer, this isn't a bad thing.

MultiDraw, and its ilk, are pretty much there as an optimisation and would allow the driver to hit some faster paths because it knows, between draw calls, that you aren't changing state.

Useful in a client-server setup.
Useful in a high overhead setup.
Less useful in a thin API.

In fact, with the ExecuteIndirect functionality of D3D12 (which gives you the ability to change some root signature constants, and indeed vertex and index buffer locations, based on a buffer input) you can practically do more, with better functionality. (One thread writes a command list with N indirect calls, one thread (or the GPU!) writes a buffer with draw call information, a bit of fence sync magic, and bam! loads of draws!)
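
A rough sketch of that pattern (a command signature with a single draw argument; buffer names and counts are placeholders, and a root signature would be needed if the signature also changed root constants or vertex/index buffer locations):

```cpp
// One indirect argument per draw: a plain indexed draw.
D3D12_INDIRECT_ARGUMENT_DESC arg = {};
arg.Type = D3D12_INDIRECT_ARGUMENT_TYPE_DRAW_INDEXED;

D3D12_COMMAND_SIGNATURE_DESC sigDesc = {};
sigDesc.ByteStride       = sizeof(D3D12_DRAW_INDEXED_ARGUMENTS);
sigDesc.NumArgumentDescs = 1;
sigDesc.pArgumentDescs   = &arg;

ID3D12CommandSignature* signature = nullptr;
device->CreateCommandSignature(&sigDesc, nullptr, IID_PPV_ARGS(&signature));

// One call, up to maxDraws draws; the per-draw arguments (and optional count)
// live in GPU-visible buffers that the CPU - or the GPU itself - can write.
cmdList->ExecuteIndirect(signature, maxDraws, argumentBuffer, 0, countBuffer, 0);
```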