As far as I have seen, it is better to create only the copy and compute queues that your render logic really needs (from a data-oriented point of view). Of course you need to profile your implementation, ideally on hardware from different IHVs.

If you are asking how many compute queues to create to run compute work concurrently with graphics, the best number is probably one (with background priority).

Note also that you cannot retrieve any information about the adapter's engine configuration or about how hardware engines map to work queues. Moreover, nothing guarantees that, within a particular graphics architecture, the number of "hardware engines" stays the same on every device across different performance/cost tiers.

EDIT: since background/low priority queues are not available in the current version of D3D12, just assign a high priority to the graphics queue.

"Recursion is the first step towards madness." - "Skeggöld, Skálmöld, Skildir ro Klofnir!"
Direct3D 12 quick reference: https://github.com/alessiot89/D3D12QuickRef/

Yours3!f

1,534

Author

September 14, 2015 10:27 AM


thank you :)

It seems like one should suffice for now... MSDN vs. the graphics samples is confusing: the MSDN example code has as many queues as threads, but the samples have only one. They populate command lists on separate threads and submit on the main graphics thread after syncing.
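The pattern described for the samples (record on several threads, sync, then submit from one thread to one queue) can be modelled on the CPU alone. This is only a rough sketch, not D3D12 code: `CommandList` and `recordAndSubmit` are hypothetical names, and the "queue" is just a vector of strings standing in for submitted commands.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorded command list (not a D3D12 type).
using CommandList = std::vector<std::string>;

// Each worker records into its own list, so recording needs no locking.
// After joining (the "sync"), the calling thread submits every list in a
// fixed order to one queue, modelled here as a flat vector of commands.
std::vector<std::string> recordAndSubmit(std::size_t workerCount,
                                         std::size_t commandsPerWorker) {
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back([&lists, w, commandsPerWorker] {
            for (std::size_t c = 0; c < commandsPerWorker; ++c) {
                lists[w].push_back("list" + std::to_string(w) +
                                   ":draw" + std::to_string(c));
            }
        });
    }
    for (std::thread& t : workers) t.join();  // sync before submission

    std::vector<std::string> queue;  // the single queue
    for (const CommandList& list : lists) {
        queue.insert(queue.end(), list.begin(), list.end());
    }
    assert(queue.size() == workerCount * commandsPerWorker);
    return queue;
}
```

Calling `recordAndSubmit(3, 2)` yields six commands in list order, regardless of which worker finished recording first. The submission order is decided in one place, which is what makes a single queue easy to reason about.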

Blog:

http://extremeistan.wordpress.com/

Stuff I wrote:

https://github.com/Yours3lf/libmymath

https://github.com/Yours3lf/linux_gl_fps

https://github.com/Yours3lf/instanced_font_rendering

http://youtu.be/k8PYkihyGXA

https://github.com/scrawl/smaa-opengl

https://github.com/Yours3lf/gl_browser_gui

Follow me on twitter:

https://twitter.com/0martint

Alessio1989

4,648

September 14, 2015 12:13 PM

There is only one graphics queue per adapter node, but as far as I remember the same restriction does not apply to compute and especially copy queues. There is no restriction on the number of threads submitting command lists to a single queue; of course you need some kind of synchronization between the different threads.
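One possible reading of "some kind of synchronization" is a lock around submission, so that each thread's batch lands in the queue as one contiguous block. The sketch below is a CPU-only model under that assumption: `SharedQueue` and `submitFromThreads` are hypothetical names, and integers stand in for command lists. It does not use or describe the D3D12 API itself.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical queue wrapper; the names are illustrative, not D3D12 API.
class SharedQueue {
public:
    // Appends one batch while holding the lock, so batches submitted by
    // different threads never interleave with each other.
    void submit(const std::vector<int>& batch) {
        std::lock_guard<std::mutex> lock(mutex_);
        submitted_.insert(submitted_.end(), batch.begin(), batch.end());
    }

    // Returns a copy of everything submitted so far.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return submitted_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> submitted_;
};

// Starts threadCount threads; thread t submits batchSize copies of its id t.
std::vector<int> submitFromThreads(int threadCount, std::size_t batchSize) {
    assert(threadCount >= 0);
    SharedQueue queue;
    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&queue, t, batchSize] {
            queue.submit(std::vector<int>(batchSize, t));
        });
    }
    for (std::thread& th : threads) th.join();
    return queue.snapshot();
}
```

The order of the batches depends on thread scheduling, but every batch stays intact. If the relative order of batches matters, the threads need an additional ordering mechanism on top of the lock.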

Yours3!f

1,534

Author

September 14, 2015 03:17 PM


Yeah, I know that. I guess I'll have to measure whether multiple command queues get me additional perf or not.


Alessio1989

4,648

September 14, 2015 04:25 PM

I did not test this, but I would guess that two copy queues with different priorities is a good example of where more than one queue is useful: a higher-priority queue for things you need to load immediately before presentation, and a "normal"-priority queue for background copy operations.

Hodgman

52,717

September 14, 2015 10:54 PM

Threads on the CPU are used to gain access to multi-core and hyperthreading resources.

Queues on the GPU are used to gain access to multi-GPU and async-compute ("GPU shader hyperthreading") resources.

Don't make one queue per CPU thread just to make your life easier. Make them only where you explicitly intend to create GPU-side command concurrency. e.g. computing while rasterizing, or copying while computing.

. 22 Racing Series .

Yours3!f

1,534

Author

September 15, 2015 08:22 PM


yeah of course that makes sense :)


Yours3!f

1,534

Author

September 15, 2015 08:23 PM


So what do you advise if I want to, say, do compute work while rendering shadow maps (i.e. depth-only passes)?

One graphics + one compute queue?


Hodgman

52,717

September 16, 2015 04:12 AM


Yes, and then all of the necessary events/fences to synchronize the resources that are being shared between the two queues (just like you would for code that was split across two threads on a CPU).
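To make the CPU analogy concrete, here is a rough sketch of two threads playing the roles of the graphics and compute queues, sharing one resource through a fence. Everything in it is an assumption made for illustration: `Fence`, `runFrame`, and the integer "shadow map" are hypothetical, and the fence is a plain mutex plus condition variable, not a D3D12 object.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical CPU-side fence: a counter that only ever increases.
class Fence {
public:
    // Raises the fence to `value` and wakes every waiter.
    void signal(std::uint64_t value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(value >= value_);
            value_ = value;
        }
        reached_.notify_all();
    }

    // Blocks until the fence has reached at least `value`.
    void wait(std::uint64_t value) {
        std::unique_lock<std::mutex> lock(mutex_);
        reached_.wait(lock, [&] { return value_ >= value; });
    }

private:
    std::mutex mutex_;
    std::condition_variable reached_;
    std::uint64_t value_ = 0;
};

// The "graphics" thread writes the shared shadow map, then signals.
// The "compute" thread does independent work first, then waits on the
// fence before it reads the shadow map.
int runFrame() {
    Fence shadowMapReady;
    std::vector<int> shadowMap;  // the resource shared by both "queues"
    int sum = 0;

    std::thread graphics([&] {
        shadowMap.assign(4, 10);   // stands in for the depth-only pass
        shadowMapReady.signal(1);
    });
    std::thread compute([&] {
        int independent = 2;       // work that may overlap the depth pass
        shadowMapReady.wait(1);    // the shared resource is now safe to read
        for (int depth : shadowMap) sum += depth;
        sum += independent;
    });

    graphics.join();
    compute.join();
    return sum;
}
```

`runFrame()` returns 42 (four depth values of 10 plus the independent 2) however the two threads are scheduled, because the only access to the shared data sits behind the wait. Only the work that touches the shared resource is ordered; everything before the wait is free to overlap.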


[d3d12] command queues vs hardware queues (ACEs)

This topic is closed to new replies.
