Do command queues (https://msdn.microsoft.com/en-us/library/windows/desktop/dn788627(v=vs.85).aspx)
correspond directly to hardware queues, a.k.a. ACEs on GCN?
I.e., should I create the same number of compute queues as there are ACEs on the GPU?
I suppose there should be only one graphics queue, as the hardware (GCN) can only use one.
Does the same apply to the DMA copy engines, i.e. should I create as many copy queues as there are copy engines?
Or should there be one command queue per async submission thread (i.e. one graphics/compute/copy queue per thread)?
AFAIK it is advised to use one command allocator, one command list and one fence per thread. Is this true?