incertia

Vulkan Vulkan Queues


I'm working my way through some Vulkan examples, and I just want to make sure that my understanding of queues is correct.
 
Each physical device has a set \(Q\) of queue families, and each queue family \(q \in Q\) has some number \(N_q\) of queues that can be used. When creating a logical device, you specify how many logical queues you want to create, such that the sum of the queueCount properties for each queue family \(q\) is not greater than \(N_q\) (maybe some unforeseen circumstance has led us to a VkDeviceQueueCreateInfo array of [(family=0,cnt=2), (family=1,cnt=1), (family=0,cnt=1)]). The driver takes care of multiplexing the queues: e.g. if I create two logical devices with 8 queues each from the same queue family, whether the driver assigns both sets of logical queues \(0, \dots, 7\) to physical queues \(0, \dots, 7\) or spreads them over \(0, \dots, 15\) is none of my concern; the plumbing is all handled by the driver.
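To make the counting rule concrete, here is a minimal Python sketch of the kind of check a validation layer performs on the queue create infos. It is a model only, not the Vulkan API: `validate_queue_requests` and the (family, count) tuple layout are hypothetical. It also assumes, as far as I read the spec, that each queue family index may appear at most once in the create-info array (ignoring the exception for protected queues), so the three-entry array above would be rejected rather than summed.

```python
from collections import Counter


def validate_queue_requests(family_capacity, requests):
    """Return a list of problems found in a queue request array.

    family_capacity: queueCount per family index, as reported by the
                     physical device (hypothetical input layout).
    requests:        (family_index, queue_count) pairs, one per
                     create-info entry.
    """
    problems = []

    # Assumption: a family index must not be repeated across entries.
    seen = Counter(family for family, _ in requests)
    for family, times in sorted(seen.items()):
        if times > 1:
            problems.append(f"family {family} appears {times} times")

    for family, count in requests:
        if not 0 <= family < len(family_capacity):
            problems.append(f"family {family} does not exist")
        elif count < 1:
            problems.append(f"family {family}: queue count must be at least 1")
        elif count > family_capacity[family]:
            problems.append(
                f"family {family}: requested {count} queues, "
                f"only {family_capacity[family]} available"
            )
    return problems
```

An empty list means the request passes this simplified check; merging the two family-0 entries into a single (family=0,cnt=3) entry would make the example array pass.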

 

Different queues can be submitted to in parallel, but extra care should be taken to make sure that the command buffers don't interfere with each other if they touch the same object. Queues are retrieved in the order they were created, e.g. if my VkDeviceQueueCreateInfo array looked like [(cnt=2,priorities=[1.0,0.5]), (cnt=1,priorities=[0.2])], I can expect that vkGetDeviceQueue(device, family, [0, 1, 2]) returns queues with priorities [1.0, 0.5, 0.2]. Queue priorities are relative numbers, such that the following metaphor makes sense: if each queue is represented as a thread, a priority of 1.0 means that the thread should work as hard as it possibly can, while a priority of 0.5 means that it should only work half as hard, with the union of all threads representing the queue processing power of the entire physical device.
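One detail worth pinning down: the queueIndex passed to vkGetDeviceQueue is counted within a single queue family, not across the whole create-info array. The sketch below models that lookup in Python. The function name `queue_table` is hypothetical, and since the example array above does not name its families, the usage assumes they are families 0 and 1. Under that assumption the third queue is (family 1, index 0), not index 2. Note also that the spec describes priorities as hints in the range 0.0 to 1.0 (higher-priority queues may get more processing time), so the "half as hard" metaphor is not guaranteed.

```python
def queue_table(create_infos):
    """Map (family_index, queue_index) -> priority.

    create_infos: list of (family_index, priorities) pairs, a simplified
                  stand-in for an array of VkDeviceQueueCreateInfo.
    The queue index restarts at 0 for every family.
    """
    table = {}
    for family, priorities in create_infos:
        for index, priority in enumerate(priorities):
            if not 0.0 <= priority <= 1.0:
                raise ValueError(f"priority {priority} is outside [0.0, 1.0]")
            table[(family, index)] = priority
    return table


# Assumed families 0 and 1 for the two entries of the example array.
table = queue_table([(0, [1.0, 0.5]), (1, [0.2])])
for (family, index), priority in sorted(table.items()):
    print(f"family {family}, queue {index}: priority {priority}")
```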

 

If I said something wrong, please feel free to correct me. I want to make sure I'm not misunderstanding something fundamental.


if I create two logical devices with 8 queues each from the same queue family


Create multiple devices from one physical device? Interesting idea. Are there any possible advantages? Why would you do this?


I can't spot anything wrong with what you say. I'm no expert, but I can share some experience:


I tried various numbers for the priorities for async compute on AMD, but IIRC the effect was either not measurable or a slight loss, so I ended up using 1 for everything. Needs more testing.

Looking at my log we can be quite sure that VK queues do not map to hardware queues in any way:
found GPUs: 2

deviceName: GTX 670
apiVersion: 4194328
driverVersion: 1577369600
Queue family 0 (16 queues): graphics: 1 compute: 1 transfer: 1 sparse: 1 
Queue family 1 (1 queues): graphics: 0 compute: 0 transfer: 1 sparse: 0 

deviceName: AMD Radeon (TM) R9 Fury Series
apiVersion: 4194341
driverVersion: 4210689
Queue family 0 (1 queues): graphics: 1 compute: 1 transfer: 1 sparse: 1 
Queue family 1 (3 queues): graphics: 0 compute: 1 transfer: 1 sparse: 1 
Queue family 2 (2 queues): graphics: 0 compute: 0 transfer: 1 sparse: 1 
The AMD card has fewer queues in VK than in hardware; the NV card has lots of VK queues but no async in hardware at all.
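As a side note on reading the log: the apiVersion values are packed integers. A small Python sketch of the unpacking, mirroring the bit layout of the VK_VERSION_MAJOR / VK_VERSION_MINOR / VK_VERSION_PATCH macros (the function name `decode_vk_version` is my own). driverVersion uses a vendor-specific encoding, so it is not decoded here.

```python
def decode_vk_version(packed):
    """Split a packed Vulkan API version into (major, minor, patch).

    Layout: major in the bits above 22, minor in 10 bits, patch in 12 bits.
    """
    major = packed >> 22
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return major, minor, patch


for name, packed in [("GTX 670", 4194328), ("R9 Fury", 4194341)]:
    major, minor, patch = decode_vk_version(packed)
    print(f"{name}: Vulkan {major}.{minor}.{patch}")
```

For the two values in the log this gives 1.0.24 for the GTX 670 and 1.0.37 for the R9 Fury.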

Interesting results I got from AMD:
* Doing graphics and compute in the same family 0 queue is faster than using a second compute queue from family 1. (This is when still doing everything sequentially; I did not try graphics and compute async.)
* Async compute requires using multiple queues and command buffers, but even without synchronization there is a gap between command buffer executions, which may well be large enough to eliminate the advantage of async :(
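For anyone who wants to repeat the experiment, a common way to pick the queue family for async compute is to look for one that supports compute but not graphics. Here is a rough Python sketch of that selection over the capability flags printed in the log. The constants use the VK_QUEUE_*_BIT values; `find_dedicated_compute_family` is a hypothetical helper, and a real program would read the flags from vkGetPhysicalDeviceQueueFamilyProperties instead of hard-coding them.

```python
# Bit values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT,
# VK_QUEUE_TRANSFER_BIT and VK_QUEUE_SPARSE_BINDING_BIT.
GRAPHICS, COMPUTE, TRANSFER, SPARSE = 0x1, 0x2, 0x4, 0x8


def find_dedicated_compute_family(family_flags):
    """Return the index of the first family with compute but no graphics,
    or None if the device exposes no such family."""
    for index, flags in enumerate(family_flags):
        if flags & COMPUTE and not flags & GRAPHICS:
            return index
    return None


# Queue family flags transcribed from the log above.
R9_FURY = [
    GRAPHICS | COMPUTE | TRANSFER | SPARSE,
    COMPUTE | TRANSFER | SPARSE,
    TRANSFER | SPARSE,
]
GTX_670 = [
    GRAPHICS | COMPUTE | TRANSFER | SPARSE,
    TRANSFER,
]

print("R9 Fury:", find_dedicated_compute_family(R9_FURY))
print("GTX 670:", find_dedicated_compute_family(GTX_670))
```

With the families from the log, this selects family 1 on the Fury and finds nothing on the GTX 670, where compute is only available on the graphics family.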

This gap between multiple command buffers also happens with only a single queue, and also on Nvidia.

Conclusion: You need a very good reason to use multiple queues / multiple command buffers.

(I work only on compute, can't say anything about graphics and if it makes a difference)


Create multiple devices from one physical device? Interesting idea. Are there any possible advantages? Why would you do this?

This is a what-if scenario, no real reason behind it.

 

Thanks for sharing your experiences though. I guess I'll stick with just creating one queue for now. 
