Cases for multithreading OpenGL code?


I have wanted to support multiple contexts used in separate threads with shared resources in this convenience GL wrapper I've been fiddling with. My goal hasn't been to expose everything GL can do, just a useful subset in a nice way, but multi-context support has seemed like a good thing, a way to align the wrapper with a pretty big aspect of the underlying system. However, multithreaded code is hard, so I've finally started to question whether supporting multiple contexts with resource sharing is even worth it.

 

If the intended use is PC gaming - a single simple window and so on (a single-person project too, to put things in perspective, and currently targeting version 3.3, if that has any relevance) - what reasons would there be to take the harder route? My understanding is that the benefits might actually be pretty limited, but my knowledge of the various implementations and their capabilities is definitely limited too.


Alright, I honestly hadn't expected this to be so clear-cut. A bit sad, and maybe a little ironic, that GPU rendering can't be parallelized from the client side - with OpenGL, anyway.

 

Thank you for sharing the knowledge!

Well, it's only a partly parallelisable problem, as the GPU is reading from a single command buffer (in the GL/D3D model, at least - the hardware doesn't work quite the same way, as Mantle shows by giving you three command queues per device). At some point your commands have to get into that stream (whether by physically appending to a chunk of memory or by inserting a jump instruction to a block to execute), so you are always going to have a single thread/sync point.

However, command sub-buffer construction is a highly parallelisable thing - consoles have been doing it for ages. The problem is that the OpenGL mindset seems to be 'this isn't a problem - just multi-draw all the things!', and the D3D11 "solution" was a flawed one because of how the driver works internally.

D3D12 and Mantle should shake this up and hopefully show that parallel command buffer construction is a good thing and that OpenGL needs to get with the program (or, as someone at Valve said, it'll get chewed up by the newer APIs).


Doing texture streaming in parallel works great on all GPUs I've tested, which include a shitload of Nvidia GPUs, an AMD HD 7790, and a few Intel GPUs. It's essentially stutter-free.


Here's a silly but related question: if I use a second context to upload data that takes, for example, a second to transfer over the PCIe bus, will the main rendering thread stall while it waits for its own per-frame data, or will the driver split the larger transfer into chunks, allowing the two threads to interleave their data?


(Quoting the earlier reply about the single command stream and parallel command-buffer construction.)

Good clarification. It makes sense that even if the GPU has lots of pixel/vertex/compute units, the system controlling them isn't necessarily as parallel-friendly. To a non-hardware person the number three sounds like a curious choice, but it seems to make some intuitive sense to have the number close to a common number of CPU cores. That's excluding hyper-threading, but that's an Intel thing, so it doesn't matter to the folks at AMD. (Though there are the consoles with more cores...)

 

I'm wishing for something nicer than OpenGL to happen too, but it's probably going to take some time for things to actually change. I'm not on Windows, so the wait is likely going to be longer still. Might as well use GL in the meantime.

 

(Quoting the earlier reply about parallel texture streaming.)

Creating resources and uploading data on a second context is what I've mostly had in mind. I did try to find info on this, but probably didn't use the right terms, because I got the impression that truly parallel data transfer isn't that commonly supported.

 

I've now decided that if I do add secondary-context support, it will be in a very constrained way, so that the other context (or its wrapper, to be specific) won't be a general-purpose one but will target things like resource loading specifically. That should help keep the complexity at bay.

As we've pretty much got the answer to the original question, I'm going to take a moment to quickly (and basically) cover a thing.

(Quoting the previous reply about the number three and CPU core counts.)


So, the number '3' has nothing to do with CPU core counts; when it comes to GPU/CPU reasoning, very little of one directly impacts the other.

A GPU works by consuming 'command packets'; the OpenGL calls you make get translated by the driver into bytes the GPU can natively read and understand, in the same way a compiler transforms your code to binary for the CPU.

The OpenGL and D3D11 model of a GPU presents a case where the command stream is handled by a single 'command processor', the hardware that decodes the command packets to make the GPU do its work. For a long time this probably was the case, too, so the conceptual model 'works'.

However, a recent GPU, such as one from AMD's Graphics Core Next series, is a bit more complicated than that: the interface which deals with commands isn't a single block but in fact three, each of which can consume a stream of commands.

First is the 'graphics command processor'; this can dispatch graphics and compute workloads to the GPU hardware - the glDraw/glDispatch families of functions - and is where your commands end up.

Second are the 'compute command processors', which handle compute-only workloads. These aren't exposed via GL; I think OpenCL can kind of expose them, but with Mantle each is a separate command queue. (The driver might make use of them behind the scenes as well.)

Finally there are 'DMA commands', a separate command queue for moving data to/from the GPU. In OpenGL this is handled behind the scenes by the driver, but in Mantle it would allow you to kick off your own uploads/downloads as required.

So the command queues exposed by Mantle more closely mirror the operation of the hardware (it still hides some details), which explains why you have three: they cover the three types of command work the GPU can do.

If you are interested, AMD have made a lot of this detail available, which is pretty cool.
(Annoyingly, NV are very conservative about their hardware details, which makes me sad.)

To be clear, you don't need to know this stuff, although I personally find it interesting - this is also a pretty high-level overview of the situation, so don't take it as a "this is how GPUs work!" kind of thing.
