I think conditional branching in shaders only costs performance when the SIMD instances in a working set can take different branches. When that happens, you get a SIMD 'stall': the instances that share one branch execute as the current active working set, then the set on the other branch executes, and once all branch sets have run they 'sync' up and once again grind away efficiently as SIMD across the entire working set.
But just because you take a hit doesn't mean you can't do it. Just be aware there is a hit. Instrument and measure it, and compare normal cases with extremes. It might not be so bad; that depends entirely on the logic.
This issue is more front and center with OpenCL, but it also applies to shaders.
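To make the cost of divergence concrete, here is a minimal sketch of a cost model for one SIMD working set. It is an illustration only, not a measurement of any real GPU: the names `BranchCost` and `groupCycles` and the cycle counts are made up for this example, and it assumes the group pays for every branch side that at least one instance takes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical cost model, not real hardware behavior.
// Assumption: a SIMD group runs in lockstep, so it pays the full cost of
// each side of the branch that at least one instance takes.
struct BranchCost {
    int thenCycles;  // assumed cost of the 'then' side
    int elseCycles;  // assumed cost of the 'else' side
};

// takesThen[i] is true when instance i takes the 'then' side.
int groupCycles(const std::vector<bool>& takesThen, BranchCost cost) {
    assert(cost.thenCycles >= 0 && cost.elseCycles >= 0);
    bool anyThen = false;
    bool anyElse = false;
    for (bool t : takesThen) {
        if (t) {
            anyThen = true;
        } else {
            anyElse = true;
        }
    }
    int cycles = 0;
    if (anyThen) cycles += cost.thenCycles;  // first branch set runs
    if (anyElse) cycles += cost.elseCycles;  // then the other branch set
    return cycles;                           // after this the group is in sync again
}
```

Under this model a group where every instance agrees pays for one side only, while a single dissenting instance makes the whole group pay for both. That is why measuring normal cases against extremes tells you how much the branch really costs.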
#4954736 Dynamic branching in shader not working. Keeps jumping out.
Posted by FGBartlett
on 01 July 2012 - 08:25 PM
#4954726 How many average number of threads does a game needs, regardless of simplicity?
Posted by FGBartlett
on 01 July 2012 - 08:07 PM
OpenGL is perfectly happy with multiple contexts sharing the same handle space via sharelisting. (However, not all flavors of OpenGL currently support sharelisting; OpenGL ES, for example, does not, though it might in the future, according to the folks at Khronos.)
For 'desktop' OpenGL, it is often beneficial to have 'prep' threads and a main render thread that consumes resources prepared by those prep threads. But for that to happen, the contexts in each thread must share the same handle space. Prep threads isolate the main render thread from disk and other latency involved in preparing resources; the render thread should only ever deal with prepared resources.
A resource pool manager (that delivers available handles and accepts freed handles), plus sharelisted threads isolated by threadsafe FIFOs, is more than adequate to guarantee collision-free operation without expensive locks. (The only locks required are in the low duty cycle updates to FIFO state and pool manager state; the prep threads and main render thread spend most of their duty cycle prepping and rendering, and little time changing FIFO state, which is simply a matter of updating a couple of integers for head and tail.)
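A pool manager of this kind can be sketched roughly as follows. This is only one possible shape, under stated assumptions: the class name `HandlePool` and its methods are hypothetical, the handles are plain integers that the caller has already allocated (for example with `glGenTextures`), and the lock is held only for the brief update of the free list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical pool of pre-allocated resource handles. It only tracks
// which handles are free; it never creates or destroys GL resources.
class HandlePool {
public:
    explicit HandlePool(std::vector<std::uint32_t> handles)
        : capacity_(handles.size()), free_(std::move(handles)) {}

    // Deliver an available handle. Returns false when the pool is exhausted.
    bool acquire(std::uint32_t& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return false;
        out = free_.back();
        free_.pop_back();
        return true;
    }

    // Accept a freed handle back into the pool.
    void release(std::uint32_t handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(free_.size() < capacity_);  // more releases than acquires is a bug
        free_.push_back(handle);
    }

    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    mutable std::mutex mutex_;          // guards free_ only
    std::size_t capacity_;              // total handles owned by the pool
    std::vector<std::uint32_t> free_;   // handles currently available
};
```

A prep thread would call `acquire` before loading a resource into the handle, and the render thread would call `release` once it no longer needs that resource.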
Heads up with sharelisting: make sure all contexts that are going to be sharelisted are requested before any of them is selected as the current OpenGL context (and, for sure, before any resource handles are allocated among the contexts that will be sharelisted). Sharelisted means 'share the same resource handle space', which is required for multithreaded OpenGL.
Heads up with the design of the threadsafe FIFO: it must use a two-step allocate and release model, because there is finite execution time between when a handle is pulled and when it is prepped or consumed. But that is easily done. A FIFO object is basically tracking a head and a tail in a circular fashion, with some maximum FIFO size. The FIFO should provide booleans for IsFull, IsEmpty, etc.
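The two-step model could look something like the sketch below. It is a guess at one workable design, not the exact FIFO described above: the class name `HandleFifo` and the `beginPush`/`endPush`/`beginPop`/`endPop` names are hypothetical, it assumes one prep thread and one render thread per FIFO, and it keeps a count alongside head and tail to tell full from empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical circular FIFO of resource handles.
// Assumption: a single producer (prep thread) and a single consumer
// (render thread). The lock covers only the head/tail/count updates.
class HandleFifo {
public:
    explicit HandleFifo(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    bool isEmpty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_ == 0;
    }

    bool isFull() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_ == slots_.size();
    }

    // Producer step 1: claim the tail slot. Returns nullptr when full.
    // The slot stays invisible to the consumer until endPush().
    std::uint32_t* beginPush() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_ == slots_.size() ? nullptr : &slots_[tail_];
    }

    // Producer step 2: publish the slot after the resource is prepped.
    void endPush() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(count_ < slots_.size());
        tail_ = (tail_ + 1) % slots_.size();
        ++count_;
    }

    // Consumer step 1: look at the head slot. Returns nullptr when empty.
    // The slot cannot be overwritten until endPop().
    const std::uint32_t* beginPop() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_ == 0 ? nullptr : &slots_[head_];
    }

    // Consumer step 2: hand the slot back after the resource is consumed.
    void endPop() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(count_ > 0);
        head_ = (head_ + 1) % slots_.size();
        --count_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::uint32_t> slots_;
    std::size_t head_ = 0;   // next slot to consume
    std::size_t tail_ = 0;   // next slot to fill
    std::size_t count_ = 0;  // published slots not yet released
};
```

The slow work (loading from disk, uploading, drawing) happens between the begin and end calls, outside the lock, so the lock is only ever held for a handful of integer updates.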
You don't have to do any of that when you write a game. It adds complexity. But it provides performance and behavior you can't achieve in a single-threaded model, which is where you get things like
--------Please wait....scene loading---------...