OpenGL Another multithreading problem with OpenGL

== Short problem description ==

- I can't display textures that were loaded in another thread.

== Long problem description ==

Hello. I've already searched GameDev and found many threads discussing problems with multithreading in OpenGL, and most of the answers suggest not using multithreading with OpenGL at all. But it seems I'm forced to use it, given my current situation:

- I'm already halfway through developing a game using OpenGL in Delphi.
- The set of resource images my program must preload before the game starts is huge, and the list keeps growing. That makes the wait before I can finally show the opening screen longer and longer.
- So I'm thinking about moving part of the loading into a background thread that silently loads the remaining resources while the foreground thread shows the opening screen.
- The multithreading itself works nicely. I just need to keep track of which textures haven't been loaded yet, and when a texture becomes available to the foreground thread, I advance the opening screen to use it. For example: first I load the background screen in the foreground thread. Then I show the background moving and instruct the background thread to load the character's face. So while the background is displayed, the face loads silently. When it has loaded completely, I slowly show the character's face in front of the background and instruct the background thread to load another image, and so on.
- But here the problem arises. I'm quite sure the loading code itself works, because I only moved it and did not modify it, but it seems that *no* texture is being loaded into OpenGL memory, even though the texture ID is valid (not zero). When I draw one of the textures loaded by the background thread, it shows only a white rectangle, as if the texture had never been loaded. I checked the texture ID again, and it is still non-zero (which should mean the texture was loaded successfully).
- I asked a friend who has experienced the same symptom, and he said it really is a limitation of OpenGL: OpenGL is not thread-safe. He said that once we leave the thread that has the "true" OpenGL context, OpenGL calls such as glBindTexture no longer work. What keeps me wondering is that the function that generates texture IDs, glGenTextures, does work! Why does one function work while another doesn't? I still don't understand.
- I've read in another topic that the problem lies in the rendering context, that maybe I loaded the texture into a different context.
- But I've already passed the context that was created in the foreground (renderer) thread to the new background (loader) thread. In the background thread I used that context to load the texture, with functions such as glBindTexture and gluBuild2DMipmaps, but that didn't change anything.

From some of the answers on GameDev, even though they didn't address my problem directly, it seems there may be a workaround for this limitation. Can anyone help? Thanks.

Are you calling glXMakeCurrent, aglSetCurrentContext or the Windows equivalent in the thread before you start issuing drawing commands? You can also try having multiple contexts and sharing data between them; a good reference on this (if you're on Mac OS X) is this.

Update:

Uhhh, alright... reading on, it seems the answer is the call to wglMakeCurrent. But one thing still bothers me. wglMakeCurrent needs an hdc, a handle to the device context. Isn't there only one hdc: the one for the single window that displays my game? If I need to call wglMakeCurrent in the new thread, which handle am I supposed to pass? The old one? Or must I create a new window and get its device context? From the MSDN documentation on wglMakeCurrent, I also don't really understand how wglMakeCurrent relates to multithreading.

@Abdulla:

I'm not really sure what you mean. Are you talking about wglMakeCurrent (the Windows counterpart of glXMakeCurrent) in the foreground thread or in the background thread?

If you mean wglMakeCurrent in the foreground thread, I've definitely called that already. But in the background thread I haven't yet, because I thought that once the context has been set, all subsequent gl and glu calls would use the current context, regardless of which thread makes them. Is that true or not?

Another update:

I've read more thoroughly, and it says that to use multithreading with wglMakeCurrent, I *must* have a rendering context for each thread. But a new rendering context means a new device context, and a new device context means a new window, is that right? I don't really get it...

The following is written without specific knowledge of WGL, so there might be something better out there.

The GL context stores the state of GL. Each thread can have its own GL context, which allows GL to be used in multi-threaded environments as well.

Making a GL context current defines its use by the calling thread! AFAIK it should be possible to use the same GL context in several threads, but it can be current for at most one thread at a time. So if you want to "share" the context you have to ensure exclusive use, e.g. by protecting the context with a semaphore. This also requires the owning thread to release the context as soon as possible to make it available to other threads.

On invocations of wglMakeCurrent, the device context and the render context need not always be the same pair. However, there are constraints: the device context must support GL (of course) and must have a matching pixel format.

Normally, context sharing is provided for display lists, textures, and buffer objects, but I didn't find such a thing in WGL (which doesn't mean there isn't one).

As previous posters said: the OpenGL context is only current for one thread at a time.

The most sensible solution to your problem (I think, since I'm using it [wink]) is to load and decode the texture in the resource thread. Then, give the render thread a signal that there is an image waiting to be uploaded to video memory. The render thread can then look at the list of waiting textures and upload them at the start of each frame.
That way you have the CPU- and I/O-intensive stuff in a separate thread, while the textures are still uploaded by the thread that owns the context and can be used on screen.

Thanks for your replies! Much of the mystery is solved. I still have some questions left, though.


@Haegarr:

- You said that a rendering context can be used by many threads, but only *one* thread can have the rendering context current at a time. Sorry, maybe my knowledge is still limited, but isn't a rendering context made current to the device context, not to a specific thread? And a thread doesn't necessarily own a unique DC, does it? So which device context must I use? My guess is that I just call wglMakeCurrent again with the foreground thread's device context (so both threads use the same device context), but from within the new thread. I mean, the question isn't about the device context itself, it's about which device context I must supply to wglMakeCurrent in the new thread.

But I'll try your valuable advice that the previous thread must release the context before the new thread uses it. I didn't know that!


@DaBono:

- If I must call wglMakeCurrent every frame, doesn't that mean the animation won't run smoothly when the signal is set? What I render in the foreground is not merely a "loading animation" but a full game opening.

But your advice is worth considering, though... Thanks.

In general, the device context defines where the output will be written to, e.g. which window will show the graphics. The render context is OpenGL's collection of all made textures, display lists, VBOs, and so on. It also stores the current depth comparison function, background clear color, whether or not you've currently invoked a glBegin but not a glEnd yet, and much more of those things. In short: It stores the OpenGL rendering state.

Each thread can have at most 1 OpenGL context current. When a thread invokes wglMakeCurrent, it binds the rendering context passed in to the device context, which means that output generated through that rendering context goes to that device context. Moreover, the context is set for the thread from which wglMakeCurrent was called. That means that exactly that thread is now using that rendering context. Until the thread releases the context, no other thread is allowed to invoke wglMakeCurrent with that same context.

In fact, wglMakeCurrent binds 3 parts together: the device context, the rendering context, and the invoking thread. The last of these matters only in multi-threaded environments, and is hence often mentioned with less emphasis.

A thread can deal with multiple device contexts. It is also able to deal with several rendering contexts, but only 1 can be made current at a time. And that rendering context can only be current to 1 thread at a time.


EDIT: I bet that performance will suffer from frequently rebinding contexts. Since WGL seems to lack native context sharing, I think an approach like the one proposed by DaBono would be best if the goal is just to decouple loading a texture from using it; then at most the texture resource itself may need to be protected by a semaphore.

Quote:
Original post by MightyMartin
@DaBono:
- If I must call wglMakeCurrent every frame, doesn't that mean the animation won't run smoothly when the signal is set? What I render in the foreground is not merely a "loading animation" but a full game opening.


You don't need to call wglMakeCurrent each frame, but just once. I'll try to make myself a bit clearer. Your resource thread in pseudo-code will do something like:
while( resourcesToLoad ) {
    // File I/O and decoding only; no OpenGL calls in this thread.
    FileHandle fp = filesystem.open( textureName );
    Image *image = JPGLoader::load(fp);
    textureQueue.push( image );
}
Then, your render thread is something like:
createWindow();
wglMakeCurrent( hdc, hglrc );   // once, in the render thread
InitGL();
while( gameRunning ) {
    // Upload everything the resource thread has finished decoding.
    while( textureQueue.size() > 0 ) {
        Image *image = textureQueue.front();
        textureQueue.pop();     // without this the loop would never end
        glTexImage2D( image, .... );
    }
    renderFrame();
}

So you see, only one call for wglMakeCurrent is needed.
Unfortunately, the code above is a bit too simple (e.g. you need to do proper locking around the queue), but I think you can get started with this.

@Haegarr:

I understand now! It confirms what I feared most: that I must make the rendering context current again every time control switches between drawing the next frame and loading resources. So there's no escape. This means I have to cheat by doing the loading while a more static screen than the opening is shown (e.g. the game logo). Thanks for your help! Your point that wglMakeCurrent actually binds 3 things together really hits the mark!


@DaBono:

Hmmm... I think I understand the concept. You mean putting only the non-OpenGL image processing in the loader thread, so that once the image data is ready, the only work left for the render thread is uploading the texture and generating the mipmaps.

That's a brilliant idea. I don't know whether building the mipmaps is fast enough to keep a smooth frame rate while loading, but it's worth a try. Thanks. :)


Actually... both of your ideas can be combined, Haegarr and DaBono... :)

@Rick Appleton:

I have once again read the links you suggested thoroughly, and found that I can still do it my original way by using wglShareLists. Unfortunately, I've never used display lists or learned anything about them. I've always used the traditional method of getting a new texture ID into a variable with glGenTextures. I'm researching lists in WGL right now, but if someone is willing to share some very basic starting information about using them, such as which functions are commonly used, I'd appreciate it. Thanks! :)

wglShareLists is a bit badly named. It's a holdover from the old days when there weren't any objects except for display lists.

Nowadays we have all kinds of objects in OpenGL (textures, VBOs, GLSL shaders, ASM shaders, etc). All of these are shared with wglShareLists.

So if you call wglShareLists at the correct time in the correct thread, you can just keep using glGenTextures and family, and the textures will be shared between the different contexts.

Quote:
Original post by rick_appleton
wglShareLists is a bit badly named. It's a holdover from the old days when there weren't any objects except for display lists.

Nowadays we have all kinds of objects in OpenGL (textures, VBOs, GLSL shaders, ASM shaders, etc). All of these are shared with wglShareLists.

Ah, that's exactly what I had in mind at the bottom of my 1st reply. Good to know that such context sharing actually _does_ exist in WGL too :)

Quote:
Original post by MightyMartin

That's a brilliant idea. I don't know whether building the mipmaps is fast enough to keep a smooth frame rate while loading, but it's worth a try. Thanks. :)


Actually... both of your ideas can be combined, Haegarr and DaBono... :)


This is also the way I do it. You will be pleased to know that by setting this before glTexImage2D, you can let the graphics card build the mipmaps for you. This is much, much faster than using gluBuild2DMipmaps.

glTexParameteri(GL_TEXTURE_2D,GL_GENERATE_MIPMAP_SGIS,GL_TRUE);

@Rick_appleton, Haegarr:


I've read this tutorial:

http://www.lighthouse3d.com/opengl/displaylists/

and I think I've gained some insight into display lists and wglShareLists. Display lists themselves are part of OpenGL. The only thing that really belongs to WGL is wglShareLists. But I have some general questions:

- From what I've learned, a display list is a kind of macro that stores commands to be used later. Is that right?

- I don't really understand how wglShareLists can "share" textures (and VBOs, GLSL shaders, ASM shaders, etc). Isn't what wglShareLists shares a list of ... command lists? I mean, if wglShareLists shares almost "anything", including textures, then I don't need display lists at all: I just call wglShareLists at the start of the thread, and then, *poof*, all subsequent OpenGL calls like glBindTexture, glGenTextures, and gluBuild2DMipmaps in the background loader thread will be valid. Is that right, or ... not?

- By the way, I have read zppz's tutorial and searched MSDN again, and concluded that wglShareLists (hglrc1, hglrc2) must be called *before* rendering context 2 (hglrc2) loads any resources, such as textures. So, as you said, it's fine if hglrc1 already has some resources loaded; those are shared with hglrc2 when wglShareLists is called. But from what I read, it's not okay if hglrc2 also has resources loaded already. So hglrc2 must have nothing preloaded before the share attempt. I don't know what happens otherwise. I tried to find where I read this, but I couldn't find it again.

- Furthermore, given wglShareLists (which, from what you both describe, should be called wglShareResources... =] ), isn't this really a simple one-for-all solution to OpenGL multithreading? I mean, just call wglShareLists, and then the multithreading model most newbies have in mind will simply work? (Oh! I forgot about wglMakeCurrent! Does it still have to be taken into account too?)

Thanks.

Quote:
Original post by MightyMartin
- From what I've learned, a display list is a kind of macro that stores commands to be used later. Is that right?

Yep. There is some kind of internal compilation that makes executing GL routines (what you've called "commands") more efficient than issuing each one individually. Not all GL routines may be invoked in a display list, and there are some other caveats, too.

Display lists are nice for things like bitmap based fonts (e.g. 1 display list for each glyph), macros for primitive shapes, and similar things.

Quote:
Original post by MightyMartin
- I don't really understand how wglShareLists can "share" textures (and VBOs, GLSL shaders, ASM shaders, etc). Isn't what wglShareLists shares a list of ... command lists? I mean, if wglShareLists shares almost "anything", including textures, then I don't need display lists at all: I just call wglShareLists at the start of the thread, and then, *poof*, all subsequent OpenGL calls like glBindTexture, glGenTextures, and gluBuild2DMipmaps in the background loader thread will be valid. Is that right, or ... not?

Notice that a context share does not share everything. A GL context, as said somewhere above, contains "resources" like display lists, textures, VBOs, ... but also the "render state", like the clear color, whether depth comparison is on and which function it uses, and, well, everything else. Context sharing means sharing the "resources" but not the "render state"!

In the very early days there were no VBOs and no texture objects (i.e. only 1 texture at a time), but there were already display lists. The routine wglShareLists originates from that time. The set of shared resources has grown since then, but the name of the routine was never changed. (Notice that I repeat rick_appleton's statement here.)

glBindTexture is executed in a thread and alters the "render state" but not yet the "resources", so it has no shared effect. _But_ if the thread executes glGenTextures, then a texture object is allocated that is available across the shared contexts. If a thread then executes glTexImage, the texture object is filled with data, and hence the pixel data becomes available across the shared contexts. That's the way it works.

Quote:
Original post by MightyMartin
- By the way, I have read zppz's tutorial and searched MSDN again, and concluded that wglShareLists (hglrc1, hglrc2) must be called *before* rendering context 2 (hglrc2) loads any resources, such as textures. So, as you said, it's fine if hglrc1 already has some resources loaded; those are shared with hglrc2 when wglShareLists is called. But from what I read, it's not okay if hglrc2 also has resources loaded already. So hglrc2 must have nothing preloaded before the share attempt. I don't know what happens otherwise. I tried to find where I read this, but I couldn't find it again.

In all other windowing systems I know (X Window's GLX, Mac OS X's AGL, and AFAIK OS/2's PGL), you must specify the shared context already when _creating_ another context. Only in WGL is there such a problem of "when to do the sharing". Well, I definitely suggest performing the sharing right after creating the 2nd context! To be clear: it's better to tie sharing not to the start of the 2nd thread but to the creation of the 2nd context.

Why share _before_ allocating local resources: e.g. if both contexts have already allocated a (different) texture with glGenTextures, how could they share them if both have texture name 1? It couldn't work.

Quote:
Original post by MightyMartin
- Furthermore, given wglShareLists (which, from what you both describe, should be called wglShareResources... =] ), isn't this really a simple one-for-all solution to OpenGL multithreading? I mean, just call wglShareLists, and then the multithreading model most newbies have in mind will simply work?

Sorry, but I don't understand what you mean.

Quote:
Original post by MightyMartin
(Oh! I forgot about wglMakeCurrent! Does it still have to be taken into account too?)

wglMakeCurrent must still be executed, you're right. Most GL functions do not work otherwise. However, you normally have to invoke it only once per thread.

First a thanks to Haegarr for the explanations. They are more detailed than I could be bothered to do, and are pretty much spot on.

I'll quickly go through your list of questions.
Quote:
Original post by MightyMartin
- From what I've learned, a display list is a kind of macro that stores commands to be used later. Is that right?

That's pretty much correct, yes.

Quote:
Original post by MightyMartin
- I don't really understand how wglShareLists can "share" textures (and VBOs, GLSL shaders, ASM shaders, etc). Isn't what wglShareLists shares a list of ... command lists? I mean, if wglShareLists shares almost "anything", including textures, then I don't need display lists at all: I just call wglShareLists at the start of the thread, and then, *poof*, all subsequent OpenGL calls like glBindTexture, glGenTextures, and gluBuild2DMipmaps in the background loader thread will be valid. Is that right, or ... not?

Like Haegarr said, resources will be shared. OpenGL has state, and that is not shared.

Quote:
Original post by MightyMartin
- By the way, I have read zppz's tutorial and searched MSDN again, and concluded that wglShareLists (hglrc1, hglrc2) must be called *before* rendering context 2 (hglrc2) loads any resources, such as textures. So, as you said, it's fine if hglrc1 already has some resources loaded; those are shared with hglrc2 when wglShareLists is called. But from what I read, it's not okay if hglrc2 also has resources loaded already. So hglrc2 must have nothing preloaded before the share attempt. I don't know what happens otherwise. I tried to find where I read this, but I couldn't find it again.

This is indeed what most sources of information say. From my tests however, it appears that all resources loaded in either thread at any time are shared across all contexts. It is probably safest to not rely on this fact though, so you should just code as if only resources loaded after wglShareLists are shared. The issue mentioned by Haegarr could theoretically happen, but I suspect that the drivers aren't aware of this up to that point. I haven't tested this, but I wouldn't be surprised if you create two textures in two totally unrelated apps, you actually get texture IDs 1 and 2 for example, and not 1 and 1.

Quote:
Original post by MightyMartin
- Furthermore, given wglShareLists (which, from what you both describe, should be called wglShareResources... =] ), isn't this really a simple one-for-all solution to OpenGL multithreading? I mean, just call wglShareLists, and then the multithreading model most newbies have in mind will simply work?

It does indeed seem to work that way. Note that you still have to be careful not to use a texture at the same time as you are uploading its data in another thread. Something like that will likely lead to 'undefined' behaviour, which basically means that anything can happen.

Quote:
Original post by MightyMartin
(Oh! I forgot about wglMakeCurrent! Does it still have to be taken into account too?)

As Haegarr already mentioned, you still have to call wglMakeCurrent once from inside each thread.

About the usage of wglMakeCurrent:

Can I do something like this:

As a global variable:

int protection = 0;


In the foreground (renderer) thread:

void mainloop ()
{
    if (protection != 0) return;
    // entering the main loop, set the flag
    protection = 1;

    wglMakeCurrent (hDC1, hglrc1);

    // do the drawing here
    ...
    ...

    wglMakeCurrent (NULL, NULL);

    // release the protection
    protection = 0;
}


And in the background (loader) thread:



// this is used to check whether we may "make current"
int protector ()
{
    if (protection != 0) return 0;
    protection = 2;
    wglMakeCurrent (hDC1, hglrc1);
    return 1;
}

void releaseprotection ()
{
    protection = 0;
    wglMakeCurrent (NULL, NULL);
}

void backgroundthread ()
{
    // the background thread is stuck here until the foreground thread has
    // finished its cycle
    while (!protector()) {}
    LoadTexture1 ();
    releaseprotection ();

    // the background thread is stuck here until the foreground thread has
    // run once
    while (protection != 1) {}

    while (!protector()) {}
    LoadTexture2 ();
    releaseprotection ();

    while (protection != 1) {}

    while (!protector()) {}
    LoadTexture3 ();
    releaseprotection ();

    ...
    ...
    ...
}


So this way, the foreground thread still has some time to render the next frame between texture loads. But, Haegarr, you said this solution will hurt performance because of the heavy use of wglMakeCurrent, right?


Quote:
A thread can deal with multiple device contexts. It is also able to deal with several rendering contexts, but only 1 can be made current at a time. And that rendering context can only be current to 1 thread at a time.


By the way, going by your statement, is it possible to have wglMakeCurrent (hDC1, hglrc1) in the first thread and wglMakeCurrent (hDC2, hglrc2) in the second thread in effect at the same time? (where hDC1 != hDC2 and hglrc1 != hglrc2)


Quote:
As Haegarr already mentioned, you still have to call wglMakeCurrent once from inside each thread.


Only once inside each thread? Do you mean only at the beginning of the thread? Don't we need to call wglMakeCurrent each time execution switches between threads?

Quote:
Original post by MightyMartin
About the usage of wglMakeCurrent:

Can I do something like this:
code removed

First: I can't guarantee that the following really holds for the OS and development environment you use, but I think it does. Even if you believe it doesn't, you should verify that before relying on it. Ignoring the aspect below normally results in sporadic occurrences of very strange behaviour, and is hence very hard to debug.

You use a global variable "protection" here as a semaphore. While that is okay in principle, you have to notice that accesses to a normal variable are neither synchronized nor atomic. The latter means that while one thread is reading, modifying, and writing the variable, the thread scheduler may stop its execution and switch to another thread. It's even more problematic with HT, dual core, and so on. That other thread may then access the same variable before the 1st thread was able to complete its operation. When the 1st thread is scheduled again later, everything seems okay, but now both threads believe they are allowed to use the context. This situation is very, very bad, believe me! Look for a real implementation of thread synchronization for Windows. I assume the OS already provides such stuff. I don't program under Windows (yet), so I can't tell you exactly what to use, sorry. Look for the keywords "semaphore" and perhaps "critical section" w.r.t. multi-threading.

Quote:
Original post by MightyMartin
Quote:
A thread can deal with multiple device contexts. It is also able to deal with several rendering contexts, but only 1 can be made current at a time. And that rendering context can only be current to 1 thread at a time.

By the way, going by your statement, is it possible to have wglMakeCurrent (hDC1, hglrc1) in the first thread and wglMakeCurrent (hDC2, hglrc2) in the second thread in effect at the same time? (where hDC1 != hDC2 and hglrc1 != hglrc2)

Absolutely. But you may need to pay attention when accessing resources in the shared contexts, as rick_appleton has already mentioned somewhere earlier.

Quote:
Original post by MightyMartin
Quote:
As Haegarr already mentioned, you still have to call wglMakeCurrent once from inside each thread.

Only once inside each thread? Do you mean only at the beginning of the thread? Don't we need to call wglMakeCurrent each time execution switches between threads?

After successfully making a context current you can use it until the executing thread invokes wglMakeCurrent(NULL,NULL) or the application goes down. It is costly (and perhaps impossible?) to detect thread switching from within the thread, and since releasing a render context forces it to flush, that approach becomes impractical for your needs anyway. That is why you should use thread synchronization (e.g. with a semaphore or the like) where needed. But, as said, using shared contexts is the better solution (which doesn't mean that you won't still need to synchronize the threads in one way or another).

(EDIT: typo)

[Edited by - haegarr on November 28, 2006 4:42:14 AM]

Haegarr has touched on all your issues already so I'll just leave you with a suggestion and a promise.

If you're not using Boost yet, I suggest you look into it. It has a large collection of useful modules, among which a Threads library which is supposed to be more or less portable to other OSs. I used that to implement my multithreaded app.

Second, when I have time at home again (probably not before Friday unfortunately) I'll have a look and see if I can find my simple testbed I used when starting to implement multi-threaded loading in OpenGL. If I can find it, I'll post it here.

I don't think this came up in this thread, and I'm not aware of the state of current drivers from ATI or nVidia, but there used to be a memory leak when you called wglMakeCurrent "too much".
It almost looked as if the drivers implemented a stack of contexts and wglMakeCurrent just pushed a context onto that stack. So always call wglMakeCurrent(0,0) before calling it with a real context.

ch.

Quote:
Original post by christian h
I don't think this came up in this thread, and I'm not aware of the state of current drivers from ATI or nVidia, but there used to be a memory leak when you called wglMakeCurrent "too much".
It almost looked as if the drivers implemented a stack of contexts and wglMakeCurrent just pushed a context onto that stack. So always call wglMakeCurrent(0,0) before calling it with a real context.

ch.


Well, the plan is to only call it once per thread, so this shouldn't really be a problem. Also, I think I remember someone saying this has finally been fixed.

@christian h:

It's okay. I've managed to accostumed myself to always make the current context is not current before calling wglMakeCurrent. In fact, I just happen to know right now that calling to wglMakeCurrent when the current context is still current, IS possible. :) many thanks for the info.


@rick_appleton, haegarr:

Once again, thanks for the thorough explanation. My programming team and I will spend a couple of days trying the solution you both explained, to see if we can really implement it, because the deadline is getting closer. The game material itself is nearly finished. The only critical problem is that loading takes too long (around 10 - 20 seconds on a high-end computer; we can't imagine what happens when a low-end computer runs our game!). So if we can make it in time, then hurray! :)

By the way, Rick, is the "Boost" you mentioned the library from www.boost.org? If it is, then it's really a vast library! Unfortunately, it's written in C++. Since I'm actually using Delphi (Pascal), the library can only serve as a reference for me. But that's more than enough.

And, based on haegarr's info, it's possible to have more than one context current at the same time, under these conditions:
1. they're not current in the same thread (absolutely!)
2. they don't use the same device context
3. they're not the same rendering context

So, as I imagine it, the pseudo code would look like this:


int foreground_thread_init = 0;
int background_thread_init = 0;

void foreground_thread_mainloop()
{
    if (!foreground_thread_init)
    {
        // the first wglMakeCurrent
        hDC1 = GetDC(hWnd1);               // device context of the 1st window
        set the pixel format on hDC1;
        hglrc1 = wglCreateContext(hDC1);
        wglMakeCurrent(hDC1, hglrc1);
        foreground_thread_init = 1;
        run background_thread_run_only_once() in a separate thread;
    }

    // draw using whichever textures the background thread
    // has already loaded successfully
    just do some drawing things here;
}

void background_thread_run_only_once()
{
    // the second wglMakeCurrent
    hDC2 = GetDC(hWnd2);                   // device context of the 2nd window
    set the same pixel format on hDC2;
    hglrc2 = wglCreateContext(hDC2);
    // share textures and display lists with hglrc1;
    // hglrc2 must not own any objects of its own yet
    wglShareLists(hglrc1, hglrc2);
    wglMakeCurrent(hDC2, hglrc2);
    background_thread_init = 1;

    // start the main process
    load texture 1;
    load texture 2;
    load texture 3;
    ...
    ...
}


The main point of the code above is that I can call the second wglMakeCurrent in the background thread without worrying about the first wglMakeCurrent call, because it uses a different hglrc, hDC, hWnd, and thread. All I need to worry about is that hglrc1 and hglrc2 are compatible with each other so that I can share between the contexts, right? I mean, this looks almost identical to my original source code, except that I must add the initialization of a new hWnd, hDC, and hglrc in the background thread.
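
The other half of the plan, the foreground thread drawing only what the background thread has finished loading, could be sketched roughly as below. This is C++ rather than Delphi, and it is only an illustration under assumptions: LoadedTexture, LoadedQueue, uploadTexture, backgroundLoader and runDemo are made-up names, and uploadTexture is a placeholder that hands out fake IDs instead of making real OpenGL calls.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical record describing one finished texture.
struct LoadedTexture {
    std::string name;
    unsigned    id;    // would be the OpenGL texture name
};

// Mutex-protected queue: the loader pushes, the render loop polls.
class LoadedQueue {
public:
    void push(LoadedTexture t) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(t));
    }
    bool tryPop(LoadedTexture& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<LoadedTexture> items_;
};

// Placeholder for the real work. In the game this would decode the file and
// call glGenTextures/glTexImage2D in the loader's own context, typically
// followed by glFinish so the upload has completed before the ID is shared.
unsigned uploadTexture(const std::string& name) {
    static unsigned nextId = 1;   // only touched by the loader thread
    (void)name;
    return nextId++;
}

// Background thread: load everything, announcing each texture when done.
void backgroundLoader(const std::vector<std::string>& names, LoadedQueue& done) {
    for (const std::string& n : names)
        done.push({n, uploadTexture(n)});
}

// Foreground side: start the loader and collect textures as they arrive.
std::vector<LoadedTexture> runDemo(const std::vector<std::string>& names) {
    LoadedQueue done;
    std::thread loader(backgroundLoader, std::cref(names), std::ref(done));
    std::vector<LoadedTexture> ready;
    LoadedTexture t;
    while (ready.size() < names.size()) {
        if (done.tryPop(t)) ready.push_back(t);   // texture may now be drawn
        else std::this_thread::yield();           // real loop: draw a frame
    }
    loader.join();
    return ready;
}
```

The foreground loop never blocks on the loader; it just checks the queue once per frame and starts using a texture from the moment its entry shows up.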

  • Forum Statistics

    • Total Topics
      628764
    • Total Posts
      2984575
  • Similar Content

    • By test opty
      Hi,

      Please read the Linking Vertex Attributes section of this page. I've read the term a vertex attribute many times in this page but I'm not sure what it means for real or what the author meant by that.
      If possible please tell me the exact meaning the author meant by vertex attributes.
    • By alex1997
      I'm looking to render multiple objects (rectangles) with different shaders. So far I've managed to render one rectangle made out of 2 triangles and apply shader to it, but when it comes to render another I get stucked. Searched for documentations or stuffs that could help me, but everything shows how to render only 1 object. Any tips or help is highly appreciated, thanks!
      Here's my code for rendering one object with shader!
       
      #define GLEW_STATIC #include <stdio.h> #include <GL/glew.h> #include <GLFW/glfw3.h> #include "window.h" #define GLSL(src) "#version 330 core\n" #src // #define ASSERT(expression, msg) if(expression) {fprintf(stderr, "Error on line %d: %s\n", __LINE__, msg);return -1;} int main() { // Init GLFW if (glfwInit() != GL_TRUE) { std::cerr << "Failed to initialize GLFW\n" << std::endl; exit(EXIT_FAILURE); } // Create a rendering window with OpenGL 3.2 context glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3); glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 2); glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE); glfwWindowHint(GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE); glfwWindowHint(GLFW_RESIZABLE, GL_FALSE); // assing window pointer GLFWwindow *window = glfwCreateWindow(800, 600, "OpenGL", NULL, NULL); glfwMakeContextCurrent(window); // Init GLEW glewExperimental = GL_TRUE; if (glewInit() != GLEW_OK) { std::cerr << "Failed to initialize GLEW\n" << std::endl; exit(EXIT_FAILURE); } // ----------------------------- RESOURCES ----------------------------- // // create gl data const GLfloat positions[8] = { -0.5f, -0.5f, 0.5f, -0.5f, 0.5f, 0.5f, -0.5f, 0.5f, }; const GLuint elements[6] = { 0, 1, 2, 2, 3, 0 }; // Create Vertex Array Object GLuint vao; glGenVertexArrays(1, &vao); glBindVertexArray(vao); // Create a Vertex Buffer Object and copy the vertex data to it GLuint vbo; glGenBuffers(1, &vbo); glBindBuffer(GL_ARRAY_BUFFER, vbo); glBufferData(GL_ARRAY_BUFFER, sizeof(positions), positions, GL_STATIC_DRAW); // Specify the layout of the vertex data glEnableVertexAttribArray(0); // layout(location = 0) glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0); // Create a Elements Buffer Object and copy the elements data to it GLuint ebo; glGenBuffers(1, &ebo); glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo); glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(elements), elements, GL_STATIC_DRAW); // Create and compile the vertex shader const GLchar *vertexSource = GLSL( layout(location = 
0) in vec2 position; void main() { gl_Position = vec4(position, 0.0, 1.0); } ); GLuint vertexShader = glCreateShader(GL_VERTEX_SHADER); glShaderSource(vertexShader, 1, &vertexSource, NULL); glCompileShader(vertexShader); // Create and compile the fragment shader const char* fragmentSource = GLSL( out vec4 gl_FragColor; uniform vec2 u_resolution; void main() { vec2 pos = gl_FragCoord.xy / u_resolution; gl_FragColor = vec4(1.0); } ); GLuint fragmentShader = glCreateShader(GL_FRAGMENT_SHADER); glShaderSource(fragmentShader, 1, &fragmentSource, NULL); glCompileShader(fragmentShader); // Link the vertex and fragment shader into a shader program GLuint shaderProgram = glCreateProgram(); glAttachShader(shaderProgram, vertexShader); glAttachShader(shaderProgram, fragmentShader); glLinkProgram(shaderProgram); glUseProgram(shaderProgram); // get uniform's id by name and set value GLint uRes = glGetUniformLocation(shaderProgram, "u_Resolution"); glUniform2f(uRes, 800.0f, 600.0f); // ---------------------------- RENDERING ------------------------------ // while(!glfwWindowShouldClose(window)) { // Clear the screen to black glClear(GL_COLOR_BUFFER_BIT); glClearColor(0.0f, 0.5f, 1.0f, 1.0f); // Draw a rectangle made of 2 triangles -> 6 vertices glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, NULL); // Swap buffers and poll window events glfwSwapBuffers(window); glfwPollEvents(); } // ---------------------------- CLEARING ------------------------------ // // Delete allocated resources glDeleteProgram(shaderProgram); glDeleteShader(fragmentShader); glDeleteShader(vertexShader); glDeleteBuffers(1, &vbo); glDeleteVertexArrays(1, &vao); return 0; }  
    • By Vortez
      Hi guys, im having a little problem fixing a bug in my program since i multi-threaded it. The app is a little video converter i wrote for fun. To help you understand the problem, ill first explain how the program is made. Im using Delphi to do the GUI/Windows part of the code, then im loading a c++ dll for the video conversion. The problem is not related to the video conversion, but with OpenGL only. The code work like this:

       
      DWORD WINAPI JobThread(void *params) { for each files { ... _ConvertVideo(input_name, output_name); } } void EXP_FUNC _ConvertVideo(char *input_fname, char *output_fname) { // Note that im re-initializing and cleaning up OpenGL each time this function is called... CGLEngine GLEngine; ... // Initialize OpenGL GLEngine.Initialize(render_wnd); GLEngine.CreateTexture(dst_width, dst_height, 4); // decode the video and render the frames... for each frames { ... GLEngine.UpdateTexture(pY, pU, pV); GLEngine.Render(); } cleanup: GLEngine.DeleteTexture(); GLEngine.Shutdown(); // video cleanup code... }  
      With a single thread, everything work fine. The problem arise when im starting the thread for a second time, nothing get rendered, but the encoding work fine. For example, if i start the thread with 3 files to process, all of them render fine, but if i start the thread again (with the same batch of files or not...), OpenGL fail to render anything.
      Im pretty sure it has something to do with the rendering context (or maybe the window DC?). Here a snippet of my OpenGL class:
      bool CGLEngine::Initialize(HWND hWnd) { hDC = GetDC(hWnd); if(!SetupPixelFormatDescriptor(hDC)){ ReleaseDC(hWnd, hDC); return false; } hRC = wglCreateContext(hDC); wglMakeCurrent(hDC, hRC); // more code ... return true; } void CGLEngine::Shutdown() { // some code... if(hRC){wglDeleteContext(hRC);} if(hDC){ReleaseDC(hWnd, hDC);} hDC = hRC = NULL; }  
      The full source code is available here. The most relevant files are:
      -OpenGL class (header / source)
      -Main code (header / source)
       
      Thx in advance if anyone can help me.
    • By DiligentDev
      This article uses material originally posted on Diligent Graphics web site.
      Introduction
      Graphics APIs have come a long way from small set of basic commands allowing limited control of configurable stages of early 3D accelerators to very low-level programming interfaces exposing almost every aspect of the underlying graphics hardware. Next-generation APIs, Direct3D12 by Microsoft and Vulkan by Khronos are relatively new and have only started getting widespread adoption and support from hardware vendors, while Direct3D11 and OpenGL are still considered industry standard. New APIs can provide substantial performance and functional improvements, but may not be supported by older hardware. An application targeting wide range of platforms needs to support Direct3D11 and OpenGL. New APIs will not give any advantage when used with old paradigms. It is totally possible to add Direct3D12 support to an existing renderer by implementing Direct3D11 interface through Direct3D12, but this will give zero benefits. Instead, new approaches and rendering architectures that leverage flexibility provided by the next-generation APIs are expected to be developed.
      There are at least four APIs (Direct3D11, Direct3D12, OpenGL/GLES, Vulkan, plus Apple's Metal for iOS and osX platforms) that a cross-platform 3D application may need to support. Writing separate code paths for all APIs is clearly not an option for any real-world application and the need for a cross-platform graphics abstraction layer is evident. The following is the list of requirements that I believe such layer needs to satisfy:
      Lightweight abstractions: the API should be as close to the underlying native APIs as possible to allow an application leverage all available low-level functionality. In many cases this requirement is difficult to achieve because specific features exposed by different APIs may vary considerably. Low performance overhead: the abstraction layer needs to be efficient from performance point of view. If it introduces considerable amount of overhead, there is no point in using it. Convenience: the API needs to be convenient to use. It needs to assist developers in achieving their goals not limiting their control of the graphics hardware. Multithreading: ability to efficiently parallelize work is in the core of Direct3D12 and Vulkan and one of the main selling points of the new APIs. Support for multithreading in a cross-platform layer is a must. Extensibility: no matter how well the API is designed, it still introduces some level of abstraction. In some cases the most efficient way to implement certain functionality is to directly use native API. The abstraction layer needs to provide seamless interoperability with the underlying native APIs to provide a way for the app to add features that may be missing. Diligent Engine is designed to solve these problems. Its main goal is to take advantages of the next-generation APIs such as Direct3D12 and Vulkan, but at the same time provide support for older platforms via Direct3D11, OpenGL and OpenGLES. Diligent Engine exposes common C++ front-end for all supported platforms and provides interoperability with underlying native APIs. It also supports integration with Unity and is designed to be used as graphics subsystem in a standalone game engine, Unity native plugin or any other 3D application. Full source code is available for download at GitHub and is free to use.
      Overview
      Diligent Engine API takes some features from Direct3D11 and Direct3D12 as well as introduces new concepts to hide certain platform-specific details and make the system easy to use. It contains the following main components:
      Render device (IRenderDevice  interface) is responsible for creating all other objects (textures, buffers, shaders, pipeline states, etc.).
      Device context (IDeviceContext interface) is the main interface for recording rendering commands. Similar to Direct3D11, there are immediate context and deferred contexts (which in Direct3D11 implementation map directly to the corresponding context types). Immediate context combines command queue and command list recording functionality. It records commands and submits the command list for execution when it contains sufficient number of commands. Deferred contexts are designed to only record command lists that can be submitted for execution through the immediate context.
      An alternative way to design the API would be to expose command queue and command lists directly. This approach however does not map well to Direct3D11 and OpenGL. Besides, some functionality (such as dynamic descriptor allocation) can be much more efficiently implemented when it is known that a command list is recorded by a certain deferred context from some thread.
      The approach taken in the engine does not limit scalability as the application is expected to create one deferred context per thread, and internally every deferred context records a command list in lock-free fashion. At the same time this approach maps well to older APIs.
      In current implementation, only one immediate context that uses default graphics command queue is created. To support multiple GPUs or multiple command queue types (compute, copy, etc.), it is natural to have one immediate contexts per queue. Cross-context synchronization utilities will be necessary.
      Swap Chain (ISwapChain interface). Swap chain interface represents a chain of back buffers and is responsible for showing the final rendered image on the screen.
      Render device, device contexts and swap chain are created during the engine initialization.
      Resources (ITexture and IBuffer interfaces). There are two types of resources - textures and buffers. There are many different texture types (2D textures, 3D textures, texture array, cubmepas, etc.) that can all be represented by ITexture interface.
      Resources Views (ITextureView and IBufferView interfaces). While textures and buffers are mere data containers, texture views and buffer views describe how the data should be interpreted. For instance, a 2D texture can be used as a render target for rendering commands or as a shader resource.
      Pipeline State (IPipelineState interface). GPU pipeline contains many configurable stages (depth-stencil, rasterizer and blend states, different shader stage, etc.). Direct3D11 uses coarse-grain objects to set all stage parameters at once (for instance, a rasterizer object encompasses all rasterizer attributes), while OpenGL contains myriad functions to fine-grain control every individual attribute of every stage. Both methods do not map very well to modern graphics hardware that combines all states into one monolithic state under the hood. Direct3D12 directly exposes pipeline state object in the API, and Diligent Engine uses the same approach.
      Shader Resource Binding (IShaderResourceBinding interface). Shaders are programs that run on the GPU. Shaders may access various resources (textures and buffers), and setting correspondence between shader variables and actual resources is called resource binding. Resource binding implementation varies considerably between different API. Diligent Engine introduces a new object called shader resource binding that encompasses all resources needed by all shaders in a certain pipeline state.
      API Basics
      Creating Resources
      Device resources are created by the render device. The two main resource types are buffers, which represent linear memory, and textures, which use memory layouts optimized for fast filtering. Graphics APIs usually have a native object that represents linear buffer. Diligent Engine uses IBuffer interface as an abstraction for a native buffer. To create a buffer, one needs to populate BufferDesc structure and call IRenderDevice::CreateBuffer() method as in the following example:
      BufferDesc BuffDesc; BufferDesc.Name = "Uniform buffer"; BuffDesc.BindFlags = BIND_UNIFORM_BUFFER; BuffDesc.Usage = USAGE_DYNAMIC; BuffDesc.uiSizeInBytes = sizeof(ShaderConstants); BuffDesc.CPUAccessFlags = CPU_ACCESS_WRITE; m_pDevice->CreateBuffer( BuffDesc, BufferData(), &m_pConstantBuffer ); While there is usually just one buffer object, different APIs use very different approaches to represent textures. For instance, in Direct3D11, there are ID3D11Texture1D, ID3D11Texture2D, and ID3D11Texture3D objects. In OpenGL, there is individual object for every texture dimension (1D, 2D, 3D, Cube), which may be a texture array, which may also be multisampled (i.e. GL_TEXTURE_2D_MULTISAMPLE_ARRAY). As a result there are nine different GL texture types that Diligent Engine may create under the hood. In Direct3D12, there is only one resource interface. Diligent Engine hides all these details in ITexture interface. There is only one  IRenderDevice::CreateTexture() method that is capable of creating all texture types. Dimension, format, array size and all other parameters are specified by the members of the TextureDesc structure:
      TextureDesc TexDesc; TexDesc.Name = "My texture 2D"; TexDesc.Type = TEXTURE_TYPE_2D; TexDesc.Width = 1024; TexDesc.Height = 1024; TexDesc.Format = TEX_FORMAT_RGBA8_UNORM; TexDesc.Usage = USAGE_DEFAULT; TexDesc.BindFlags = BIND_SHADER_RESOURCE | BIND_RENDER_TARGET | BIND_UNORDERED_ACCESS; TexDesc.Name = "Sample 2D Texture"; m_pRenderDevice->CreateTexture( TexDesc, TextureData(), &m_pTestTex ); If native API supports multithreaded resource creation, textures and buffers can be created by multiple threads simultaneously.
      Interoperability with native API provides access to the native buffer/texture objects and also allows creating Diligent Engine objects from native handles. It allows applications seamlessly integrate native API-specific code with Diligent Engine.
      Next-generation APIs allow fine level-control over how resources are allocated. Diligent Engine does not currently expose this functionality, but it can be added by implementing IResourceAllocator interface that encapsulates specifics of resource allocation and providing this interface to CreateBuffer() or CreateTexture() methods. If null is provided, default allocator should be used.
      Initializing the Pipeline State
      As it was mentioned earlier, Diligent Engine follows next-gen APIs to configure the graphics/compute pipeline. One big Pipelines State Object (PSO) encompasses all required states (all shader stages, input layout description, depth stencil, rasterizer and blend state descriptions etc.). This approach maps directly to Direct3D12/Vulkan, but is also beneficial for older APIs as it eliminates pipeline misconfiguration errors. With many individual calls tweaking various GPU pipeline settings it is very easy to forget to set one of the states or assume the stage is already properly configured when in fact it is not. Using pipeline state object helps avoid these problems as all stages are configured at once.
      Creating Shaders
      While in earlier APIs shaders were bound separately, in the next-generation APIs as well as in Diligent Engine shaders are part of the pipeline state object. The biggest challenge when authoring shaders is that Direct3D and OpenGL/Vulkan use different shader languages (while Apple uses yet another language in their Metal API). Maintaining two versions of every shader is not an option for real applications and Diligent Engine implements shader source code converter that allows shaders authored in HLSL to be translated to GLSL. To create a shader, one needs to populate ShaderCreationAttribs structure. SourceLanguage member of this structure tells the system which language the shader is authored in:
      SHADER_SOURCE_LANGUAGE_DEFAULT - The shader source language matches the underlying graphics API: HLSL for Direct3D11/Direct3D12 mode, and GLSL for OpenGL and OpenGLES modes. SHADER_SOURCE_LANGUAGE_HLSL - The shader source is in HLSL. For OpenGL and OpenGLES modes, the source code will be converted to GLSL. SHADER_SOURCE_LANGUAGE_GLSL - The shader source is in GLSL. There is currently no GLSL to HLSL converter, so this value should only be used for OpenGL and OpenGLES modes. There are two ways to provide the shader source code. The first way is to use Source member. The second way is to provide a file path in FilePath member. Since the engine is entirely decoupled from the platform and the host file system is platform-dependent, the structure exposes pShaderSourceStreamFactory member that is intended to provide the engine access to the file system. If FilePath is provided, shader source factory must also be provided. If the shader source contains any #include directives, the source stream factory will also be used to load these files. The engine provides default implementation for every supported platform that should be sufficient in most cases. Custom implementation can be provided when needed.
      When sampling a texture in a shader, the texture sampler was traditionally specified as separate object that was bound to the pipeline at run time or set as part of the texture object itself. However, in most cases it is known beforehand what kind of sampler will be used in the shader. Next-generation APIs expose new type of sampler called static sampler that can be initialized directly in the pipeline state. Diligent Engine exposes this functionality: when creating a shader, textures can be assigned static samplers. If static sampler is assigned, it will always be used instead of the one initialized in the texture shader resource view. To initialize static samplers, prepare an array of StaticSamplerDesc structures and initialize StaticSamplers and NumStaticSamplers members. Static samplers are more efficient and it is highly recommended to use them whenever possible. On older APIs, static samplers are emulated via generic sampler objects.
      The following is an example of shader initialization:
      ShaderCreationAttribs Attrs; Attrs.Desc.Name = "MyPixelShader"; Attrs.FilePath = "MyShaderFile.fx"; Attrs.SearchDirectories = "shaders;shaders\\inc;"; Attrs.EntryPoint = "MyPixelShader"; Attrs.Desc.ShaderType = SHADER_TYPE_PIXEL; Attrs.SourceLanguage = SHADER_SOURCE_LANGUAGE_HLSL; BasicShaderSourceStreamFactory BasicSSSFactory(Attrs.SearchDirectories); Attrs.pShaderSourceStreamFactory = &BasicSSSFactory; ShaderVariableDesc ShaderVars[] = {     {"g_StaticTexture", SHADER_VARIABLE_TYPE_STATIC},     {"g_MutableTexture", SHADER_VARIABLE_TYPE_MUTABLE},     {"g_DynamicTexture", SHADER_VARIABLE_TYPE_DYNAMIC} }; Attrs.Desc.VariableDesc = ShaderVars; Attrs.Desc.NumVariables = _countof(ShaderVars); Attrs.Desc.DefaultVariableType = SHADER_VARIABLE_TYPE_STATIC; StaticSamplerDesc StaticSampler; StaticSampler.Desc.MinFilter = FILTER_TYPE_LINEAR; StaticSampler.Desc.MagFilter = FILTER_TYPE_LINEAR; StaticSampler.Desc.MipFilter = FILTER_TYPE_LINEAR; StaticSampler.TextureName = "g_MutableTexture"; Attrs.Desc.NumStaticSamplers = 1; Attrs.Desc.StaticSamplers = &StaticSampler; ShaderMacroHelper Macros; Macros.AddShaderMacro("USE_SHADOWS", 1); Macros.AddShaderMacro("NUM_SHADOW_SAMPLES", 4); Macros.Finalize(); Attrs.Macros = Macros; RefCntAutoPtr<IShader> pShader; m_pDevice->CreateShader( Attrs, &pShader );
      Creating the Pipeline State Object
      After all required shaders are created, the rest of the fields of the PipelineStateDesc structure provide depth-stencil, rasterizer, and blend state descriptions, the number and format of render targets, input layout format, etc. For instance, rasterizer state can be described as follows:
      PipelineStateDesc PSODesc; RasterizerStateDesc &RasterizerDesc = PSODesc.GraphicsPipeline.RasterizerDesc; RasterizerDesc.FillMode = FILL_MODE_SOLID; RasterizerDesc.CullMode = CULL_MODE_NONE; RasterizerDesc.FrontCounterClockwise = True; RasterizerDesc.ScissorEnable = True; RasterizerDesc.AntialiasedLineEnable = False; Depth-stencil and blend states are defined in a similar fashion.
      Another important thing that pipeline state object encompasses is the input layout description that defines how inputs to the vertex shader, which is the very first shader stage, should be read from the memory. Input layout may define several vertex streams that contain values of different formats and sizes:
      // Define input layout InputLayoutDesc &Layout = PSODesc.GraphicsPipeline.InputLayout; LayoutElement TextLayoutElems[] = {     LayoutElement( 0, 0, 3, VT_FLOAT32, False ),     LayoutElement( 1, 0, 4, VT_UINT8, True ),     LayoutElement( 2, 0, 2, VT_FLOAT32, False ), }; Layout.LayoutElements = TextLayoutElems; Layout.NumElements = _countof( TextLayoutElems ); Finally, pipeline state defines primitive topology type. When all required members are initialized, a pipeline state object can be created by IRenderDevice::CreatePipelineState() method:
      // Define shader and primitive topology PSODesc.GraphicsPipeline.PrimitiveTopologyType = PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE; PSODesc.GraphicsPipeline.pVS = pVertexShader; PSODesc.GraphicsPipeline.pPS = pPixelShader; PSODesc.Name = "My pipeline state"; m_pDev->CreatePipelineState(PSODesc, &m_pPSO); When PSO object is bound to the pipeline, the engine invokes all API-specific commands to set all states specified by the object. In case of Direct3D12 this maps directly to setting the D3D12 PSO object. In case of Direct3D11, this involves setting individual state objects (such as rasterizer and blend states), shaders, input layout etc. In case of OpenGL, this requires a number of fine-grain state tweaking calls. Diligent Engine keeps track of currently bound states and only calls functions to update these states that have actually changed.
      Binding Shader Resources
      Direct3D11 and OpenGL utilize fine-grain resource binding models, where an application binds individual buffers and textures to certain shader or program resource binding slots. Direct3D12 uses a very different approach, where resource descriptors are grouped into tables, and an application can bind all resources in the table at once by setting the table in the command list. Resource binding model in Diligent Engine is designed to leverage this new method. It introduces a new object called shader resource binding that encapsulates all resource bindings required for all shaders in a certain pipeline state. It also introduces the classification of shader variables based on the frequency of expected change that helps the engine group them into tables under the hood:
      Static variables (SHADER_VARIABLE_TYPE_STATIC) are variables that are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera attributes or global light attributes constant buffers. Mutable variables (SHADER_VARIABLE_TYPE_MUTABLE) define resources that are expected to change on a per-material frequency. Examples may include diffuse textures, normal maps etc. Dynamic variables (SHADER_VARIABLE_TYPE_DYNAMIC) are expected to change frequently and randomly. Shader variable type must be specified during shader creation by populating an array of ShaderVariableDesc structures and initializing ShaderCreationAttribs::Desc::VariableDesc and ShaderCreationAttribs::Desc::NumVariables members (see example of shader creation above).
      Static variables cannot be changed once a resource is bound to the variable. They are bound directly to the shader object. For instance, a shadow map texture is not expected to change after it is created, so it can be bound directly to the shader:
      PixelShader->GetShaderVariable( "g_tex2DShadowMap" )->Set( pShadowMapSRV ); Mutable and dynamic variables are bound via a new Shader Resource Binding object (SRB) that is created by the pipeline state (IPipelineState::CreateShaderResourceBinding()):
      m_pPSO->CreateShaderResourceBinding(&m_pSRB); Note that an SRB is only compatible with the pipeline state it was created from. SRB object inherits all static bindings from shaders in the pipeline, but is not allowed to change them.
      Mutable resources can only be set once for every instance of a shader resource binding. Such resources are intended to define specific material properties. For instance, a diffuse texture for a specific material is not expected to change once the material is defined and can be set right after the SRB object has been created:
      m_pSRB->GetVariable(SHADER_TYPE_PIXEL, "tex2DDiffuse")->Set(pDiffuseTexSRV); In some cases it is necessary to bind a new resource to a variable every time a draw command is invoked. Such variables should be labeled as dynamic, which will allow setting them multiple times through the same SRB object:
      m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "cbRandomAttribs")->Set(pRandomAttrsCB); Under the hood, the engine pre-allocates descriptor tables for static and mutable resources when an SRB objcet is created. Space for dynamic resources is dynamically allocated at run time. Static and mutable resources are thus more efficient and should be used whenever possible.
      As you can see, Diligent Engine does not expose low-level details of how resources are bound to shader variables. One reason for this is that these details are very different for various APIs. The other reason is that using low-level binding methods is extremely error-prone: it is very easy to forget to bind some resource, or bind incorrect resource such as bind a buffer to the variable that is in fact a texture, especially during shader development when everything changes fast. Diligent Engine instead relies on shader reflection system to automatically query the list of all shader variables. Grouping variables based on three types mentioned above allows the engine to create optimized layout and take heavy lifting of matching resources to API-specific resource location, register or descriptor in the table.
      This post gives more details about the resource binding model in Diligent Engine.
      Setting the Pipeline State and Committing Shader Resources
      Before any draw or compute command can be invoked, the pipeline state needs to be bound to the context:
      m_pContext->SetPipelineState(m_pPSO); Under the hood, the engine sets the internal PSO object in the command list or calls all the required native API functions to properly configure all pipeline stages.
      The next step is to bind all required shader resources to the GPU pipeline, which is accomplished by the IDeviceContext::CommitShaderResources() method:
      m_pContext->CommitShaderResources(m_pSRB, COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES);
      The method takes a pointer to the shader resource binding object and makes all resources the object holds available to the shaders. In the case of D3D12, this only requires setting the appropriate descriptor tables in the command list. For older APIs, this typically requires setting all resources individually.
      Next-generation APIs require the application to track the state of every resource and explicitly inform the system about all state transitions. For instance, if a texture was previously used as a render target and the next draw command is going to use it as a shader resource, a transition barrier needs to be executed. Diligent Engine does the heavy lifting of state tracking. When the CommitShaderResources() method is called with the COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES flag, the engine commits resources and transitions them to the correct states at the same time. Note that transitioning resources does introduce some overhead. The engine tracks the state of every resource and will not issue a barrier if the state is already correct, but checking the resource state is itself an overhead that can sometimes be avoided. The engine provides the IDeviceContext::TransitionShaderResources() method, which only transitions resources:
      m_pContext->TransitionShaderResources(m_pPSO, m_pSRB);
      In some scenarios it is more efficient to transition resources once and then only commit them.
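      The idea of issuing a barrier only when the tracked state differs from the required one can be sketched with a toy tracker. Everything here is hypothetical (StateTrackerSketch, ResourceStateSketch, integer resource ids); it models the bookkeeping only and is not how the engine is actually implemented.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical resource states for illustration only.
enum class ResourceStateSketch { Undefined, RenderTarget, ShaderResource };

class StateTrackerSketch {
public:
    // Moves a resource to the required state.
    // Returns true if a barrier had to be issued, false if the
    // resource was already in that state and nothing was done.
    bool Transition(int resourceId, ResourceStateSketch required) {
        assert(required != ResourceStateSketch::Undefined);
        auto it = m_States.find(resourceId);
        if (it != m_States.end() && it->second == required)
            return false; // state already correct: skip the barrier
        m_States[resourceId] = required;
        ++m_NumBarriers;  // stands in for recording a native barrier
        return true;
    }

    int NumBarriers() const { return m_NumBarriers; }

private:
    std::unordered_map<int, ResourceStateSketch> m_States;
    int m_NumBarriers = 0;
};
```

      Even when no barrier is issued, the lookup itself still costs something, which is the overhead that transitioning once and then only committing is meant to avoid.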
      Invoking Draw Command
      The final step is to set states that are not part of the PSO, such as render targets and vertex and index buffers. Diligent Engine uses a Direct3D11-style API that is translated to other native API calls under the hood:
      ITextureView *pRTVs[] = {m_pRTV};
      m_pContext->SetRenderTargets(_countof(pRTVs), pRTVs, m_pDSV);
      // Clear render target and depth buffer
      const float zero[4] = {0, 0, 0, 0};
      m_pContext->ClearRenderTarget(nullptr, zero);
      m_pContext->ClearDepthStencil(nullptr, CLEAR_DEPTH_FLAG, 1.f);
      // Set vertex and index buffers
      IBuffer *buffer[] = {m_pVertexBuffer};
      Uint32 offsets[] = {0};
      Uint32 strides[] = {sizeof(MyVertex)};
      m_pContext->SetVertexBuffers(0, 1, buffer, strides, offsets, SET_VERTEX_BUFFERS_FLAG_RESET);
      m_pContext->SetIndexBuffer(m_pIndexBuffer, 0);
      Different native APIs use different sets of functions to execute draw commands, depending on the command details (whether the command is indexed, instanced or both, what offsets in the source buffers are used, etc.). For instance, there are 5 draw commands in Direct3D11 and more than 9 commands in OpenGL, with something like glDrawElementsInstancedBaseVertexBaseInstance not uncommon. Diligent Engine hides all these details behind a single IDeviceContext::Draw() method that takes a DrawAttribs structure as an argument. The structure members define all attributes required to perform the command (primitive topology, number of vertices or indices, whether the draw call is indexed, instanced or indirect, etc.). For example:
      DrawAttribs attrs;
      attrs.IsIndexed = true;
      attrs.IndexType = VT_UINT16;
      attrs.NumIndices = 36;
      attrs.Topology = PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
      pContext->Draw(attrs);
      For compute commands, there is the IDeviceContext::DispatchCompute() method, which takes a DispatchComputeAttribs structure that defines the compute grid dimensions.
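      To give a feel for what a single Draw() entry point has to hide, the sketch below maps a set of draw attributes to the name of the OpenGL function that would handle them. The OpenGL function names are real, but DrawAttribsSketch, its fields (other than IsIndexed) and SelectGLDrawCall are assumptions made for this illustration and do not reflect the engine's actual dispatch code.

```cpp
#include <cassert>
#include <string>

// Hypothetical attribute set; only IsIndexed mirrors a member shown above.
struct DrawAttribsSketch {
    bool     IsIndexed     = false;
    bool     IsIndirect    = false;
    unsigned NumInstances  = 1;
    unsigned BaseVertex    = 0;
    unsigned FirstInstance = 0;
};

// Returns the name of the OpenGL draw function matching the attributes.
std::string SelectGLDrawCall(const DrawAttribsSketch& a) {
    assert(a.NumInstances >= 1);
    if (a.IsIndirect)
        return a.IsIndexed ? "glDrawElementsIndirect" : "glDrawArraysIndirect";

    const bool instanced = a.NumInstances > 1;
    if (!a.IsIndexed) {
        if (a.FirstInstance != 0)
            return "glDrawArraysInstancedBaseInstance";
        return instanced ? "glDrawArraysInstanced" : "glDrawArrays";
    }
    if (a.FirstInstance != 0)
        return a.BaseVertex != 0 ? "glDrawElementsInstancedBaseVertexBaseInstance"
                                 : "glDrawElementsInstancedBaseInstance";
    if (instanced)
        return a.BaseVertex != 0 ? "glDrawElementsInstancedBaseVertex"
                                 : "glDrawElementsInstanced";
    return a.BaseVertex != 0 ? "glDrawElementsBaseVertex" : "glDrawElements";
}
```

      With this mapping, a plain indexed draw resolves to glDrawElements, while adding instancing, a base vertex and a first-instance offset ends up at glDrawElementsInstancedBaseVertexBaseInstance.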
      Source Code
      The full engine source code is available on GitHub and is free to use. The repository contains two samples, an asteroids performance benchmark, and an example Unity project that uses Diligent Engine in a native plugin.
      AntTweakBar sample is Diligent Engine’s “Hello World” example.

       
      Atmospheric scattering sample is a more advanced example. It demonstrates how Diligent Engine can be used to implement various rendering tasks: loading textures from files, using complex shaders, rendering to multiple render targets, using compute shaders and unordered access views, etc.

      Asteroids performance benchmark is based on this demo developed by Intel. It renders 50,000 unique textured asteroids and allows comparing performance of Direct3D11 and Direct3D12 implementations. Every asteroid is a combination of one of 1000 unique meshes and one of 10 unique textures.

      Finally, there is an example project that shows how Diligent Engine can be integrated with Unity.

      Future Work
      The engine is under active development. It currently supports Windows desktop, Universal Windows and Android platforms. Direct3D11, Direct3D12, OpenGL/GLES backends are now feature complete. Vulkan backend is coming next, and support for more platforms is planned.
    • By michaeldodis
      I've started building a small library that can render a pie menu GUI in legacy OpenGL, and I'm planning to add some traditional elements as well.
      Its interface is similar to something you'd see in IMGUI. It's written in C.
      Early version of the library
      I'd really love to hear anyone's thoughts on this, any suggestions on what features you'd want to see in a library like this? 
      Thanks in advance!
  • Popular Now