
wglChoosePixelFormatARB for MSAA very slow


15 replies to this topic

#1 IceBreaker23   Members   -  Reputation: 601


Posted 23 March 2014 - 10:28 AM

I am using the tutorial from NeHe Productions http://nehe.gamedev.net/tutorial/fullscreen_antialiasing/16008/ on how to enable MSAA on a gl context.

 

It all works fine, but the call to wglChoosePixelFormatARB() is very, very slow.

It usually takes about 2-3 seconds (my PC is a gaming PC, so it should run much faster).

 

Does anyone know why this function is so slow and is there a workaround/solution?



#2 mark ds   Members   -  Reputation: 1088


Posted 23 March 2014 - 12:04 PM

What graphics card are you using, and are the drivers up to date? Also, do you have a small piece of code that reproduces it? I'll happily run it on my system and report the timing.


Edited by mark ds, 23 March 2014 - 05:34 PM.


#3 mhagain   Crossbones+   -  Reputation: 7467


Posted 23 March 2014 - 12:58 PM

If you review the spec for WGL_ARB_pixel_format you'll see that it makes no promises about speed, and notes that the returned list of valid formats is going to be in a device-dependent order.  That tells you that the driver must (1) go through each supported pixel format, (2) test it to see if it meets your criteria (and GL doesn't specify how this test is done), (3) assign a score to it, and (4) sort the list by this score to return them.

 

The key one here is (2), testing whether a format meets your criteria. Because GL doesn't specify what tests are done or how they're done, drivers are perfectly free to do anything up to and including a mode change to see if a format is good. Having a "gaming PC" shouldn't give you an expectation that this process will be fast: a mode change is always going to be slow, and even if it's not a mode change it may be something else expensive (remember, GL doesn't specify, so it's device-dependent). You may also have a lot of formats that match your criteria, and that's a lot of formats that must be tested.
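The four steps above can be modeled in a few lines. This is a deliberately simplified, hypothetical model (the struct fields and the scoring rule are invented, not what any real driver does), but it shows why the cost scales with the number of exported formats and why step (2) is the opaque, potentially expensive one:

```cpp
#include <algorithm>
#include <vector>

// Invented, minimal stand-in for a driver's pixel format record.
struct Format { int depthBits, stencilBits, samples; };

// (1) walk every format, (2) test it against the caller's minimum criteria,
// (3)+(4) score the survivors and sort them. In a real driver, step (2) is
// unspecified and may be arbitrarily expensive.
std::vector<Format> chooseFormats(const std::vector<Format>& all,
                                  int minDepth, int minStencil, int minSamples)
{
    std::vector<Format> matching;
    for (const Format& f : all)
        if (f.depthBits >= minDepth && f.stencilBits >= minStencil &&
            f.samples >= minSamples)
            matching.push_back(f);
    std::sort(matching.begin(), matching.end(),
              [](const Format& a, const Format& b) {
                  // invented score: prefer the format with the least "excess"
                  return a.depthBits + a.samples < b.depthBits + b.samples;
              });
    return matching;
}
```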


It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.


#4 IceBreaker23   Members   -  Reputation: 601


Posted 23 March 2014 - 01:26 PM

What do you mean by "what graphics are you using"?

 

I updated my graphics card driver and it didn't change the timing at all.

 

I have no small demo, unfortunately; it's all incorporated in my engine. I just had a friend test it on his computer, and there, without Visual Studio, it takes about 1-2 seconds (a bit faster than when I run it from Visual Studio).

 

EDIT: Just read the second answer. Is there any workaround for this?


Edited by IceBreaker23, 23 March 2014 - 01:27 PM.


#5 mhagain   Crossbones+   -  Reputation: 7467


Posted 23 March 2014 - 03:43 PM

What do you mean by "what graphics are you using"?

 

I updated my graphics card driver and it didn't change the timing at all.

 

NVIDIA?  AMD?  Intel?  Something else?  And which model?

 

I have no small demo, unfortunately; it's all incorporated in my engine. I just had a friend test it on his computer, and there, without Visual Studio, it takes about 1-2 seconds (a bit faster than when I run it from Visual Studio).

 

EDIT: Just read the second answer. Is there any workaround for this?

 

I'm not aware of one but I'm not aware that one is necessary.  Setting a display mode isn't a performance-critical part of any engine.  If you're changing display modes so often that it becomes one, you have a design problem that you should fix.




#6 IceBreaker23   Members   -  Reputation: 601


Posted 23 March 2014 - 04:08 PM

I have an NVIDIA GeForce GTX 560 Ti.

Well, yes, it's not a performance-critical part of my engine, yet it bugs me to have this delay in there after starting...



#7 mark ds   Members   -  Reputation: 1088


Posted 23 March 2014 - 05:40 PM

Sorry - I meant graphics card. It does seem unusually slow - as mhagain suggested, you're probably picking a pixel format that returns a large number of potential matches.

 

Are you specifying WGL_FULL_ACCELERATION_ARB, WGL_SWAP_EXCHANGE_ARB, WGL_TYPE_RGBA_ARB etc. (via the WGL_ACCELERATION_ARB, WGL_SWAP_METHOD_ARB and WGL_PIXEL_TYPE_ARB attributes)? These should reduce the number of matches.



#8 IceBreaker23   Members   -  Reputation: 601


Posted 24 March 2014 - 03:45 AM

Adding

WGL_SWAP_METHOD_ARB, WGL_SWAP_EXCHANGE_ARB,
WGL_PIXEL_TYPE_ARB, WGL_TYPE_RGBA_ARB,

to the iAttributes array actually makes the whole thing run slower than before.

Here's the full iAttributes array without the 2 additions:
int iAttributes[] =
	{
		WGL_DRAW_TO_WINDOW_ARB,GL_TRUE,
		WGL_SUPPORT_OPENGL_ARB,GL_TRUE,
		WGL_ACCELERATION_ARB,WGL_FULL_ACCELERATION_ARB,
		WGL_COLOR_BITS_ARB,24,
		WGL_ALPHA_BITS_ARB,8,
		WGL_DEPTH_BITS_ARB,16,
		WGL_STENCIL_BITS_ARB,0,
		WGL_DOUBLE_BUFFER_ARB,GL_TRUE,
		WGL_SAMPLE_BUFFERS_ARB,1,
		WGL_SAMPLES_ARB,16,

		0,0
	};


#9 samoth   Crossbones+   -  Reputation: 4532


Posted 24 March 2014 - 05:28 AM

There is no obvious reason why it is so excessively slow, other than the one given by mhagain: the specification makes no promises about speed.

Typically, an implementation will have around 150-200 formats available, so in order to take 2-3 seconds it would have to spend an entire 10-15 milliseconds matching each single format. For comparing a dozen or so values against a list of capabilities, that is... a... lot. Even if you add half a millisecond on top to sort the 10-12 formats that match the minimum required specs.

 

My suspicion is that it's slow not only because the task is complicated, but also because it loads DLLs and fires up the shader compiler, etc. At least on my system, the second call is significantly faster than the first. Also, merely enumerating (without making the driver choose one) formats is slow already. On my system, the driver always chooses the first hit, too. Of course it may look different on a different system.

 

You can make it slightly faster if you use wglGetPixelFormatAttribivARB instead of wglChoosePixelFormatARB to enumerate the formats one by one, look at the caps, and stop as soon as you've found one that you're happy with. It's still quite slow, however.

Caching the format's integer identifier (you would have to re-verify it at the next run, though, in case the hardware or driver gets updated) is something I haven't tried, but it might actually work. Other than that, I'm afraid you'll simply have to live with the fact that it takes a moment to come up.
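That enumerate-and-stop-early approach could look something like the sketch below. The acceptance thresholds are my own illustration, not the thread's; the WGL loop is Windows-only and assumes you already have a current context and have fetched the wglGetPixelFormatAttribivARB pointer:

```cpp
struct Caps { int depthBits, stencilBits, samples; };

// Portable predicate: example thresholds, not a recommendation.
bool acceptable(const Caps& c)
{
    return c.depthBits >= 24 && c.stencilBits >= 8 && c.samples >= 4;
}

#ifdef _WIN32
#include <windows.h>
#include <GL/wglext.h>  // PFNWGLGETPIXELFORMATATTRIBIVARBPROC, WGL_* enums

// Query formats one by one and return the first acceptable index, instead of
// asking the driver to filter, score, and sort all of them.
int firstAcceptableFormat(HDC hdc,
                          PFNWGLGETPIXELFORMATATTRIBIVARBPROC getAttribs)
{
    int count = 0;
    const int countQuery = WGL_NUMBER_PIXEL_FORMATS_ARB;
    if (!getAttribs(hdc, 0, 0, 1, &countQuery, &count))
        return 0;

    const int queries[3] = { WGL_DEPTH_BITS_ARB, WGL_STENCIL_BITS_ARB,
                             WGL_SAMPLES_ARB };
    for (int i = 1; i <= count; ++i) {   // pixel format indices are 1-based
        int values[3] = {0, 0, 0};
        if (!getAttribs(hdc, i, 0, 3, queries, values))
            continue;
        if (acceptable(Caps{values[0], values[1], values[2]}))
            return i;                    // stop at the first hit
    }
    return 0;                            // no acceptable format found
}
#endif
```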


Edited by samoth, 24 March 2014 - 05:29 AM.


#10 mark ds   Members   -  Reputation: 1088


Posted 24 March 2014 - 07:00 AM

My 560Ti, driver 335.23, takes 4.5ms in debug build and 3.4ms in release build using the attributes you posted above.

 

Also, the nNumFormats output of wglChoosePixelFormatARB reports 4 matches.

 

 

 

Edit to add that those timings are after rebooting my machine.


Edited by mark ds, 24 March 2014 - 07:10 AM.


#11 samoth   Crossbones+   -  Reputation: 4532


Posted 24 March 2014 - 08:25 AM

3.4ms in release build
That is the sort of figure I would expect, to be honest! "A few milliseconds at most" is the answer I'd have shouted out if I didn't know otherwise. It really shouldn't take much longer (yet... it does).

 

Though not as horrible as in the OP's case, my machine (incidentally very similar: GTX 650 Ti / 332.21, though it used to be the same on a GT9800 with 177.x drivers) takes clearly noticeable time just to bring up the window.

 

I haven't benchmarked it to the millisecond since there's not much to do about it anyway (it takes what it takes!), but I'd guess it's something between half a second and one second here, just to initialize the context and pop up the window.



#12 mark ds   Members   -  Reputation: 1088


Posted 24 March 2014 - 09:45 AM

 

Though not as horrible as in the OP's case, my machine (incidentally very similar: GTX 650 Ti / 332.21, though it used to be the same on a GT9800 with 177.x drivers) takes clearly noticeable time just to bring up the window.

 

 

For completeness: for me to create a dummy window, initialise glew, destroy the window, create a new window, do the usual GL setup, show the window and call SwapBuffers takes ~400ms from a reboot. That includes some fairly small allocations, too. It takes about 300ms on the second launch. I launched the exe from Explorer, not the IDE.

 

IceBreaker, if you want to post code, I'm happy to try it. Either something else is going on, or maybe your timing is in error?


Edited by mark ds, 24 March 2014 - 09:46 AM.


#13 mhagain   Crossbones+   -  Reputation: 7467


Posted 24 March 2014 - 12:04 PM

Try taking this:

		WGL_DEPTH_BITS_ARB,16,
		WGL_STENCIL_BITS_ARB,0,

And switching to:

		WGL_DEPTH_BITS_ARB,24,
		WGL_STENCIL_BITS_ARB,8,

Or:

		WGL_DEPTH_BITS_ARB,32,
		WGL_STENCIL_BITS_ARB,0,

The reason is that this change will greatly narrow down the number of pixel formats that pass selection. Again, looking at the spec we see the following note:

 

 

Some attribute values must match the pixel format value exactly when the attribute is specified while others specify a minimum criteria, meaning that the pixel format value must meet or exceed the specified value.

 

And in the table following this we see that depth bits and stencil bits are two that must meet or exceed the specified values. In other words, these operate as a "greater than or equal to" option; by specifying D16 we're also going to get all of the D24 formats, the D24S8 formats, the D32 formats, etc.

 

All reasonably modern hardware (i.e. no more than about 8 years old) will support a 24-bit depth buffer with an 8-bit stencil, so this is an absolutely safe option to choose.
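A toy illustration of why this narrows things down (the format list below is invented): because depth and stencil bits are minimum criteria, a D16/S0 request matches every deeper format as well, while a D24/S8 request filters most of them out up front.

```cpp
#include <vector>

struct DS { int depth, stencil; };

// Count formats that meet or exceed the requested depth/stencil --
// the same ">=" rule the WGL_ARB_pixel_format spec describes.
int countMatches(const std::vector<DS>& formats, int minDepth, int minStencil)
{
    int n = 0;
    for (const DS& f : formats)
        if (f.depth >= minDepth && f.stencil >= minStencil)
            ++n;
    return n;
}
```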




#14 mark ds   Members   -  Reputation: 1088


Posted 24 March 2014 - 07:45 PM

Just an idea... Did you create a dummy window to get the function pointers, and then destroy it? And then use a new window to get the pixel formats?



#15 Fiddler   Members   -  Reputation: 746


Posted 26 March 2014 - 03:37 AM

I have encountered this issue several times over the past 8 years while working on OpenTK. It only appears to affect Nvidia cards, and I haven't managed to locate the exact causes. Changing driver versions tends to make this go away.

 

Suggestions:

  • Use apitrace to capture a log of WGL / GL calls on the slow system, and compare that to a fast system. Upload the log somewhere and post a link here - we might be able to uncover something.
  • Call LoadLibrary("opengl32.dll") before invoking any WGL or GDI functions (including creating a window or querying pixel formats). This appears to help.
  • For functions that have both GDI and WGL prototypes, use the GDI one. This requires point 2 above; otherwise your program may crash due to weird historical reasons.
  • Edit: when you create a temporary context to load WGL extensions, keep the temp context alive for the wglChoosePixelFormatARB call. This may or may not have an effect here, but it will certainly fix random crashes on specific Intel drivers.
  • Try flipping the "Threaded Optimizations" option in the Nvidia control panel. Yes, sometimes this does make a difference...
  • If everything else fails, contact Nvidia with a test case that reproduces the issue.
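Points 2 and 4 above, together with the dummy-window pattern mark ds asked about, can be sketched roughly as follows. This is Windows-only and heavily trimmed: there is no error handling, and "DummyClass" is a hypothetical window class assumed to be registered elsewhere. The important details are the LoadLibrary call up front and the temporary context staying current through the wglChoosePixelFormatARB call:

```cpp
#ifdef _WIN32
#include <windows.h>

// Prototype for the extension function fetched at runtime.
typedef BOOL (WINAPI *PFNWGLCHOOSEPIXELFORMATARB)(
    HDC, const int*, const FLOAT*, UINT, int*, UINT*);

int pickFormat(const int* iAttributes)
{
    // Point 2: map opengl32.dll before any WGL/GDI pixel-format work.
    LoadLibraryA("opengl32.dll");

    // Dummy window + basic pixel format + temporary context, solely so that
    // wglGetProcAddress has a current context to work against.
    HWND wnd = CreateWindowA("DummyClass", "", 0, 0, 0, 1, 1,
                             NULL, NULL, NULL, NULL);
    HDC dc = GetDC(wnd);

    PIXELFORMATDESCRIPTOR pfd = {};
    pfd.nSize = sizeof(pfd);
    pfd.nVersion = 1;
    pfd.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd.cColorBits = 32;
    SetPixelFormat(dc, ChoosePixelFormat(dc, &pfd), &pfd);

    HGLRC rc = wglCreateContext(dc);
    wglMakeCurrent(dc, rc);

    PFNWGLCHOOSEPIXELFORMATARB wglChoosePixelFormatARB_ =
        (PFNWGLCHOOSEPIXELFORMATARB)wglGetProcAddress("wglChoosePixelFormatARB");

    int format = 0;
    UINT numFormats = 0;
    // Point 4: the temporary context is still current for this call.
    wglChoosePixelFormatARB_(dc, iAttributes, NULL, 1, &format, &numFormats);

    // Tear the temporary objects down only afterwards.
    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(rc);
    ReleaseDC(wnd, dc);
    DestroyWindow(wnd);
    return numFormats ? format : 0;
}
#endif
```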

I would really love to get to the bottom of this, but reports are so sporadic that it's really hard to pin down the causes.


Edited by Fiddler, 26 March 2014 - 03:44 AM.

[OpenTK: C# OpenGL 4.4, OpenGL ES 3.0 and OpenAL 1.1. Now with Linux/KMS support!]


#16 IceBreaker23   Members   -  Reputation: 601


Posted 26 March 2014 - 01:44 PM

It kinda seems to have fixed itself. It's now much faster, although I didn't change anything on my system... It still takes its time, but it's not that bad anymore.

I tried adding LoadLibrary() but that didn't change much.

Thanks for your help anyway; I think I can live with this half a second at startup.





