OpenGL State Change Benchmark Data?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

15 replies to this topic

#1 Steve132   Members   -  Reputation: 433

Posted 26 September 2006 - 02:23 PM

I haven't been on this forum in a very long time, so I don't know how much is kept up around here, but anyway, I have a relatively advanced question for you all.

It is well understood that OpenGL is a state machine: you issue state changes to manipulate the current render state as geometry passes through. In addition, certain state changes are inherently more expensive than others. For example, it is more expensive to change the currently bound texture than it is to change the current render color. The same goes for changing the clear color vs. turning off the depth buffer, etc.

My point is this: how would I go about finding the approximate cost of each state change operation in OpenGL? Is there a spec with this data? For example, a change to the current color might cost 1x, whereas a change to the texture might cost 30x. Is there a table somewhere giving costs for state change operations? This might be implementation dependent; if so, is there a general table with approximations?

What might be even more helpful is this: can anyone think of a way to collect this data empirically? Perhaps an OpenGL state benchmark program? How would such a program be written? Write me back.
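To make the "1x vs. 30x" idea concrete, here is a minimal sketch of how a relative-cost table could be derived once per-call timings have been measured somehow. The function name `relative_costs` and the map layout are assumptions made for illustration; the numbers themselves would have to come from your own measurements on your own driver and hardware.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: express measured per-call times as multiples of a baseline.
// Keys are operation names chosen by the caller; values are measured nanoseconds.
std::map<std::string, double> relative_costs(
    const std::map<std::string, double>& ns_per_call,
    const std::string& baseline)
{
    const auto base = ns_per_call.find(baseline);
    if (base == ns_per_call.end() || base->second <= 0.0) {
        throw std::invalid_argument("baseline missing or not positive");
    }
    std::map<std::string, double> ratios;
    for (const auto& entry : ns_per_call) {
        // The baseline itself comes out as 1.0; everything else is relative to it.
        ratios[entry.first] = entry.second / base->second;
    }
    return ratios;
}
```

Feeding in, say, 2 ns for a color change and 60 ns for a texture bind with the color change as the baseline would yield 1x and 30x respectively. Those figures are made up for the example and are not measurements.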

#2 Redien   Members   -  Reputation: 122

Posted 26 September 2006 - 08:08 PM

I don't know if there really is a significant difference in speed between different state changes.
(Matrix changes might be slower than other changes.)
But that's just my guess. I don't know.

But anyway, shouldn't gDEBugger or some other OpenGL debugger be able to do this?

#3 T1Oracle   Members   -  Reputation: 100

Posted 26 September 2006 - 09:57 PM

I've read something that suggested that the answer to your question is simply: don't worry about it, states are optimized already and what happens in the hardware is constantly improving.

At least that is how I took this rant on why scene graphs are a waste of time.

Of course that is only one source and I've yet to see a similar sentiment anywhere else.
Programming since 1995.

#4 lc_overlord   Members   -  Reputation: 436

Posted 27 September 2006 - 01:47 AM

State changes can be put into 3 or 4 different groups.

The first group, and by far the slowest, is where the driver has to do a lot of validation work: it has to check whether there actually is data there, whether it is OK, and so on.
This group includes texture changes and similar.

Then there's the middle group. Sure, it takes some processing to work things out, but it's relatively fast, so fast that these could be ignored; just don't go state thrashing with them.
This group includes most other states.

Finally, there's a group that really aren't state changes; they just behave like them. glColor is a good example: in reality it's just a variable whose color value gets sent to the GPU every time you call glVertex, and no real processing is required.

There is also a fourth group that eats a lot of processor time. These are not technically state changes, but they can stall the CPU or the GPU, and that is very bad.
This group includes functions like glFlush, glReadPixels and so on.

Well, this is at least how it looks today, but the upcoming OpenGL 3.0 has a new object model that validates textures at load time rather than every time you want to use them, thus speeding up most state changes.
So in the future, state changes should not be as much of a bother as they used to be.

#5 Steve132   Members   -  Reputation: 433

Posted 27 September 2006 - 08:10 AM

lc_overlord, thanks. About those 4 state groups: is there a practical way to work out which states go where? Or perhaps you could point me to a list or a way of figuring it out for myself. That would be very helpful to me.

#6 lc_overlord   Members   -  Reputation: 436

Posted 27 September 2006 - 10:18 AM

No. Generally only glBindTexture, glAccum, glClear, some of the FBO functions, and most texture operations are in the first group.
glColor, glTexCoord, glNormal, glFogCoord and such are in the third group.
The rest are in the second one. Some in the second group sit in the gray area between groups one and two, though most of this varies between hardware and software implementations, so it's a bit hard to say for sure.
There is a new extension from NVIDIA that can measure time in picoseconds and could perform these tests, but it's only for the latest Quadro cards.

Anyway, the general idea is to keep glBindTexture calls to a minimum and let shaders do all the hard work; then it doesn't matter how fast the state changes are.

#7 Krohm   Crossbones+   -  Reputation: 3417

Posted 27 September 2006 - 07:59 PM

I originally began research of this kind, but I later realized the information was often unreliable and prone to change between drivers (not even hardware!).

One thing I did notice, however, is that the cost of something is in some way related to the workload. I guess this does not surprise you!

Some NVIDIA papers state that the further down the hardware pipeline you go, the less expensive a state change is. I guess this has to do with the fact that there's often more validation on the 'near end' than at the frame buffer output (consider BlendFunc against VertexAttribPointer: it does make sense).

I am also confident that benchmarking this is of little practical use.


#8 lc_overlord   Members   -  Reputation: 436

Posted 27 September 2006 - 10:53 PM

Quote:
Original post by Krohm
Some NVIDIA papers state that the further down the hardware pipeline you go, the less expensive a state change is. I guess this has to do with the fact that there's often more validation on the 'near end' than at the frame buffer output (consider BlendFunc against VertexAttribPointer: it does make sense).


Yeah, that is why the new object model in OpenGL 3.0 will be faster: most of the validation work is already done.

Theoretically, even with today's hardware one could write an API that uses virtual texturing (that is, you don't have to bind textures; you just use them any way you like).



#9 ivarbug   Members   -  Reputation: 100

Posted 27 September 2006 - 11:16 PM

I don't know of any specs with actual OpenGL state change costs. I don't think the relative costs of state changes are really important, because usually there is no alternative to a given state change. It doesn't matter if it's in group 1, group 2 or group 3. We just have to write optimized code and avoid as many unnecessary state changes as we can. That is, we can't replace glBindTexture with something cheaper. I mean, I can't :p
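One common way to avoid unnecessary state changes is to remember the last value set and skip the call when nothing would change. A minimal sketch follows, assuming a hypothetical `TextureBindCache` class; the real bind call (for example a lambda wrapping `glBindTexture(GL_TEXTURE_2D, id)`) is injected as a callable so the sketch itself has no OpenGL dependency.

```cpp
#include <functional>
#include <utility>

// Hypothetical wrapper that skips binds which would not change the current texture.
// `bind` stands in for the real call and is supplied by the caller.
class TextureBindCache {
public:
    explicit TextureBindCache(std::function<void(unsigned)> bind)
        : bind_(std::move(bind)) {}

    // Returns true if the underlying bind was actually issued.
    bool bind(unsigned texture) {
        if (has_current_ && current_ == texture) {
            return false;  // already bound: skip the redundant state change
        }
        bind_(texture);
        current_ = texture;
        has_current_ = true;
        return true;
    }

private:
    std::function<void(unsigned)> bind_;
    unsigned current_ = 0;      // last texture passed to bind_
    bool has_current_ = false;  // false until the first bind is issued
};
```

A cache like this only stays correct if every bind in the program goes through it; any direct call made elsewhere would leave the remembered value stale.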

#10 lc_overlord   Members   -  Reputation: 436

Posted 27 September 2006 - 11:25 PM

No, currently we can't directly replace glBindTexture. I think we will be able to within a year or so, but today we can work around it by using things like multitexturing together with shaders, and by optimizing the usage of texture state changes.

#11 ivarbug   Members   -  Reputation: 100

Posted 28 September 2006 - 08:16 AM

Quote:
Original post by Steve132
How would such a program be written?

I tried calling the same function 100 to 100,000 times in a loop and calling glFinish after the loop. But I don't think glFinish has any effect on glBindTexture. All I found out is that glBindTexture is 15x to 300x slower than glColor, and that OpenGL optimizes many things for you.

#12 Steve132   Members   -  Reputation: 433

Posted 28 September 2006 - 02:39 PM

Quote:
There is a new extension from NVIDIA that can measure time in picoseconds and could perform these tests, but it's only for the latest Quadro cards.


Thanks for all the help! So far this is all very useful. I may even be able to do what I wanted just based on what you said. Out of curiosity, however, what is this extension called? I have a mobile Quadro card in my laptop, and it may be too old, but I would like to look into this. I couldn't find it in the extension registry.

#13 lc_overlord   Members   -  Reputation: 436

Posted 29 September 2006 - 01:09 AM

It's called GL_NV_timer_query or GL_EXT_timer_query. It's not in the registry because:
1. The GL_NV_timer_query version is a public beta of sorts; the real one will be called GL_EXT_timer_query.
2. The registry is not known for updating often.

A link to the spec might perhaps be in order.

Oh, and it measures in nanoseconds, not whatever else I might have mentioned. I think the picoseconds figure came from one of the devs letting slip that picosecond resolution will be possible in the future, and that this extension should be able to handle it when that day comes.
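For orientation, a timer query brackets some work between a begin and an end call and then reads back the elapsed GPU time. The sketch below only shows that shape. `TimerQuery` and `gpu_time_ns` are hypothetical names, and the three callables are injected so the code compiles without OpenGL headers; the GL calls named in the comments are what they would presumably wrap under GL_EXT_timer_query, so check the spec before relying on them.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical bundle of the three calls a GPU timer query needs. With
// GL_EXT_timer_query these would presumably wrap:
//   begin     -> glBeginQuery(GL_TIME_ELAPSED_EXT, id)
//   end       -> glEndQuery(GL_TIME_ELAPSED_EXT)
//   result_ns -> glGetQueryObjectui64vEXT(id, GL_QUERY_RESULT, &ns)
struct TimerQuery {
    std::function<void()> begin;
    std::function<void()> end;
    std::function<std::uint64_t()> result_ns;
};

// Run `work` between begin and end, then return the reported time in nanoseconds.
std::uint64_t gpu_time_ns(const TimerQuery& query, const std::function<void()>& work) {
    query.begin();
    work();
    query.end();
    // With a real query, fetching the result is expected to wait until the GPU is done.
    return query.result_ns();
}
```

Because the result is reported by the GPU rather than read from a CPU clock, this kind of measurement should be less affected by the driver deferring work than timing the calls on the CPU side.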

#14 Steve132   Members   -  Reputation: 433

Posted 01 October 2006 - 10:21 AM

Wow, cool! My card actually has this extension! Anyway, thanks so much. Would anyone be interested in a benchmark dataset, or an app to try this on their own card? I intend to write such an app, but it might be nice to have help. Perhaps I can start with an open-source graphics test suite or something. Anyone have any thoughts?

#15 RichardS   Members   -  Reputation: 298

Posted 01 October 2006 - 10:39 AM

If you want to benchmark these things, you need to be aware that practically every major implementation attempts to defer state management until the state is actually needed (i.e., when you actually draw something).

If you just put a whole bunch of glBindTextures in a loop, it will be quite fast, because basically no work is being done. It will actually be glDrawArrays (or glBegin, or whatever) that takes longer to execute.
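Given that deferral, one plausible way to structure a benchmark is to time a loop of "state change + draw" and subtract the time of a loop of "draw only". The following is a rough sketch under that assumption, not a validated methodology. `ns_per_iteration` and `state_change_cost_ns` are hypothetical names, and the `change`, `draw` and `finish` callables stand in for whatever GL calls are being tested (for example lambdas wrapping `glBindTexture`, `glDrawArrays` and `glFinish`).

```cpp
#include <chrono>
#include <functional>

// Average wall-clock nanoseconds per iteration of `body`. `finish` is called once
// after the loop so that queued work is included in the measurement.
// Assumes iterations > 0.
double ns_per_iteration(int iterations,
                        const std::function<void()>& body,
                        const std::function<void()>& finish)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        body();
    }
    finish();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Estimated cost of one state change: (change + draw) time minus draw-only time.
// The result can be noisy, or even negative, for very cheap changes.
double state_change_cost_ns(int iterations,
                            const std::function<void()>& change,
                            const std::function<void()>& draw,
                            const std::function<void()>& finish)
{
    const double draw_only = ns_per_iteration(iterations, draw, finish);
    const double with_change = ns_per_iteration(
        iterations, [&] { change(); draw(); }, finish);
    return with_change - draw_only;
}
```

For the change to be real on every iteration, `change` would need to alternate between at least two values (two textures, say); rebinding the same object each time may be skipped by the driver.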

The following is a rough grouping of state changes from slowest to fastest. Depending on the given implementation, the list may shift around a bit, but this is a reasonable average.

Program binding
FBO binding
Texture binding
Vertex array specification
Buffer binding
glUniform*
Update current vertex state (glColor, glVertexAttrib, etc).
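A grouping like this is typically put to use by sorting draw calls so that the most expensive state changes least often. Below is a minimal sketch of that idea. The `DrawCall` struct and its fields are hypothetical stand-ins for whatever handles a renderer tracks, and the key order simply follows the list above.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw record; the fields are illustrative handles only.
struct DrawCall {
    unsigned program;
    unsigned framebuffer;
    unsigned texture;
    unsigned vertex_buffer;
};

// Order draws so that the costliest state (program first, then FBO, texture,
// vertex buffer) changes as rarely as possible when drawing in sequence.
void sort_by_state_cost(std::vector<DrawCall>& draws) {
    std::stable_sort(draws.begin(), draws.end(),
        [](const DrawCall& a, const DrawCall& b) {
            return std::tie(a.program, a.framebuffer, a.texture, a.vertex_buffer)
                 < std::tie(b.program, b.framebuffer, b.texture, b.vertex_buffer);
        });
}

// Count how many times the program would change when drawing in the given order.
int count_program_switches(const std::vector<DrawCall>& draws) {
    int switches = 0;
    for (std::size_t i = 1; i < draws.size(); ++i) {
        if (draws[i].program != draws[i - 1].program) {
            ++switches;
        }
    }
    return switches;
}
```

With four draws alternating between two programs, sorting reduces the program switches from three to one. Sorting purely by state ignores other ordering constraints, such as blended geometry that needs to be drawn in depth order.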

#16 Steve132   Members   -  Reputation: 433

Posted 03 October 2006 - 12:16 PM

I knew that much about the rendering sequence, but thanks.

As for the grouping, thank you for that. It is very useful.




