Display List vs VBO?

23 comments, last by python_regious 19 years, 8 months ago
Quote:Original post by Tree Penguin
DL: 120
VA: 115
VBO: 110

This also indicates that you are not transfer limited. You get almost the same performance when using data from VRAM as from normal RAM. Render 10x as much and then post the results. Also make sure you are not fillrate limited.
You should never let your fears become the boundaries of your dreams.
VBOs are surely a nice extension, but their implementation changes from driver to driver, and so does the performance...

VBOs are a good way to go if you want to store your geometry on the card, so you can save a lot in AGP transfers.

i played around with them for some time now... what i experienced: you can gain a lot of performance when you optimize your data for specific vertex caches. caching is everything... also try to order your data as triangle strips, this will also result in a great speedup.
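One way to check whether an index ordering is cache friendly is to count simulated vertex-cache misses per triangle. The sketch below is only a rough model: the function name `acmr` is made up for illustration, and the FIFO policy and the cache size you pass in are assumptions, since real cards differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Average cache miss ratio: simulated vertex-cache misses per triangle
// for an indexed triangle list. Assumes a plain FIFO cache; cache_size
// is a guess you supply, not a value queried from the hardware.
double acmr(const std::vector<unsigned>& indices, std::size_t cache_size) {
    std::deque<unsigned> cache;
    std::size_t misses = 0;
    for (unsigned idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) == cache.end()) {
            ++misses;                 // vertex would have to be transformed again
            cache.push_back(idx);
            if (cache.size() > cache_size) cache.pop_front();  // evict oldest
        }
    }
    const std::size_t triangles = indices.size() / 3;
    return triangles ? static_cast<double>(misses) / triangles : 0.0;
}
```

Lower is better: 3.0 means no vertex is ever reused from the cache, and reordering the indices so that neighbouring triangles share vertices pushes the number down.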

but VBOs often totally mess up the performance when switching to another graphics board, driver etc.

if you have totally static geometry i would recommend (for now) another solution:

use a compiled display list together with standard vertex arrays (with the optimizations mentioned above) and you will have good + stable performance. (i'm really satisfied with it in our project)

------------
speculation: displaylists + VA are handled driver-internally as VBOs. i dunno if this is right, just my idea (due to the nice performance). another plus: you don't have to mess around with vbo extensions and fallback code.
------------

my 2c,

thomas
Ok, thanks for the replies!

Rendering 20 times as much geometry still gave the same results (DL was still a little faster than VBOs). i was fillrate limited; i assume scaling what you draw to about 10x10 pixels fixes that.

VAs inside DLs were just as fast as glVertex and glTexCoord calls inside the DL (i think that must be my gfx card, i will try it on the radeon later this week).

Anyway, loading a VBO is way faster than loading a DL containing the same data, so i'll stick with VBO for certain purposes i guess.

I will look up the NVidia papers and probably some others too.

Thanks for your help everyone.
Quote:Original post by Tree Penguin
VAs inside DLs were just as fast as glVertex and glTexCoord calls inside the DL (i think that must be my gfx card, i will try it on the radeon later this week).


I haven't optimized my engine with DL's yet, so I might be wrong. But I could swear that glVertexPointer and such VA calls aren't compiled into a DL.
Quote:Original post by okonomiyaki
Quote:Original post by Tree Penguin
VAs inside DLs were just as fast as glVertex and glTexCoord calls inside the DL (i think that must be my gfx card, i will try it on the radeon later this week).


I haven't optimized my engine with DL's yet, so I might be wrong. But I could swear that glVertexPointer and such VA calls aren't compiled into a DL.


I tried clearing the data (i set every value to 0.0f, to make sure the driver can't use it anymore) and deleting it after compiling the display list, and it still works fine, so either the driver made a copy of the data in system memory (i don't think so) or it's placed in VRAM.
Quote:
I tried clearing the data (i set every value to 0.0f, to make sure the driver can't use it anymore) and deleting it after compiling the display list, and it still works fine, so either the driver made a copy of the data in system memory (i don't think so) or it's placed in VRAM.


You may be right, like I said, I haven't played around with DL's extensively yet. It's strange how that still worked. This is straight from the red book:

Quote:
Certain commands, when called while compiling a display list, are not compiled
into the display list but are executed immediately. These are: GenLists,
DeleteLists, FeedbackBuffer, SelectBuffer, RenderMode, ColorPointer,
FogCoordPointer, EdgeFlagPointer, IndexPointer, NormalPointer,
TexCoordPointer, SecondaryColorPointer, VertexPointer, ClientActiveTexture,
InterleavedArrays, EnableClientState, DisableClientState, PushClientAttrib,
PopClientAttrib, ReadPixels, PixelStore, GenTextures, DeleteTextures,
AreTexturesResident, GenQueries, DeleteQueries, BindBuffer, DeleteBuffers,
GenBuffers, BufferData, BufferSubData, MapBuffer, UnmapBuffer, Flush, Finish,
as well as all of the Get and Is commands (see Chapter 6).


I would like to clear this up so that I can understand how and where to implement DL's.
Strange... i fear it's driver specific. If so, that sucks: it would mean that to get optimal performance you have to test every possible way and see what's fastest, as it might save you 50%.
Quote:I haven't optimized my engine with DL's yet, so I might be wrong. But I could swear that glVertexPointer and such VA calls aren't compiled into a DL.

The data is compiled in when you draw the VA using glDrawArrays or another "draw" call. At that point the data is copied, and the DL will not use the pointer after compilation.
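To make the copy-at-compile behaviour concrete, here is a toy model of it in plain C++. None of this is the OpenGL API: `ToyContext` and its member names are made-up stand-ins, and `count` is in floats just to keep the sketch short. It only mimics the semantics described above, where the pointer call runs immediately and the draw call copies the data into the list.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: these are NOT OpenGL calls, just hypothetical stand-ins.
struct ToyContext {
    const float* vertex_pointer = nullptr;    // client state (cf. glVertexPointer)
    std::vector<float>* recording = nullptr;  // list being compiled, if any

    // Executed immediately and never recorded, like the pointer calls.
    void set_vertex_pointer(const float* p) { vertex_pointer = p; }

    void begin_list(std::vector<float>& list) { list.clear(); recording = &list; }
    void end_list() { recording = nullptr; }

    // Like a draw call during compilation: dereference the pointer now
    // and copy the referenced floats into the list.
    void draw_arrays(std::size_t first, std::size_t count) {
        if (recording && vertex_pointer) {
            recording->insert(recording->end(),
                              vertex_pointer + first,
                              vertex_pointer + first + count);
        }
    }
};
```

In this model, zeroing or freeing the source array after `end_list()` leaves the list contents untouched, which matches what Tree Penguin observed when clearing the data after compiling the real display list.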

In theory DLs could be faster than VBOs because the driver can do whatever optimizations it wants at compile time. It really depends on how good the drivers are.

Quote:
The data is compiled in when you draw the VA using glDrawArrays or another "draw" call. At that point the data is copied, and the DL will not use the pointer after compilation.


Ah, that makes sense. Cool. So I can upload data without VBO's.
I think I'll implement VBO's and vertex arrays compiled into display lists, and let the user choose which (VBO as default, because I still have more faith in them). But if the user finds better performance with the display list path, well, I don't want to stop them from choosing it.

If you delete a compiled DL, does that mean that it deletes all the copied data too?
Yes, all data is compiled into the display list, so when you delete the DL the data is deleted too.

I think letting a gamer decide between VBOs and VAs in DLs would be unwise; most of them won't even know what they are. I think you should check at startup which one is fastest and use that.
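A startup check like that could look roughly like the sketch below. The helper names `measure_ms` and `pick_fastest` are hypothetical, and the render callbacks are whatever your engine uses to draw a test scene through each path; nothing here is tied to a particular driver.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Time `frames` calls of a render function, in milliseconds.
// With real OpenGL, finish pending work (glFinish) inside `render`
// or before reading the clock, otherwise you only time command submission.
double measure_ms(const std::function<void()>& render, int frames) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i) render();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Return the name of the path with the lowest measured time,
// or an empty string if no timings were given.
std::string pick_fastest(
        const std::vector<std::pair<std::string, double>>& timings) {
    const std::pair<std::string, double>* best = nullptr;
    for (const auto& t : timings) {
        if (!best || t.second < best->second) best = &t;
    }
    return best ? best->first : std::string();
}
```

You would call `measure_ms` once per path (DL, VA, VBO) on a representative chunk of geometry, hand the results to `pick_fastest`, and store the winner. Keeping a manual override in a config file still covers the case where the benchmark guesses wrong.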

