Practicality of index buffers

Started by
21 comments, last by L. Spiro 12 years, 7 months ago
I just spent a good 10 minutes waiting for my engine's new OBJ model loader to compile the Stanford rabbit into a VBO and index buffer to use for rendering. Actual parsing of the raw OBJ data took a second or two, while the rest of the time was spent trying to go through every listing in the "faces" section of the file looking for duplicates, and substituting that vertex with an index in the index buffer where appropriate. The final result works perfectly...except that there isn't a single damn difference between the "raw" vertex count in the OBJ and the produced index buffer's size after compilation. I've been looking around at a few model formats recently, and considered rolling my own and I was starting to wonder just HOW often index buffers are useful in the real world. This 10 minute trial I just experienced kind of reinforces my suspicions that index buffers aren't commonly applicable to model data (unless you're dealing with the tutorial-style cube that is typically used to demonstrate their efficiency).

Anyone care to shed some light on this for me? Am I wasting my time implementing Index Buffers? Am I (most likely) wasting my time processing these models to try to discover shared vertices? What's the deal here?
Advertisement
I've run Doom 3 through GL Intercept and every single draw call in it is glDrawElements - it's index buffers all the way. Admittedly it is an old game, but it should suffice as one possible real world example.

It might be the case that the model you used was already heavily optimized and as such would not benefit. It might also be the case that you may need to double-check your vertex elimination code ( ;) ), but either way there are plenty of other cases where index buffers are needed and/or useful. Even at quite a mundane level - take the classic example of a particle system; assuming that we're on D3D (or an OpenGL core context) and that we don't want to use point sprites or a geometry shader, an index buffer is the best way of grouping multiple particles into a single draw call. That's just one example; a less contrived one may be GUI elements.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Firstly; particles are a terrible example of why you want an index buffer, more so when done via points and geometery shader based quads.

The reason for using index buffers is down to vertex cache reuse on the hardware side.

If you have a model where a large number of vertices are shared then instead of wasting time re-computing the information you can do it once and pull it from cache (assuming it is still alive); the only way for this information to be passed across is via an index buffer.

Particles do NOT gain any benfit here as there are no shared vertices thus no cache reuse; an index buffer for particle drawing is, more than likely, a waste of time.
(Well, unless your particles are sparcley populating the vertex buffer, in which case you'd need the index buffer to hop around it).

Also, from the OP's post; typically the 'box tutorial' example shows a BAD case for indices because for any hard edges you need a unique vertex at each corner of a cube anyway so you get no reuse.

Chances are the stanford rabbit is already well optimised thus why you didn't see any changes in the vertex data count when importing/converting it.

Index buffers are worth it when you have significant vertex reuse, which is typical of most complex models in any given game scene.
Alright, just wanted to check and make sure I wasn't crazy for what I'm seeing :)

Thanks guys
I just spent a good 10 minutes waiting for my engine's new OBJ model loader ... the time was spent trying to go through every listing in the "faces" section of the file looking for duplicates, and substituting that vertex with an index in the index buffer where appropriate
Don't do this at run-time. The model file is static -- you make it once and ship it to the user -- so don't make the user run an optimization algorithm over that data each time they load the game. Instead, you should run that algorithm once when you're creating the data.

i.e. don't distribute OBJ files with your game. Export to OBJ from your art program, and then convert these OBJ files into an engine-specific format.

[quote name='ChugginWindex' timestamp='1314834462' post='4856063']I just spent a good 10 minutes waiting for my engine's new OBJ model loader ... the time was spent trying to go through every listing in the "faces" section of the file looking for duplicates, and substituting that vertex with an index in the index buffer where appropriate
Don't do this at run-time. The model file is static -- you make it once and ship it to the user -- so don't make the user run an optimization algorithm over that data each time they load the game. Instead, you should run that algorithm once when you're creating the data.

i.e. don't distribute OBJ files with your game. Export to OBJ from your art program, and then convert these OBJ files into an engine-specific format.
[/quote]

That was my plan eventually. This was more of a test-run of my model loading system and since it was fairly simple to code, I thought I'd throw in a quick little check to see if a vertex already existed so that I could do an index substitution if need be. I had no idea it was going to take so long. However even as a proper utility separate from the engine or framework that's still a very long time for one model IMHO which got me thinking about the layout of the vertices in the file and the possibility that perhaps index buffers weren't used as commonly as I was lead to believe (which I'm now hearing is not the case ;) )
Just to be sure I am reading your post right.
You start with X vertices and you end with the same number of indices?
[font="arial, verdana, tahoma, sans-serif"]
there isn't a single damn difference between the "raw" vertex count in the OBJ and the produced index buffer's size after compilation.


[/font][font="arial, verdana, tahoma, sans-serif"]There should’t be. Assuming triangle lists, you should end up with around half the number of vertices[/font][font="arial, verdana, tahoma, sans-serif"], and the same number of indices as there originally was vertices.[/font]
[font="arial, verdana, tahoma, sans-serif"] [/font]
[font="arial, verdana, tahoma, sans-serif"]Also, even though you plan to move this conversion offline, you can speed up the process by using a map to store scanned vertices. That is, for each vertex you check, you should be able to perform a binary search for its duplicate, if any. By your time, I can guess you are just searching from the start of the list every time.[/font]
[font="arial, verdana, tahoma, sans-serif"] [/font]
[font="arial, verdana, tahoma, sans-serif"]Also, since the map will be resized frequently, you actually don’t want to copy the vertices into it. Instead, keep a vector of the copied vertices that only accept vertices at its end, and use the map to sort indices[/font][font="arial, verdana, tahoma, sans-serif"] into that vector. This way the map only contains integers that are fairly trivial to move around as objects are inserted into the map.[/font]
[font="arial, verdana, tahoma, sans-serif"] [/font]
[font="arial, verdana, tahoma, sans-serif"]When you have done this process correctly, you should see a heavy gain in performance over using vertex buffers alone. Remember to chop the unused vertices off the end of the vertex buffer.[/font][font="arial, verdana, tahoma, sans-serif"] [/font]


[font="arial, verdana, tahoma, sans-serif"] [/font][font="arial, verdana, tahoma, sans-serif"] [/font][font="arial, verdana, tahoma, sans-serif"]L. Spiro[/font]

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid


[font="arial, verdana, tahoma, sans-serif"]Also, since the map will be resized frequently, you actually don’t want to copy the vertices into it. Instead, keep a vector of the copied vertices that only accept vertices at its end, and use the map to sort indices[/font][font="arial, verdana, tahoma, sans-serif"] into that vector. This way the map only contains integers that are fairly trivial to move around as objects are inserted into the map.[/font]


I'm not sure I follow your meaning here. I have a vector that holds the unique vertices and only accepts new ones at its end, but seeing as a map stores relationships I don't quite know what you mean by sorting indices into my unique vertices array. The idea is to put in a key and get back the index of the existing vertex (or null/std::end/bad value) as the value, and that mapping will save me time, correct? This makes sense, but what am I keying on? Keying on the vertex structure will give me just as much of a memory footprint as doing it the way I am now (albeit more quickly) as the map will still be storing the vertices as all of those keys, and the alternatives such as keying on the face vertex's string representation in the model ("v/uv/vn" for instance) or some other hashed representation will probably increase access time as well.
The way I have done it is to unpack the vertices into a vertex buffer in triangle-list format. The vertex buffer could be used in this form to render, but of course we want to eliminate duplicates so I add an index buffer.
I believe you also reach this point.


From here I want to search for duplicates but of course I want the search to be binary, so I suggested a map above.
The naive way to implement this would be to store a copy of each vertex in the map itself (vertex = key, value = that vertex’s index in the no-duplicate list). This is naive because the map will constantly be shuffling data around, and the larger the keys the more memory it has to copy/move.


In order to minimize the amount of data needed to move each time the map has a value inserted into it, we want to use references to vertices, rather than vertices themselves, for the keys.

My structure looks like this:

/**
* A unique element in a vertex buffer.
*/
typedef struct LSG_UNIQUE_VB_ELEMENT {
/**
* Pointer to the element.
*/
LSVOID * pvElement;

/**
* Size of the element.
*/
LSUINT32 ui32Size;


// == Operators.
/**
* Less-than operator.
*
* \param _uveOther The object against which to check for being less than.
* \return Returns true if this element is less than the given element.
*/
LSBOOL LSE_CALL operator < ( const LSG_UNIQUE_VB_ELEMENT &_uveOther ) const;

/**
* Equality check.
*
* \param _uveOther The object against which to check for being equal to.
* \return Returns true if this element is equal to the given element.
*/
LSBOOL LSE_CALL operator == ( const LSG_UNIQUE_VB_ELEMENT &_uveOther ) const;
} * LPLSG_UNIQUE_VB_ELEMENT, * const LPCLSG_UNIQUE_VB_ELEMENT;


A simple memcmp() is all that is needed to check if a key is less than or equal to another key.


Notice that I have a pointer to the vertex rather than the vertex itself. This acts as a smaller key and sorts much faster.

Now as you add vertices to your no-duplicates buffer, you can store a pointer to the vertex within that buffer (assuming the buffer has been pre-allocated so that the data within it never moves) as the key and the index of the vertex as the value.

Your original vertex buffer and your no-duplicate vertex buffer can be the same thing.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Okay, that clears some of it up, but I'm still hung up on a couple specifics =/


A simple memcmp() is all that is needed to check if a key is less than or equal to another key.



Position, UV coords, Normals etc are in floating point precision. How could a memcmp() possibly hold up under those circumstances? Event seemingly identical values will be slightly different and therefore have different binary values.

Also, in this particular case I'm dealing with, we aren't talking about vertices that exactly match so much as we are talking about vertices that almost match. For instance in order for me to get smooth shading to work, I need to be able to average the different normals present for the "same" vertex shared across multiple faces/triangles. Otherwise I'm just left with the flat-shading that I have at the moment. A simple memory compare doesn't seem like the answer here as well, especially when the exact layout of the obj file (does it have all three? is it missing normals? is it missing UVs? Both?) varies...

All of these problems can obviously be solved by taking the index buffer generation process offline as we've already discussed, but my curiosity is getting the better of me and I'm starting to wonder if there aren't some key tricks to this entire thing that I'm missing that go beyond simple index buffer construction. Stuff like dynamic vertex attribute specifications as opposed to static ones (I've got a fairly simple dynamic system implemented at the moment, but it's the exact reason I'm having trouble with a lot of this fast comparison stuff) and type identification (are these values being sent to the GPU floats or ints?). Is there some sort of advanced book on best practices somewhere that I can pick up for this? I've got a copy of the superbible and a few others, but none of them do much more than skim the surface on how to apply this well in a decent scope.

This topic is closed to new replies.

Advertisement