Mesh VS VertexBuffer?

If I have a simple shape, say a triangle (one face, three vertices), what would be fastest for me to render: a VertexBuffer holding that triangle's vertices, or a mesh? It is important to note that I want to render several thousand rectangles.
(Assuming D3D9) A mesh is just a vertex and index buffer internally, so they should have the exact same performance.

However, if you're rendering thousands of rectangles and the rectangles are dynamic (they change position/size frequently), you'll probably find a vertex buffer easiest, since you'll want to be updating it frequently. Making thousands of draw calls to draw one or two triangles each is going to destroy your frame rate.
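
Roughly what that looks like in D3D9/C++ (an untested sketch only; the QuadVertex layout, maxQuads/numQuads and the quad-filling loop are placeholders, and a valid IDirect3DDevice9* device is assumed):

// One dynamic vertex buffer holding every quad, drawn with a single call.
struct QuadVertex { float x, y, z; DWORD color; };
#define QUAD_FVF (D3DFVF_XYZ | D3DFVF_DIFFUSE)

IDirect3DVertexBuffer9* vb = NULL;
device->CreateVertexBuffer(maxQuads * 6 * sizeof(QuadVertex),
                           D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY,
                           QUAD_FVF, D3DPOOL_DEFAULT, &vb, NULL);

// Each frame: rewrite the whole buffer, then draw everything at once.
QuadVertex* v = NULL;
vb->Lock(0, 0, (void**)&v, D3DLOCK_DISCARD);
for (int i = 0; i < numQuads; ++i)
{
    // write 6 vertices (2 triangles) for quad i into v[i * 6 + 0 .. 5]
}
vb->Unlock();

device->SetStreamSource(0, vb, 0, sizeof(QuadVertex));
device->SetFVF(QUAD_FVF);
device->DrawPrimitive(D3DPT_TRIANGLELIST, 0, numQuads * 2); // one draw call for all quads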
Thanks! Perfect response!
I think your definition of a "mesh" is a little bit wrong. A mesh is just a collection of vertex data. Vertices, texture coordinates, normals, maybe tangents, maybe weights, indices, etcetera.

VertexBuffers are a specific way to store and render a mesh. You can render that mesh by calling the vertex functions each cycle. In OpenGL that would be
glBegin( GL_POLYGON );
glVertex3f( x, y, z );
...
glEnd();

Or, for larger chunks of data, pass an entire array of vertices or indices in one call:

glDrawElements( indices ... );
glDrawArrays( vertices ... );

In this case, you let the CPU pass coordinates that are computed dynamically or loaded from RAM (the mesh is stored in RAM).

Vertex Buffer Objects (VBOs) can boost speed by putting that mesh structure in your video card's memory instead of in RAM. Video card memory is specialized for this kind of data, and you save bandwidth because the CPU no longer needs to pass all those coordinates every frame. The CPU keeps a handle to your vertex buffer object and uses it to tell the video card to render it. It's the same idea as with textures:
1.- Load image data in RAM from a bitmap for example
2.- Generate a Texture Object on the videocard to store that image
3.- Pass the image data to the texture object
4.- Use it (glBindTexture( handle ... ))
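
To make that analogy concrete, those four steps in OpenGL look roughly like this (a sketch; the image loader is hypothetical and error checking is omitted):

int width, height;
unsigned char* pixels = LoadImageData("image.bmp", &width, &height); // 1. image data in RAM (hypothetical loader)

GLuint texture;
glGenTextures(1, &texture);                    // 2. create a texture object on the video card
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,         // 3. pass the image data to the texture object
             width, height, 0,
             GL_RGB, GL_UNSIGNED_BYTE, pixels);

// later, when rendering:
glBindTexture(GL_TEXTURE_2D, texture);         // 4. use it via its handle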

However, once your mesh is in video memory, it's difficult/relatively slow to alter it. If your mesh shape changes all the time, VBOs are probably not a good idea (unless you can do all the reshaping in a vertex or geometry shader). As with textures, you would need to re-upload the object each time it changes, which is slow.
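
For reference, creating such a VBO looks roughly like this in OpenGL 1.5+ (a sketch; the Vertex type and the vertices array are assumed to exist):

GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);

// GL_STATIC_DRAW: upload once, render many times (mesh never changes).
glBufferData(GL_ARRAY_BUFFER, numVertices * sizeof(Vertex), vertices, GL_STATIC_DRAW);

// If the data does change every frame, hint that at creation time instead...
// glBufferData(GL_ARRAY_BUFFER, numVertices * sizeof(Vertex), NULL, GL_DYNAMIC_DRAW);
// ...and overwrite it with glBufferSubData (still a full re-upload, hence "slow"):
// glBufferSubData(GL_ARRAY_BUFFER, 0, numVertices * sizeof(Vertex), vertices);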



The bigger the mesh (vertex count), the more reason to pass entire arrays, use indices, or even store it in video card memory. But for such small meshes (a few polygons)... I'm not sure what would be the best way, but VBOs are most probably overkill, certainly if the mesh coordinates will change all the time.

However, you have thousands of quads (rectangles). Check if you can make that a FIXED number. If you always use (or cap yourself at), let's say, 10,000 quads, you could:
1.- Make a list of 10,000 quads, all centered at 0,0,0.
2.- Upload this list to video memory: create a VBO of this list.
3.- When rendering, use a vertex shader with a lookup source to position/rotate/scale each quad. Quads that are currently unused can be scaled to 0; they're still there, but invisible.

Just an idea. If the number or shapes of the quads are very random/dynamic, this probably won't work though. You may also want to take a look at particle systems.
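
A rough sketch of building such a fixed quad pool (C++; the vertex layout and names are made up for illustration):

const int MAX_QUADS = 10000;

// Each vertex stores which quad it belongs to in its 4th component, so a
// vertex shader can look up that quad's transform.
struct PoolVertex { float x, y, z, quadIndex; };
PoolVertex* pool = new PoolVertex[MAX_QUADS * 4];

for (int q = 0; q < MAX_QUADS; ++q)
{
    const float corners[4][2] = { {-0.5f,-0.5f}, {0.5f,-0.5f}, {0.5f,0.5f}, {-0.5f,0.5f} };
    for (int c = 0; c < 4; ++c)
    {
        pool[q * 4 + c].x = corners[c][0];   // unit quad centered at the origin
        pool[q * 4 + c].y = corners[c][1];
        pool[q * 4 + c].z = 0.0f;
        pool[q * 4 + c].quadIndex = (float)q; // lookup key for the shader
    }
}

// Upload 'pool' once as a static VBO. Each frame, upload an array of per-quad
// transforms (uniforms or a lookup texture) and let the vertex shader
// position/rotate/scale each quad; scale unused quads to zero.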


Good luck,
Rick


Wow Spek! Great! Thank you!

I think I'll read more about storing stuff in the video card memory. My objects are very dynamic though, and for each frame, they do physics checking.
Quote:Original post by Mathy
I think I'll read more about storing stuff in the video card memory. My objects are very dynamic though, and for each frame, they do physics checking.


You shouldn't be using your vertices directly for physics checking (at least generally).

Ideally you should store a simplified representation of your object on the CPU side, for example a position and a bounding volume, and update and check physics against this.

You then use a generic vertex buffer for a quad, a cube, whatever which you create once and never update, and then use your world matrix (derived from your simplified physics data) to position it without changing the vertices directly.
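
As a sketch of that idea in D3D9/C++ (the Body struct and the shared unit-sized quad/cube buffer are assumptions, not a prescribed design):

// Physics runs against this simple CPU-side representation only.
struct Body
{
    D3DXVECTOR3 position;
    D3DXVECTOR3 halfExtents;   // bounding box used for physics checks
};

void DrawBody(IDirect3DDevice9* device, const Body& body)
{
    // Build a world matrix from the physics data; the vertices never change.
    D3DXMATRIX scale, translate, world;
    D3DXMatrixScaling(&scale, body.halfExtents.x * 2, body.halfExtents.y * 2, body.halfExtents.z * 2);
    D3DXMatrixTranslation(&translate, body.position.x, body.position.y, body.position.z);
    world = scale * translate;

    device->SetTransform(D3DTS_WORLD, &world);
    // device->DrawPrimitive(...) using the one shared quad/cube vertex buffer
}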

For several thousand identical objects you could look into instancing, either hardware or shader instancing, which will greatly reduce the number of draw calls you make. Based on your current questions this may be beyond you at the moment, but I mention it so you know where to look in the future.

Instancing is way too complicated for me. I looked into it, and it seems I have to write some shaders to do it. How would I do that if I want to individually specify the position of each object that should be instanced?
I'm fairly sure you can't use instancing with the fixed-function pipeline (don't quote me on that though), so you would have to write a shader. You'd also have to set up your vertex declaration correctly for instancing, and a couple of other bits.

I'll describe the basic concept though, since I don't want to go into the details of implementation.

Imagine you have a vertex buffer with a single cube. You want to draw this cube 3 times, so you send the following to the GPU:

Cube Buffer
World Matrix 1
Draw Call

Cube Buffer
World Matrix 2
Draw Call

Cube Buffer
World Matrix 3
Draw Call

Your shader would look vaguely like this:

extern uniform float4x4 world;

struct VS_INPUT
{
    float4 Pos : POSITION;
};

float4 vs(VS_INPUT In) : POSITION
{
    return mul(In.Pos, world);
}


With instancing you instead send:

Cube Buffer
World Matrix 1, 2, 3 (in a buffer)
Draw Call

The GPU will then interleave the streams (repeating the first one) based on the parameters you pass it (and your VertexDeclaration), so it sees the following:

Cube Buffer, World Matrix 1, Cube Buffer, World Matrix 2, Cube Buffer, World Matrix 3

With your shader looking something like this:

struct VS_INPUT
{
    float4 Pos : POSITION;
    float4x4 World : TEXCOORD0;   // a full world matrix occupies TEXCOORD0-3
};

float4 vs(VS_INPUT In) : POSITION
{
    return mul(In.Pos, In.World);
}


That's a really poor explanation now I look back on it, but hopefully it helps you at least grasp the basic underlying concept.
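
For reference, the D3D9 stream setup behind that interleaving looks roughly like this in C++ (a sketch only; the buffers, vertex declaration and counts are assumed to exist already):

device->SetVertexDeclaration(instancedDecl);

// Stream 0: the cube geometry, repeated once per instance.
device->SetStreamSource(0, cubeVB, 0, sizeof(CubeVertex));
device->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | numInstances);

// Stream 1: per-instance data (e.g. the world matrix split over 4 TEXCOORDs).
device->SetStreamSource(1, instanceVB, 0, sizeof(InstanceData));
device->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1);

// Hardware instancing requires indexed drawing.
device->SetIndices(cubeIB);
device->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, numCubeVerts, 0, numCubeTris);

// Reset the frequencies afterwards, or later draws will misbehave.
device->SetStreamSourceFreq(0, 1);
device->SetStreamSourceFreq(1, 1);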
Is there a class reference for HLSL? I don't even know what the "mul" function does. I could imagine it multiplies, but you never know.

What you've just shown me there is quite interesting! So basically you pass an array of cubes and an array of "worlds" in matrix form, and then the shader takes care of it, and since HLSL runs on the GPU, it consumes no CPU?
You simply pass a single cube rather than an array of them, which the GPU will use multiple times (as if it was an array) but referencing a different world matrix each time.

Now, the other alternative is called shader instancing, which does pass an array of both and doesn't involve messing around with streams.

Instead you add an index to your cube declaration, and fill up a vertex buffer with a reasonable number of cubes, each with a different index (for a cube you could happily put ~200 in one buffer; for more complex geometry, much less).

I.e.

struct InstancedVertex
{
    public Vector3 Position;
    public float Instance;
}

InstancedVertex[] vertices = new InstancedVertex[MAX_NUM_CUBES * 8]; // 8 vertices per cube

for (int i = 0; i < MAX_NUM_CUBES; i++)
{
    /* I'm sure you can figure out the positions needed for a cube, so I'll omit them */
    vertices[i * 8]     = new InstancedVertex(){ Position = new Vector3(...), Instance = i };
    ...
    vertices[i * 8 + 7] = new InstancedVertex(){ Position = new Vector3(...), Instance = i };
}

// Fill Vertex Buffer with the above


So now you have a vertex buffer with 200 cubes in it, each with a unique ID.

You then fill up an array with your world matrices:

Matrix[] instances = new Matrix[MAX_NUM_CUBES];

// Fill your array each frame. A List is probably better, using ToArray() to pass
// to the shader, but this is only an example.


You then set up a shader like the following:

extern uniform float4x4 world[MAX_NUM_CUBES];

struct VS_INPUT
{
    float4 Pos : POSITION;
    /* I know it's defined as a float3 and a float in the C# part, but that's for clarity;
       the shader can read both together as a float4, assuming you set up your
       VertexDeclaration correctly. */
};

float4 vs(VS_INPUT In) : POSITION
{
    return mul(float4(In.Pos.xyz, 1.0f), world[(int)In.Pos.w]);
}


And pass in your instance array as you would any other shader variable:

Effect.SetValue(instances);


Shader instancing is slower than hardware instancing, requires more code, and you're also limited by the register counts of the shader model you're using. However, it is massively simpler to understand and set up than hardware instancing, since you don't have to muck about with your VertexDeclaration too much and never have to call SetStreamSourceFreq.

