Wartime

VertexBuffer performance issue. Idea for a strategy?

18 posts in this topic

Hi there,

I'm new to this forum.
At our university we have to program a game with DirectX9.

Some other students and I want to program a Minecraft-like game, but without unlimited terrain (don't worry ;) ).
Now we have a problem with our performance.

Our strategy is to use chunks of 16^3 blocks. We go through all the blocks and check whether each one has a neighbour above, in front, and so on.
If it has one, we don't put the vertices and indices of that side of the cube into the buffer. This works really quickly.
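
The neighbour test described above can be sketched without any Direct3D calls. This is only a minimal illustration: the names `Chunk`, `isSolid` and `countVisibleFaces` are made up for the example, and it assumes (as in the thread) that a block value of 0 means "empty" and that anything outside the chunk counts as empty.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kChunkSize = 16;  // blocks per chunk edge, as in the post

// Hypothetical chunk storage: 0 means "no block", anything else is a block type.
using Chunk = std::array<std::array<std::array<std::uint8_t, kChunkSize>, kChunkSize>, kChunkSize>;

// True if (x, y, z) lies inside the chunk and holds a block.
inline bool isSolid(const Chunk& c, int x, int y, int z) {
    if (x < 0 || y < 0 || z < 0 || x >= kChunkSize || y >= kChunkSize || z >= kChunkSize)
        return false;  // outside the chunk counts as empty, so border faces are kept
    return c[x][y][z] != 0;
}

// Counts the cube faces that would be written into the vertex/index buffers:
// a face is emitted only if the neighbour on that side is empty.
inline int countVisibleFaces(const Chunk& c) {
    static const int kDirs[6][3] = {{-1, 0, 0}, {1, 0, 0}, {0, -1, 0},
                                    {0, 1, 0},  {0, 0, -1}, {0, 0, 1}};
    int faces = 0;
    for (int x = 0; x < kChunkSize; ++x)
        for (int y = 0; y < kChunkSize; ++y)
            for (int z = 0; z < kChunkSize; ++z) {
                if (!isSolid(c, x, y, z)) continue;
                for (const auto& d : kDirs)
                    if (!isSolid(c, x + d[0], y + d[1], z + d[2])) ++faces;
            }
    assert(faces <= 6 * kChunkSize * kChunkSize * kChunkSize);  // sanity bound
    return faces;
}
```

Each visible face needs 4 vertices and 6 indices, so the result also tells you how large the buffers for a chunk really have to be.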

Now we made a class for a chunk. In this class we create the buffers and put in the vertices and indices and save them in a std::vector.

On rendering we fill the buffers with memcpy and draw the primitives.

If I try to draw 4 chunks, everything works fine at 60 fps. But if I try to draw more chunks (e.g. 64), the performance drops to 8 fps.

I've added my source-code and wanted to ask for a strategy to improve the performance.

Source: [attachment=8313:Kubos.zip]

I hope you understand me (I'm German and my English isn't very good :D )

Not directly.

I remove the faces between two blocks, not between chunks.

Another strategy I use (it's not in the source) is to calculate the camera's view direction and compare it with the normals of the chunk's faces.
That way I only fill in vertices that face the camera and not the ones behind, but it doesn't improve the performance.

I don't have time right now to look at your code, and my DX9 knowledge is limited.

But if I understand you correctly, you copy all your vertex data from the CPU to the GPU every frame. If that's true, your performance will suffer. Instead, you could use dynamic vertex buffers. The catch comes when you erase a random block: then you have to copy the entire chunk back to the GPU.

Also, vertex processing is very fast on the GPU, so don't focus too much on culling per vertex; cull per chunk only.

The most important thing is to communicate with your GPU as little as you can.

Ok, thanks.

How does a dynamic vertex buffer work, and how can I use it?

Is there an example, or can anybody post some example code?

Thanks

Why don't you first try copying the buffers only once and see how it performs? The scene will be static and there won't be any culling, of course. But if it runs fast, then we know something about your bottleneck.

With a dynamic buffer you only send the GPU the data that you change; the drawback is less flexibility when you modify the buffer. Search Google for an in-depth explanation.

I've found a bottleneck in my code.
I was calling SetTexture for every chunk.
Now I'm calling it once and the performance is better.

I've searched for "Dynamic Vertex Buffers", but I don't understand them.

[quote name='jischneider' timestamp='1334770261' post='4932529']
Why don't you first try copying the buffers only once and see how it performs? The scene will be static and there won't be any culling, of course. But if it runs fast, then we know something about your bottleneck.
[/quote]

If I fill the buffers once and then just draw the primitives, the program runs at 60 fps.
I think you're right that the bottleneck is copying the std::vectors into the buffers.

Have you got any idea how to fix the performance issue?

[quote name='Wartime' timestamp='1334781532' post='4932573']
I've found a bottleneck in my code.
I was calling SetTexture for every chunk.
Now I'm calling it once and the performance is better.
[/quote]

[quote name='Wartime' timestamp='1334782411' post='4932580']
If I fill the buffers once and then just draw the primitives, the program runs at 60 fps.
I think you're right that the bottleneck is copying the std::vectors into the buffers.

Have you got any idea how to fix the performance issue?
[/quote]

Both problems seem to be related to CPU-GPU communication.

Just copy the buffers when you make modifications, and only copy the chunk being altered.
Therefore:
Load method: create a set of chunks.
Update method: if the player adds or removes a block, rebuild the affected chunk.
Render method: just render the buffers.

If you need even more performance, you can improve the update method with dynamic buffers.
Vertex buffers are arrays of information stored in GPU memory. The problem is that accessing this memory is costly (for several reasons), so you should communicate as little as possible. Dynamic buffers are like regular vertex buffers, but they are meant to be altered by the application. You are still communicating between CPU and GPU, but dynamic buffers let you update individual vertices, so less communication is needed.
One more thing: is the system freeing the memory used by the previous buffers? Like I said, I don't know much about DX9 commands, so don't ask me how to check that.
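
The load/update/render split above could look roughly like this. It is only a sketch under stated assumptions: `Vertex`, `FakeGpuBuffer` and `ChunkMesh` are hypothetical names, and the "GPU buffer" is a stand-in that counts uploads instead of calling Lock, memcpy and Unlock on a real Direct3D buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Vertex { float x, y, z, tu, tv; };  // hypothetical vertex layout

// Stand-in for a GPU vertex buffer: records uploads instead of talking to D3D.
struct FakeGpuBuffer {
    std::vector<Vertex> data;
    int uploads = 0;
    void upload(const std::vector<Vertex>& src) { data = src; ++uploads; }
};

class ChunkMesh {
public:
    // Load/update: store the new mesh in system memory and mark it as changed.
    void rebuild(std::vector<Vertex> verts) {
        cpuVerts_ = std::move(verts);
        dirty_ = true;
    }

    // Render: copy to the "GPU" only if something changed, then draw.
    // Returns the number of vertices that would be drawn.
    std::size_t render() {
        if (dirty_) {
            gpu_.upload(cpuVerts_);
            dirty_ = false;
        }
        return gpu_.data.size();
    }

    int uploads() const { return gpu_.uploads; }

private:
    std::vector<Vertex> cpuVerts_;
    FakeGpuBuffer gpu_;
    bool dirty_ = false;
};
```

Rendering the same chunk for many frames then costs one upload in total, and each block edit costs one more.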

Look for the "Performance Optimizations" article in your DXSDK; there's a section on "Using Dynamic Vertex and Index Buffers" that explains how this is done.

Personally I think your std::vector is contributing to your slowdown. Yes, I know the whole "don't use raw pointers/arrays in C++" thing, but dynamic vertex buffers are not intended to be used in this manner, so that part of your code could use some reworking.

The general usage is to Lock the buffer before you do anything. That will give you a pointer, and then [i]you write your data directly into that pointer[/i], following which you unlock. No std::vector, just use the pointer directly. This pattern will avoid any intermediate storage, avoid memory copies, avoid potential runtime memory allocations, and run faster as a result.

For optimal dynamic vertex buffer performance you should ensure that it's created in D3DPOOL_DEFAULT and has usage D3DUSAGE_DYNAMIC and D3DUSAGE_WRITEONLY.

When filling it make sure that you only append to the buffer. So you have a counter starting at 0, Lock from an offset of this counter * vertexsize and size of numverts * vertexsize with D3DLOCK_NOOVERWRITE. When you unlock add numverts to the counter.

If there is no room left in the buffer for your data you will instead Lock with D3DLOCK_DISCARD and offset and size 0, resetting the counter to 0.

Try to keep the number of Lock/Unlock pairs per-frame as low as possible. You should be able to know the number of verts you'll require beforehand, and Lock as much of the buffer as possible.

That should give you optimal performance with a dynamic vertex buffer, and rule that out as a possible cause of slowdowns.
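
The append-then-discard bookkeeping described above can be sketched as plain C++ without d3d9.h. Everything here is an assumption made for illustration: `LockMode` stands in for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD, and `DynamicBufferCursor` is a made-up helper that only computes the offset, size and flag you would pass to Lock.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3DLOCK_NOOVERWRITE / D3DLOCK_DISCARD so this compiles anywhere.
enum class LockMode { NoOverwrite, Discard };

struct LockRange {
    std::size_t offsetBytes;  // OffsetToLock
    std::size_t sizeBytes;    // SizeToLock (0 means "whole buffer")
    LockMode mode;
};

class DynamicBufferCursor {
public:
    DynamicBufferCursor(std::size_t capacityVerts, std::size_t vertexSize)
        : capacity_(capacityVerts), vertexSize_(vertexSize) {}

    // Decides where the next numVerts vertices go and which lock flag to use.
    LockRange reserve(std::size_t numVerts) {
        assert(numVerts <= capacity_ && "batch is larger than the whole buffer");
        if (used_ + numVerts > capacity_) {
            // No room left: discard the old contents and start again at 0.
            used_ = numVerts;
            return LockRange{0, 0, LockMode::Discard};
        }
        // Enough room: append behind the data that is already in use.
        LockRange r{used_ * vertexSize_, numVerts * vertexSize_, LockMode::NoOverwrite};
        used_ += numVerts;  // in real code, advance the counter after Unlock
        return r;
    }

private:
    std::size_t capacity_;
    std::size_t vertexSize_;
    std::size_t used_ = 0;  // vertices written since the last discard
};
```

With a 100-vertex buffer and 20-byte vertices, reserving 60 and then 30 vertices appends at byte offsets 0 and 1200; a further request for 20 no longer fits, so it comes back as a discard at offset 0.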

Here is my code to create a chunk:

[CODE]
void WorldChunk::createChunk()
{
vert_count = 0;
index_count = 0;
index_number = 0;
CUSTOMVERTEX* Vertices;
int* Indices;
cdevice->CreateVertexBuffer( 24 * 16 * 16 * 16 * sizeof( CUSTOMVERTEX ), D3DUSAGE_WRITEONLY, D3DFVF_CUSTOMVERTEX, D3DPOOL_DEFAULT, &VB, NULL );
cdevice->CreateIndexBuffer(36 * 16 * 16 * 16 *sizeof(int),D3DUSAGE_WRITEONLY,D3DFMT_INDEX32,D3DPOOL_DEFAULT,&IB,NULL);
// D3DLOCK_DISCARD is only valid for buffers created with D3DUSAGE_DYNAMIC, so lock without flags here.
VB->Lock( 0, 0, ( void** )&Vertices, 0 );
IB->Lock( 0, 0, ( void** )&Indices, 0 );
for(int x = 0; x < 16; x++)
{
for(int y = 0; y < 16; y++)
{
for(int z = 0; z < 16; z++)
{
// If there is no block here, we don't draw anything...
if(chunk[x][y][z] == 0)
continue;
block_type = chunk[x][y][z];
// Are we at the left edge? Then there is no left neighbour; otherwise we fetch it from the chunk array.
// The same applies to all other directions (I don't feel like repeating this for every check ;-) )
if(x > 0)
{
testblock = chunk[x-1][y][z];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x,y,z+1);
Vertices[vert_count].tu = 0.0f+((float)(block_type-1))*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x,y+1,z+1);
Vertices[vert_count+1].tu = 0.0f+((float)(block_type-1))*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x,y,z);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x,y+1,z);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}

if(x < 16-1)
{
testblock = chunk[x+1][y][z];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x+1,y,z+1);
Vertices[vert_count].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x+1,y+1,z+1);
Vertices[vert_count+1].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x+1,y,z);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x+1,y+1,z);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}
if(y > 0)
{
testblock = chunk[x][y-1][z];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x,y,z);
Vertices[vert_count].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x,y,z+1);
Vertices[vert_count+1].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x+1,y,z);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x+1,y,z+1);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}
if(y < 16-1)
{
testblock = chunk[x][y+1][z];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x,y+1,z);
Vertices[vert_count].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x,y+1,z+1);
Vertices[vert_count+1].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x+1,y+1,z);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x+1,y+1,z+1);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}
if(z > 0)
{
testblock = chunk[x][y][z-1];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x,y,z);
Vertices[vert_count].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x,y+1,z);
Vertices[vert_count+1].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x+1,y,z);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x+1,y+1,z);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}
if(z < 16-1)
{
testblock = chunk[x][y][z+1];
}else{
testblock = 0;
}
if(testblock == 0)
{
Vertices[vert_count].position = D3DXVECTOR3(x,y,z+1);
Vertices[vert_count].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count].tv = 1.0f;
ver.push_back(Vertices[vert_count]);
Indices[index_count] = index_number;
ind.push_back(Indices[index_count]);
index_count++;

Vertices[vert_count+1].position = D3DXVECTOR3(x,y+1,z+1);
Vertices[vert_count+1].tu = 0.0f+(block_type-1)*0.25f;
Vertices[vert_count+1].tv = 0.0f;
ver.push_back(Vertices[vert_count+1]);
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+2].position = D3DXVECTOR3(x+1,y,z+1);
Vertices[vert_count+2].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+2].tv = 1.0f;
ver.push_back(Vertices[vert_count+2]);
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+2;
ind.push_back(Indices[index_count]);
index_count++;
Indices[index_count] = index_number+1;
ind.push_back(Indices[index_count]);
index_count++;
Vertices[vert_count+3].position = D3DXVECTOR3(x+1,y+1,z+1);
Vertices[vert_count+3].tu = 0.25f+(block_type-1)*0.25f;
Vertices[vert_count+3].tv = 0.0f;
ver.push_back(Vertices[vert_count+3]);
Indices[index_count] = index_number+3;
ind.push_back(Indices[index_count]);
index_count++;
index_number += 4;
vert_count += 4; // Advance the counter by 4, because this face added 4 vertices (and 6 indices)
}
}
}
}
//****************************************************************************************************************
IB->Unlock();
VB->Unlock();
}
[/CODE]

Once it's created, I use the vectors to "quick fill" the buffers in the next frame (the vertices don't change).

[CODE]
void WorldChunk::QuickFill()
{
    // Note: this runs every frame and creates a new pair of buffers each time;
    // the previous VB and IB are not released here.
    void* vv = NULL;
    void* ii = NULL;
    cdevice->CreateVertexBuffer( ver.size()*sizeof(CUSTOMVERTEX), D3DUSAGE_DYNAMIC, D3DFVF_CUSTOMVERTEX, D3DPOOL_DEFAULT, &VB, NULL );
    cdevice->CreateIndexBuffer( ind.size()*sizeof(int), D3DUSAGE_DYNAMIC, D3DFMT_INDEX32, D3DPOOL_DEFAULT, &IB, NULL );

    // A size of 0 locks the whole buffer. The size posted originally,
    // (ver.size()+10) elements, was larger than the buffer itself.
    VB->Lock( 0, 0, (void**)&vv, D3DLOCK_DISCARD );
    memcpy( vv, &ver[0], ver.size()*sizeof(CUSTOMVERTEX) );
    VB->Unlock();

    IB->Lock( 0, 0, (void**)&ii, D3DLOCK_DISCARD );
    memcpy( ii, &ind[0], ind.size()*sizeof(int) );
    IB->Unlock();
}
[/CODE]
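
For reference, the worst-case sizes requested by the CreateVertexBuffer and CreateIndexBuffer calls in createChunk can be worked out as follows. This is plain arithmetic under one assumption: CUSTOMVERTEX is taken to be a tightly packed position plus one texture coordinate pair (20 bytes), which is what the code suggests but which the post does not show. `CustomVertex` and the function names are made up for the example.

```cpp
#include <cstdint>

// Assumed layout of CUSTOMVERTEX: a position and one texture coordinate pair.
struct CustomVertex {
    float x, y, z;
    float tu, tv;
};
static_assert(sizeof(CustomVertex) == 20, "sketch assumes a tightly packed 20-byte vertex");

constexpr std::uint64_t kBlocksPerChunk  = 16 * 16 * 16;  // 4096 blocks
constexpr std::uint64_t kVertsPerBlock   = 24;            // 6 faces * 4 vertices
constexpr std::uint64_t kIndicesPerBlock = 36;            // 6 faces * 6 indices

// Bytes requested for one chunk's vertex buffer (24 * 16^3 * sizeof(vertex)).
constexpr std::uint64_t vertexBufferBytes() {
    return kVertsPerBlock * kBlocksPerChunk * sizeof(CustomVertex);
}

// Bytes requested for one chunk's 32-bit index buffer (36 * 16^3 * 4).
constexpr std::uint64_t indexBufferBytes() {
    return kIndicesPerBlock * kBlocksPerChunk * sizeof(std::uint32_t);
}

// Total for a number of chunks that each hold worst-case buffers.
constexpr std::uint64_t bytesForChunks(std::uint64_t chunkCount) {
    return chunkCount * (vertexBufferBytes() + indexBufferBytes());
}
```

That comes to 1,966,080 bytes of vertices and 589,824 bytes of indices per chunk, so 64 chunks holding worst-case buffers would need 163,577,856 bytes (156 MiB) of D3DPOOL_DEFAULT memory. Sizing each buffer from the number of faces actually emitted is much cheaper.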

If the verts don't change, you do not need to update the vertex buffer, and for what you are doing you really want to use a dynamic vertex buffer. Basically, any mesh that updates fairly frequently should be stored in a dynamic vertex buffer, and any mesh that doesn't should be in a static one.

One more thing that will help is to not update the vertex buffers when they haven't changed since the last frame; the fastest data you send to the GPU is data you never send. Rendering with the same, unchanged vertex buffer as in the last frame gives the same result as sending the same data again.

If you're calling CreateVertexBuffer and CreateIndexBuffer every frame, that explains why things slow down. Object creation is an expensive operation and should only be done during startup. In this particular case, you could use DrawPrimitiveUP instead of DrawPrimitive and it would run a lot faster - although completely reworking your code to use vertex buffers properly would be the real solution.

Thank you both.

If I understand you correctly, the solution is:
[list=1]
[*]Create the vertex and index buffers once at startup.
[*]Fill the buffers until they are full.
[*]Draw the primitives.
[*]Clear the buffers.
[*]If there are more vertices, go to 2.
[/list]


If this is right, I've got another question:

What happens if I walk around the map?
The chunks I see change constantly if I turn around or walk forward for a long time, so I have to change the buffer contents all the time.
How can I do this? Or is there another solution?

You got the idea, yes, but you only need to fill the buffers when they change: no change, no update. A change happens either when a chunk comes into the view area or leaves it, or when a block in a visible chunk changes.

Also, it isn't bad to keep a system-memory copy of the vertices; it's just that you only send it to the GPU when the chunk is visible or a change has happened to it.
[code]
#include <vector>

// Vertex is assumed to be your own vertex struct, defined elsewhere.
class Chunk
{
public:
    void update()
    {
        // If you add or remove blocks from this chunk, rebuild the vectors below and
        // set m_dirty = true so that the VB and IB are re-uploaded.
    }
private:
    bool m_dirty = false; // Only update the render buffers when this flag is set.
    std::vector<Vertex> m_localChunkVertexData; // Change this in update(); it then needs re-uploading to the GPU, but only when the chunk is actually there.
    std::vector<unsigned int> m_localIndexData;
};
[/code]

This will allow you to change the vertex list without having to lock the vertex or index buffers until you are ready to upload to the device. Those vectors can also be local variables of the update function, which you write to the VB and IB once you have filled them with the update you wanted.

OK,

One more question:

Situation:
I fill the buffer until it's full. Then I draw the vertices and flush the buffer.
After that I fill it again with other vertices (because the buffer was full) and render.

If I re-render the frame (nothing has changed), I have to fill the buffer twice:
once with the first data and then with the second, to redraw all vertices. Is that right?

One option is to just create a bigger buffer - make it large enough to hold data for an entire frame worth of drawing, and don't bother worrying about this.

That may not always be possible. Depending on how much you're drawing a full frame's worth of data may be too much. In that case don't worry about it either - just fill and flush the buffer as you need.

The important thing to remember is that there is no guaranteed one-size-fits-all approach to this. Depending on your application's needs you'll be making adjustments to the recommended basic approach. Sometimes you'll keep a system memory copy, sometimes you won't, sometimes you don't bother refilling the buffer if data doesn't change, sometimes it's not that big a deal and is cheaper to just fill the buffer anyway, and sometimes using a group of smaller static vertex buffers is preferable to using one big dynamic buffer.

[quote name='mhagain' timestamp='1334852020' post='4932851']
One option is to just create a bigger buffer - make it large enough to hold data for an entire frame worth of drawing, and don't bother worrying about this.

That may not always be possible. Depending on how much you're drawing a full frame's worth of data may be too much. In that case don't worry about it either - just fill and flush the buffer as you need.

The important thing to remember is that there is no guaranteed one-size-fits-all approach to this. Depending on your application's needs you'll be making adjustments to the recommended basic approach. Sometimes you'll keep a system memory copy, sometimes you won't, sometimes you don't bother refilling the buffer if data doesn't change, sometimes it's not that big a deal and is cheaper to just fill the buffer anyway, and sometimes using a group of smaller static vertex buffers is preferable to using one big dynamic buffer.
[/quote]
With VBs and IBs the trick is to find the batch size that works best across the set of cards you want to support. I am making a maze-crawling game with a lot of 4-vert squares that make up the walls. When I submitted these as separate draw calls, my performance tanked massively with fewer than 100,000 verts on screen. When I batched them into a single vertex buffer, performance jumped back to a solid 60 fps (vsync) on an HD4850.

A good rule of thumb is to aim for about 10,000 verts per vertex buffer if you have massive amounts of vertices to draw; this number can of course change according to the situation and profiling.

GPUs are bad at drawing buffers with a low number of verts in them, as most of the card is then doing nothing, but go over a threshold and performance dies as well, because the card is too busy dealing with all the data you give it. It's a balance you have to find through some trial and error and profiling.
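
One simple way to act on that rule of thumb is to group chunk meshes greedily until a batch would exceed a target vertex count. This is a sketch, not a tuned solution: `packIntoBatches` is a hypothetical helper, the 10,000 target is just the figure quoted above, and a single chunk larger than the target simply gets a batch of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Groups consecutive chunk meshes into batches of at most targetVerts vertices
// (a chunk that is bigger than the target ends up alone in its batch).
// Returns the total vertex count of each batch, in order.
std::vector<std::size_t> packIntoBatches(const std::vector<std::size_t>& chunkVertexCounts,
                                         std::size_t targetVerts) {
    assert(targetVerts > 0);
    std::vector<std::size_t> batchTotals;
    std::size_t current = 0;
    bool open = false;  // true while the current batch holds at least one chunk
    for (std::size_t count : chunkVertexCounts) {
        if (open && current + count > targetVerts) {
            batchTotals.push_back(current);  // close the batch before it overflows
            current = 0;
            open = false;
        }
        current += count;
        open = true;
    }
    if (open) batchTotals.push_back(current);
    return batchTotals;
}
```

For chunk sizes of 4000, 4000, 4000, 12000 and 100 vertices with a target of 10,000, this yields batches of 8000, 4000, 12000 and 100 vertices. The right target still has to come from profiling on the cards you care about.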

Thank you for your help.

Our game is going full speed ahead.

Here is the newest version:

[attachment=8473:Kubos.rar]
