z9u2K

Very strange bottleneck...


I'm using DX8.1 for my terrain engine, which is based on the ROAM algorithm... The engine works like magic, only I have a bottleneck in it... it renders 5,000 tris at 5 fps on my GeForce3... I found out that my bottleneck lies in the Lock(); Unlock(); calls I make on my vertex buffer every time I reach an end leaf... I can't find a way to render my landscape without rendering one triangle at a time... any ideas??

Hi there,

I don't know much about 3D, but isn't there a way you could programmatically precalculate the 'terrain' into some sort of array or list, and then, after all the calculation is done, render the terrain knowing exactly which tris need to be rendered? That way you can lock them all at once and unlock them all afterwards.

I believe it falls into the same kind of idea as DirectDraw surface lock and unlock, in that you want to do as many things as possible per lock, so some precalculation may be your answer.


Hope that helps. Like I said, I can't give any specifics 'cause I really don't know what you're doing.



Raymond Jacobs,
Developer,
www.EtherealDarkness.com

Maybe you're locking your vertices the wrong way. Would you explain the properties of the vertex buffer you created? Also, what flags are you passing to the Lock function?
Are you using index buffers? If not, are you drawing as TRIANGLELIST or strips?
Is the vertex information in the buffer arranged?
What's the size of your FVF?
How many primitives are you batching per draw call? Are you grouping the primitives by render state changes (i.e. drawing all the vertices that use texture X, then all the vertices that use the other texture, etc.)?
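
The grouping question above can be illustrated with a small sketch. It is not Direct3D code: `DrawItem`, `CountTextureSwitches` and `GroupByTexture` are made-up names, and the texture IDs are plain integers. It only shows that sorting batches by texture reduces how often the texture has to be switched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one batch of triangles waiting to be drawn.
struct DrawItem {
    int textureId;      // which texture the batch needs (made-up IDs)
    int triangleCount;  // how many triangles are in the batch
};

// Counts how many times the texture would have to be set if the
// items were drawn in the given order (the first item counts as one).
inline int CountTextureSwitches(const std::vector<DrawItem>& items) {
    int switches = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].textureId != items[i - 1].textureId) {
            ++switches;
        }
    }
    return switches;
}

// Reorders the items so that batches sharing a texture are adjacent.
// stable_sort keeps the original relative order inside each texture group.
inline void GroupByTexture(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.textureId < b.textureId;
                     });
}
```

For five batches using textures 2, 1, 2, 1, 3 in that order, the texture is set five times; after `GroupByTexture` the order becomes 1, 1, 2, 2, 3 and it is set three times.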

Locking and unlocking won't cause CPU bottlenecks as long as the GPU side is programmed correctly. In fact, on today's computers and compilers, copying memory can be lightning fast if it isn't abused.

I would recommend downloading some of the performance papers from nVidia (developer.nvidia.com); they have very interesting material on the correct use of the GPU.

Yep. Lock/Unlock calls should be kept to a minimum. Build up your array, then copy it all in one lock/unlock. Check the DX docs; some modes perform better than others. The default mode isn't too bad, but it's still something you should _not_ do every poly.

In the DirectX8 docs there's a good section on Lock/Unlock combos and their 'relative' performance comparisons.
Section - "Using Dynamic Vertex and Index Buffers"

I assume you're not making a new vertex buffer every call (that's always a good way to stuff things up :-).

Tips for making polys go quick:
- reduce texture swapping: try to render all polys of the same texture together. Swapping textures can be a killer.
- strip polys, which reduces the number of vertices you need to send to the card.
- index buffers are good, and can help with the above two performance issues.
- precalc or prebuild as much vertex and state information as you can, then blast it across when it's needed. PCs have huge storage capabilities, so doing this is well worth it.
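
To put a number on the index buffer tip, here is a rough sketch for a regular grid of quads. A ROAM mesh is not a regular grid, so this is only an illustration of vertex sharing; `BuildGridIndices` and the two counting helpers are hypothetical names, and the row-by-row vertex layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds triangle-list indices for a grid of cellsX by cellsZ quads.
// Vertices are assumed to be stored row by row, (cellsX + 1) per row.
inline std::vector<std::uint16_t> BuildGridIndices(int cellsX, int cellsZ) {
    const int vertsPerRow = cellsX + 1;
    assert(cellsX > 0 && cellsZ > 0);
    // 16-bit indices can only address 65536 vertices.
    assert(vertsPerRow * (cellsZ + 1) <= 65536);

    std::vector<std::uint16_t> indices;
    indices.reserve(static_cast<std::size_t>(cellsX) * cellsZ * 6);
    for (int z = 0; z < cellsZ; ++z) {
        for (int x = 0; x < cellsX; ++x) {
            const int topLeft = z * vertsPerRow + x;
            const int topRight = topLeft + 1;
            const int bottomLeft = topLeft + vertsPerRow;
            const int bottomRight = bottomLeft + 1;
            // Two triangles per quad, sharing the topRight/bottomLeft diagonal.
            const int quad[6] = {topLeft, bottomLeft, topRight,
                                 topRight, bottomLeft, bottomRight};
            for (int index : quad) {
                indices.push_back(static_cast<std::uint16_t>(index));
            }
        }
    }
    return indices;
}

// Vertices needed when every triangle carries its own three vertices.
inline std::size_t UnindexedVertexCount(int cellsX, int cellsZ) {
    return static_cast<std::size_t>(cellsX) * cellsZ * 6;
}

// Vertices needed when triangles share vertices through an index buffer.
inline std::size_t IndexedVertexCount(int cellsX, int cellsZ) {
    return static_cast<std::size_t>(cellsX + 1) * (cellsZ + 1);
}
```

For a 64 x 64 grid that is 24576 vertices as a plain triangle list versus 4225 shared vertices plus 24576 small indices. The number of triangles is the index count divided by 3.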

Hope this helps.

The thing is that I can't know the number of triangles that are going to be rendered until render time...

I thought about pushing triangles into a linked list and then creating one vertex buffer and spilling all the triangles into it... the thing is that I'm afraid to use linked lists since they are SLLLOW with a big amount of data...

Don't even think about using a linked list for this, that's way too slow. Figure out the maximum number of polys that are going to be used, and allocate enough memory at program init. Then just build your index list in this memory, and copy the list to the index buffer before rendering.

T

--
MFC is sorta like the swedish police... It's full of crap, and nothing can communicate with anything else.
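
A minimal sketch of the "allocate once at init, refill every frame" idea from the post above, assuming 16-bit indices. `IndexScratch` and its methods are made-up names; in real code the contents of `Data()` would be copied into the index buffer under a single lock before drawing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-capacity scratch list: allocated once at startup, reused each frame.
class IndexScratch {
public:
    explicit IndexScratch(std::size_t maxIndices)
        : storage_(maxIndices), count_(0) {}

    // Starts a new frame without freeing or reallocating anything.
    void Reset() { count_ = 0; }

    // Appends one triangle; returns false if the preallocated pool is full.
    bool PushTriangle(std::uint16_t a, std::uint16_t b, std::uint16_t c) {
        if (count_ + 3 > storage_.size()) {
            return false;
        }
        storage_[count_++] = a;
        storage_[count_++] = b;
        storage_[count_++] = c;
        return true;
    }

    const std::uint16_t* Data() const { return storage_.data(); }
    std::size_t IndexCount() const { return count_; }
    std::size_t TriangleCount() const { return count_ / 3; }

private:
    std::vector<std::uint16_t> storage_;  // sized once in the constructor
    std::size_t count_;                   // indices used so far this frame
};
```

Unlike a linked list, this keeps the data contiguous, so it can be copied in one go, and nothing is allocated while the terrain is being traversed.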

Getting the number of primitives to render is easy: just divide the number of indices by 3. I don't see a problem there. Again, I think you're misusing the GPU. Why don't you answer the questions I asked?

Well MatuX, you've asked a lot of questions... I'll try answering them all...

For a start, I create my vertex buffer like this:

g_pD3DDevice->CreateVertexBuffer(dwNum * sizeof(MYVERTEX), 0, D3DFVF_MYVERTEX, D3DPOOL_DEFAULT, &g_pBuffer);

Nothing special.

D3DFVF_MYVERTEX is defined as:
#define D3DFVF_MYVERTEX (D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_DIFFUSE | D3DFVF_TEX1)

I am not using an index buffer since I am rendering my terrain triangle by triangle. Instead, I use D3DPT_TRIANGLELIST, although that's not necessary either since I am only batching one triangle per draw.

As I've said, the vertices are not arranged in the buffer since the buffer only holds one poly (3 vertices) at a time.

The size of my FVF you can figure out from the above definition (36 bytes, I think).
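
The 36-byte figure can be checked with a quick sketch. The struct below is an assumed layout for that FVF (position, normal, packed diffuse colour, one 2D texture coordinate set); the actual MYVERTEX definition is not shown in the thread, so the name `MyVertex` and its field names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout matching XYZ | NORMAL | DIFFUSE | TEX1, in FVF order.
struct MyVertex {
    float x, y, z;          // D3DFVF_XYZ: 3 floats = 12 bytes
    float nx, ny, nz;       // D3DFVF_NORMAL: 3 floats = 12 bytes
    std::uint32_t diffuse;  // D3DFVF_DIFFUSE: packed colour = 4 bytes
    float u, v;             // D3DFVF_TEX1: one 2D coordinate set = 8 bytes
};

static_assert(sizeof(float) == 4, "this sketch assumes 4-byte floats");
static_assert(sizeof(MyVertex) == 36, "12 + 12 + 4 + 8 = 36 bytes per vertex");

// One unindexed triangle carries three full vertices.
constexpr std::size_t kBytesPerTriangle = 3 * sizeof(MyVertex);  // 108 bytes
```

At 108 bytes per triangle, the 5,000 triangles mentioned at the top of the thread come to 540,000 bytes per frame as an unindexed list.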

I know that batching one triangle per draw is VERY UNOPTIMIZED, but I can't do it any other way since I don't want to create a pool that will hold the maximum number of triangles possible...

Currently I am not implementing textures or materials, so grouping is not needed at this time.

Hope that gives you a bit more information about my vertex buffer...

Thanks for your help.

So you're drawing one poly at a time (which is slow), and locking the vertex buffer for each (also very slow)?

You need to batch things up so you're rendering as many polys as you can in one go, using one vertex buffer.

This isn't too hard: you can build up a list of triangles while traversing your data, then whack it all in one vertex buffer, then draw the lot with one call to DrawPrimitive or whatever.
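
As a rough sketch of "build up a list while traversing", assuming the terrain is stored as a binary triangle tree: `TriNode`, `CollectLeaves` and `Split` are hypothetical stand-ins, not the original poster's data structures, and the child geometry is a placeholder because only the traversal matters here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for a node of a binary triangle tree.
struct TriNode {
    Vec3 corners[3];
    std::unique_ptr<TriNode> left;   // both children are null for a leaf
    std::unique_ptr<TriNode> right;
};

// Walks the tree and appends the three corners of every leaf to `out`.
// Nothing is drawn here; the caller uploads and draws `out` afterwards.
inline void CollectLeaves(const TriNode& node, std::vector<Vec3>& out) {
    if (!node.left && !node.right) {
        out.insert(out.end(), node.corners, node.corners + 3);
        return;
    }
    if (node.left)  CollectLeaves(*node.left, out);
    if (node.right) CollectLeaves(*node.right, out);
}

// Splits a leaf into two children that simply copy the parent's corners.
inline void Split(TriNode& node) {
    node.left.reset(new TriNode());
    node.right.reset(new TriNode());
    for (int i = 0; i < 3; ++i) {
        node.left->corners[i] = node.corners[i];
        node.right->corners[i] = node.corners[i];
    }
}
```

After the traversal, the vector holds three vertices per leaf in one contiguous block, which is the shape needed for a single Lock, copy, Unlock and one DrawPrimitive call, instead of one of each per leaf.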


Batching only a single triangle per call? ARE YOU MAD?!?!

You MUST, I repeat MUST, draw at least 100 or so triangles per call to get any speed at all. For your terrain, there should be a single Lock()/Unlock() pair per frame, since you are only rendering a max of 15000 vertices. You may wish to break the vertex batches up and use a vertex buffer that is only 3000 vertices in size (see the particle SDK sample to see how to render large amounts of dynamic vertices in a somewhat efficient manner).

You should be able to batch more than one triangle per call. If not, then I HIGHLY suggest you learn some memory management and more about dynamic memory (especially things like circular buffers). You should never have to allocate any memory while rendering the terrain (unless something special in the ROAM algo requires it, but I don't recall anything requiring it).

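Basically you want to:
(PSEUDO code)
doingROAM = TRUE;
while (doingROAM)
{
    // ask for at most 999 vertices: a multiple of 3, so only whole triangles
    actualVertexCount = FillCircularBufferUsingROAM(999);
    if (actualVertexCount > 0)
    {
        LockVertexBuffer(actualVertexCount);
        CopyCircularBufferToVertexBuffer(actualVertexCount);
        UnlockVertexBuffer();
        DrawPrimitive(actualVertexCount / 3);
    }
    // a short batch means ROAM has no more triangles for this frame
    if (actualVertexCount < 999)
        doingROAM = FALSE;
}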

Drawing only a single triangle per call is so unoptimized that no matter how well you code the rest of the game, it would still run that slow. z9u2K, drawing a single triangle at a time is not the only way you know how to do it. You're just being lazy and not generalizing concepts you should know about programming and problem solving. You should look at things like the particle samples in the SDK, since after all they solve the EXACT same rendering problem you are currently having. This happens to be: how do you render a group of dynamic vertices when you don't know how many may be created at runtime, and can't allocate a maximum since they won't all be on screen at once or the maximum would be too high? Answer: see the SDK (which basically does what I've shown above, though you may need to see actual code, which means you should practice a bit more and learn some more about the basic/intermediate aspects of coding before going on to large projects like creating a terrain engine in DX8.1).
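
The batching loop described in this post can be sketched in plain C++ with a simulated vertex buffer. Everything here is an assumption made for illustration: `FakeVertexBuffer` stands in for a dynamic Direct3D vertex buffer, and its wrap-around imitates the general "discard when full, otherwise append" pattern rather than reproducing the SDK sample's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Simulated dynamic vertex buffer; a real one would live in driver memory.
struct FakeVertexBuffer {
    std::vector<Vertex> memory;
    std::size_t nextFree = 0;  // first slot not written since the last discard
    int lockCount = 0;
    int discardCount = 0;      // locks that restarted at the front of the buffer
    int drawCount = 0;
    std::size_t trianglesDrawn = 0;

    explicit FakeVertexBuffer(std::size_t capacity) : memory(capacity) {}

    // Returns `count` writable slots. If the tail is too small, wraps to
    // the start, which plays the role of a "discard" lock; otherwise it
    // appends after the data already written (a "no overwrite" lock).
    Vertex* Lock(std::size_t count) {
        assert(count <= memory.size());
        if (nextFree + count > memory.size()) {
            nextFree = 0;
        }
        if (nextFree == 0) {
            ++discardCount;
        }
        ++lockCount;
        Vertex* slots = memory.data() + nextFree;
        nextFree += count;
        return slots;
    }

    void Draw(std::size_t vertexCount) {
        ++drawCount;
        trianglesDrawn += vertexCount / 3;
    }
};

// Streams `source` through the buffer in batches of at most `batchSize`
// vertices: one lock, one copy and one draw per batch, not per triangle.
inline void DrawInBatches(const std::vector<Vertex>& source,
                          std::size_t batchSize, FakeVertexBuffer& vb) {
    assert(batchSize > 0 && batchSize % 3 == 0);
    for (std::size_t done = 0; done < source.size(); done += batchSize) {
        const std::size_t count = std::min(batchSize, source.size() - done);
        Vertex* dst = vb.Lock(count);
        std::copy(source.data() + done, source.data() + done + count, dst);
        vb.Draw(count);  // in real code an Unlock would come before the draw
    }
}
```

With a 3000-vertex buffer and 1500-vertex batches, streaming 15000 vertices takes 10 locks and 10 draw calls for 5000 triangles, compared with 5000 of each when drawing one triangle at a time. The batch size must be a multiple of 3 so that no triangle is split across two draws.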
