Archived

This topic is now archived and is closed to further replies.

Tok

Too Many Primitives? (with pic)

Hey all. I'm working on some code for a heightfield, and so far everything is working wonderfully, except for the (ouch) 14 frames per second I'm getting. How many primitives can DirectX (8, debug version) handle?

In this version, I'm displaying 9 "terrain blocks", each "block" a node in a singly-linked list, with its own vertex buffer for a triangle list. ALL the program does after init is move the camera position on input and draw the scene. I think the 14 fps has to do with my 2 primitives/square, 400 squares/block, 9 blocks... 7,200 primitives. That's strange, though, considering the "optimized mesh" example in the DX8 SDK runs at 64 fps while drawing 46,464 triangles. Or are meshes so much faster to render than triangle lists?

Thanks, all...

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Feel free to email me HERE
*Howdy Kids, Do /YOU/ know what time it is? It's tangent time* -Baldor the Bold

Edited by - Tok on November 8, 2001 10:32:07 PM

A mesh is just a relatively thin wrapper around an index buffer and a vertex buffer, so the question of "list vs. mesh" shouldn't really be an issue, although optimizations on what's inside those buffers can make a difference.

Also, it's not a question of "how much can DX8 handle", but rather how much can your hardware handle. One suggestion is to go to madonion.com and look for the stats of systems similar to yours. You probably won't reach those numbers, but try to be in the ballpark. (~100K polys/s seems really low)
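
To put numbers on the "~100K polys/s" estimate: throughput is simply triangles per frame times frames per second. A minimal sketch using only the figures quoted in this thread (the helper name is made up for illustration and is not part of any DirectX API):

```cpp
#include <cassert>

// Triangles drawn per second = triangles per frame * frames per second.
// Hypothetical helper for back-of-envelope comparisons only.
long trianglesPerSecond(long trianglesPerFrame, long framesPerSecond)
{
    assert(trianglesPerFrame >= 0 && framesPerSecond >= 0);
    return trianglesPerFrame * framesPerSecond;
}

// Figures quoted in the thread:
//   heightfield:     9 blocks * 400 squares * 2 triangles = 7,200 triangles at 14 fps
//   SDK mesh sample: 46,464 triangles at 64 fps
```

Calling `trianglesPerSecond(9 * 400 * 2, 14)` gives 100,800, while `trianglesPerSecond(46464, 64)` gives 2,973,696. That is roughly a 30x gap on the same machine, which suggests the problem lies in how the triangles are submitted rather than in how many there are.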

Please describe what you are doing... Are they indexed lists? Are you rendering many at a time, are you using the UP functions, etc. etc...

When they say optimized mesh, they usually mean triangle strips or fans; graphics cards are optimized for calculations on primitives presented in such a format. So try to send vertex data as triangle strips/fans; that should increase your fps.

This may not make any difference, but I notice in the picture you are drawing a wireframe outline as well. On some cards I found this to be dog slow, perhaps try solid only?

Another performance issue is the number of primitives sent in a single DrawPrimitive call. If it's only a few (fewer than 50, say) and you are doing 100 of these calls per frame, it would be slower than calling one DrawPrimitive with 5000 primitives.

General rule of thumb: more primitives in a single vertex buffer is faster than many little vertex buffers.
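
One way to act on that rule of thumb is to concatenate the per-block vertex data on the CPU and upload it as one buffer, so the whole terrain goes out in a single draw call. The sketch below is an illustration under assumptions, not code from this thread: `Vertex` is a stand-in for CUSTOMVERTEX (whose layout is not shown here), `mergeBlocks` is a hypothetical helper, and it only makes sense if all blocks share the same vertex format, texture, and render states and their positions are already in a common space.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for CUSTOMVERTEX; the real layout is an assumption.
struct Vertex { float x, y, z, tu, tv; };

// Concatenate several per-block triangle lists into one list,
// preserving block order, so they can be drawn with one call.
std::vector<Vertex> mergeBlocks(const std::vector<std::vector<Vertex>>& blocks)
{
    std::size_t total = 0;
    for (const auto& block : blocks)
        total += block.size();

    std::vector<Vertex> merged;
    merged.reserve(total);
    for (const auto& block : blocks)
        merged.insert(merged.end(), block.begin(), block.end());

    assert(merged.size() == total);
    return merged;
}
```

For nine blocks of 2,400 unindexed vertices each, the merged list holds 21,600 vertices, i.e. 7,200 triangles for one call. Whether a single call of that size is acceptable depends on the device; the `MaxPrimitiveCount` field of the caps structure is the place to check.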

[The views stated herein do not necessarily represent the view of the company Eurocom ]

Eurocom Entertainment Software
www.Eurocom.co.uk

The Jackal

First off, thank you ALL for your wonderful contributions. In response to your comments:

quote:
Original post by CrazedGenius
Also, it's not a question of "how much can DX8 handle", but rather how much can your hardware handle. One suggestion is to go to madonion.com and look for the stats of systems similar to yours. You probably won't reach those numbers, but try to be in the ballpark. (~100K polys/s seems really low)

I'm running a PIII 667 with an ASUS GeForce256 32meg v6600 graphics card. It's a couple of years old but still plenty for anything I've put it up against (including burning/editing AVI off TV/VCR, and playing games in full 3D with the included 3D goggles). And like I've said, I've seen fps of 64 drawing 46,000 primitives. Heck, in the same example ("OptimizedMesh" in Direct3D) I get a framerate of 19fps when drawing 185,000 primitives, FULLSCREEN. My program is windowed at 640x480 (displayed on a 1024x768 desktop).

quote:
Please describe what you are doing... Are they indexed lists? Are you rendering many at a time, are you using the UP functions, etc. etc...

At this point Aladdin swoops in through the window with genie, carpet, and Abu. "A whole new world" swells in the background. No, they are not indexed lists. The whole prog is actually a *huge* mutation of tutorial 5: textures, which used no indexing. Indexing is one thing I haven't felt the need to study as yet, and in fact I have seen little mention of it in my studies of the examples.
As far as what I'm doing... (you can find the source here) I am rendering the terrain "blocks" by running through the singly-linked list in Render(), like so:

for(TerrainBlock* Terrain = TerrainHead; Terrain; Terrain = Terrain->next)
{
    g_pd3dDevice->SetStreamSource( 0, Terrain->data, sizeof(CUSTOMVERTEX) );
    g_pd3dDevice->SetVertexShader( D3DFVF_CUSTOMVERTEX );
    g_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLELIST, 0, 800 );
}

where TerrainBlock is a struct defined as:

struct TerrainBlock
{
    LPDIRECT3DVERTEXBUFFER8 data;   // The vertices for this block of terrain
    D3DXVECTOR3             size;   // The x,y,z size of block of terrain (think bounding box)
    D3DXVECTOR3             pos;    // The x,y,z position of block of terrain (think 0,0,0)
    TerrainBlock*           next;   // The next in list
    TerrainBlock*           prev;   // The previous in list
};

So as far as the vertex buffers go, I'm only drawing from them 9 times, each being a grid of 20x20 squares with 2 primitives per square (800 primitives per call to DrawPrimitive).

quote:
Original post by Mark Duffill
This may not make any difference, but I notice in the picture you are drawing a wireframe outline as well. On some cards I found this to be dog slow, perhaps try solid only?

Actually, my program alternates between point, wireframe, and textured modes (by pressing keys 1-3), and there's a difference in FPS of *perhaps* 1 or 2 frames, nothing more.

quote:
Another performance issue is the number of primitives sent in a single DrawPrimitive call. If it's only a few (fewer than 50, say) and you are doing 100 of these calls per frame, it would be slower than calling one DrawPrimitive with 5000 primitives.

Again, I'm doing 9 calls of 800 primitives each.

quote:
Original post by EbonySeraph
Also calls to DrawPrimitiveUP (Unprocessed) are a lot slower than the others.


At the moment, it's looking like working in indexing is going to be my best bet. I've gone the route of making triangle strips before... but that didn't work so well with the grid (I would have to make an array of strips, perhaps... then I would be doing 20 calls to DrawPrimitive for 20 strips, unless I resize the grids). Hmmm...
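
A common way around the "one call per strip" problem is to stitch the rows into a single indexed strip by repeating two indices at each row boundary; the repeats produce zero-area (degenerate) triangles that draw nothing visible. The sketch below is an illustration, not code from this thread: the function names are hypothetical, it assumes a vertex grid laid out row by row with `cols + 1` vertices per row, and the resulting winding depends on your coordinate conventions, so the cull mode may need adjusting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build ONE triangle strip covering a grid of cols x rows squares.
// Vertex (x, y) is assumed to live at index y * (cols + 1) + x.
std::vector<std::uint16_t> buildGridStrip(int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    assert((cols + 1) * (rows + 1) <= 65536);  // must fit 16-bit indices

    const int stride = cols + 1;
    std::vector<std::uint16_t> strip;
    for (int y = 0; y < rows; ++y) {
        if (y > 0) {
            // Stitch rows: repeat the previous row's last index and this
            // row's first index. This yields four degenerate triangles.
            const std::uint16_t last = strip.back();
            strip.push_back(last);
            strip.push_back(static_cast<std::uint16_t>(y * stride));
        }
        for (int x = 0; x <= cols; ++x) {
            strip.push_back(static_cast<std::uint16_t>(y * stride + x));
            strip.push_back(static_cast<std::uint16_t>((y + 1) * stride + x));
        }
    }
    return strip;
}

// Count triangles in the strip that have three distinct indices.
std::size_t countRealTriangles(const std::vector<std::uint16_t>& strip)
{
    std::size_t real = 0;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        const bool degenerate = strip[i] == strip[i + 1] ||
                                strip[i + 1] == strip[i + 2] ||
                                strip[i] == strip[i + 2];
        if (!degenerate)
            ++real;
    }
    return real;
}
```

For the 20x20 block discussed here this produces 878 indices: 876 strip triangles, of which 800 are real and 76 are degenerate stitches. Drawing it needs an index buffer and an indexed draw call with the strip primitive type, so it goes hand in hand with the indexing work rather than replacing it.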

Thank you all for your contributions...

-Tok

Guest Anonymous Poster
quote:
Original post by Tok
I get a framerate of 19fps when drawing 185,000 primitives, FULLSCREEN. My program is windowed at 640x480 (displayed on a 1024x768 desktop).

This makes it sound like you expect fullscreen to be slower than windowed mode, when with most cards the opposite is true.

Have you tried your program in fullscreen?

two quick points...

1. No need to set the shader each time unless you are actually changing it.

2. Indexed buffers = much less data... For a simple quad (two triangles), you could represent it as 6 vertices (3 per tri), or as 4 vertices (the 2 vertices on the diagonal are shared between the tris). The savings become more substantial at larger scales...
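
To make the savings in point 2 concrete, the sketch below counts vertices, indices, and bytes for a grid drawn as an unindexed versus an indexed triangle list. It is only arithmetic, with assumptions labeled: the struct and function names are hypothetical, and the byte sizes used in the example (a 20-byte vertex, 2-byte indices) are guesses, since the actual CUSTOMVERTEX layout is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>

struct BufferSizes {
    std::size_t vertices;  // entries in the vertex buffer
    std::size_t indices;   // entries in the index buffer (0 if unindexed)
    std::size_t bytes;     // total bytes across both buffers
};

// Unindexed triangle list: every triangle carries its own 3 vertices.
BufferSizes unindexedList(int cols, int rows, std::size_t vertexBytes)
{
    assert(cols > 0 && rows > 0);
    const std::size_t v = static_cast<std::size_t>(cols) * rows * 2 * 3;
    return { v, 0, v * vertexBytes };
}

// Indexed triangle list: one vertex per grid corner, 3 indices per triangle.
BufferSizes indexedList(int cols, int rows,
                        std::size_t vertexBytes, std::size_t indexBytes)
{
    assert(cols > 0 && rows > 0);
    const std::size_t v = static_cast<std::size_t>(cols + 1) * (rows + 1);
    const std::size_t i = static_cast<std::size_t>(cols) * rows * 2 * 3;
    return { v, i, v * vertexBytes + i * indexBytes };
}
```

For a single quad this reproduces the 6 versus 4 vertices mentioned above. For one 20x20 terrain block, the unindexed list needs 2,400 vertices, while the indexed version needs 441 vertices plus 2,400 indices; with the assumed sizes that is 48,000 bytes versus 13,620. Fewer unique vertices also means fewer vertices to transform, which matters most when transforms run in software.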

I'll try to look at the code later...

My terrain program is rendering about 110 fps with 45,000+ triangles in 640x480 windowed, with some inefficient programming, on a GeForce2 MX with a 1.4 GHz Athlon. I am using index buffers and hardware transforms and the retail, not the debug, library. I am guessing it is either the debug build or the software transforms that is killing your frame rate.

Edited by - invective on November 9, 2001 12:21:16 AM

Again all, thank you for your comments. Everything is appreciated.

quote:
This makes it sound like you expect fullscreen to be slower than windowed mode, when with most cards the opposite is true.
Have you tried your program in fullscreen?

I haven't, but in going back over my examples (and the optimized mesh example, as it seems to have the most primitives) I see that there's very little difference at all in the same render, be it in windowed or fullscreen mode.

quote:
1. No need to set the shader each time unless you are actually changing it.

So it would be sufficient to have it set in InitD3D() then? Sounds good.

quote:
2. Indexed buffers = much less data... For a simple quad (two triangles), you could represent it as 6 vertices (3 per tri), or 4 vertices (2 shared vertices between the tris). The savings becomes more substantial at larger scales... I'll try to look at the code later...

Thanks for the information. I am really, really regretting not giving index buffers a closer look.

quote:
My terrain program is rendering about 110 fps with 45,000+ triangles in 640x480 windowed, with some inefficient programming, on a GeForce2 MX with a 1.4 GHz Athlon

Marry Me? j/k

quote:
I am using index buffers and hardware transforms and the retail, not the debug library. I am guessing it is either the debug build or the software transforms that is killing your frame rate.

Considering that all the examples where I'm getting six-figure primitive counts with high two-figure fps were *also* compiled and run with the debug version, I would have to say it's my software. In particular, I believe it has almost everything to do with my using plain ol' unindexed vertex buffers during render.

Well all... thanks again. I'm going to be cramming on index buffers now. Anyone know of a good example/source to read up on? I'm looking at the DX8 library (uck) and the VertexShader example... it seems to have the clearest-cut use of index buffers.
-Tok

Edited by - Tok on November 10, 2001 11:01:08 AM

Edited by - Tok on November 10, 2001 11:04:17 AM

Code to read the heightfield and set up vertex and index buffers. Note I keep a copy around in case I want to deform the terrain, but you can delete it if you don't need it. Sorry if it's a little sloppy.

    
void CMesh::LoadHieghtMap (char * filename, INT width, INT hieght)
{
    m_renderType = D3DPT_TRIANGLELIST;

    // Read the height field: one byte per sample, width * hieght samples
    BYTE * pbHieghtField = new BYTE [ width * hieght ];

    std::ifstream file;
    file.open (filename, std::ios::binary );
    file.read ( (char *)pbHieghtField, width * hieght );
    file.close();

    // One vertex per height sample
    SAFE_DELETE_ARRAY (m_pvVertices)
    m_pvVertices = new CUSTOMVERTEX [width * hieght];
    m_dwSizeofVertices = (width * hieght) * sizeof (CUSTOMVERTEX);
    m_dwNumberofVertices = width * hieght;

    // Two triangles per grid square, three indices per triangle
    SAFE_DELETE_ARRAY (m_pwIndices)
    m_pwIndices = new DWORD [width * 2 * hieght * 3];
    m_dwNumberofIndices = (width-1) * 2 * (hieght-1) * 3;
    m_dwSizeofIndices = m_dwNumberofIndices * sizeof (DWORD);

    int x, y;
    for ( y = 0; y < hieght; y++ )
        for ( x = 0; x < width; x++ )
        {
            // Height (y) comes from the height map; x and z are grid
            // coordinates centered on the origin
            m_pvVertices[ x + y * width].p.x = (FLOAT) x - width / 2;
            m_pvVertices[ x + y * width].p.y = (FLOAT) pbHieghtField[x + y * width];
            m_pvVertices[ x + y * width].p.z = (FLOAT) y - hieght / 2;   // was width / 2, which is only right for square maps
            m_pvVertices[ x + y * width].tu = (FLOAT) (x%128)/128.0f;
            m_pvVertices[ x + y * width].tv = (FLOAT) (y%128)/128.0f;
        }

    delete [] pbHieghtField;   // allocated with new[], so release with delete[]
    pbHieghtField = NULL;

    for ( y = 0; y < hieght-1; y++ )
        for ( x = 0; x < width-1; x++ )
        {
            // Set left triangle indices in counter clockwise order: bottom left, top right, top left
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3] = x + (y+1) * (width);
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3 + 1] = x + y * (width) + 1;
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3 + 2] = x + y * (width);

            // Set right triangle indices in counter clockwise order: bottom left, bottom right, top right
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3 + 3] = x + (y+1) * (width);
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3 + 4] = x + (y+1) * (width) + 1;
            m_pwIndices[ ( (x * 2) + y * (width - 1) * 2) * 3 + 5] = x + y * (width) + 1;
        }

    CalculateNormals ();
    return;
}
HRESULT CMesh::FillBuffers ()
{
    SAFE_RELEASE( m_pVB );
    SAFE_RELEASE( m_pIB );

    // Create the vertex buffer and copy the CPU-side vertices into it
    if( FAILED( m_pd3dDevice->CreateVertexBuffer( m_dwSizeofVertices,
                                                  0, D3DFVF_CUSTOMVERTEX,
                                                  D3DPOOL_MANAGED, &m_pVB ) ) )
        return E_FAIL;

    VOID* pVertices;
    if( FAILED( m_pVB->Lock( 0, m_dwSizeofVertices, (BYTE**)&pVertices, 0 ) ) )
        return E_FAIL;
    memcpy( pVertices, m_pvVertices, m_dwSizeofVertices );
    m_pVB->Unlock();

    // Only the list primitive types use the index buffer
    if ( ( m_renderType == D3DPT_TRIANGLELIST ) || ( m_renderType == D3DPT_LINELIST )
      || ( m_renderType == D3DPT_POINTLIST ) )
    {
        if( FAILED( m_pd3dDevice->CreateIndexBuffer( m_dwSizeofIndices,
                                                     0, D3DFMT_INDEX32,
                                                     D3DPOOL_MANAGED, &m_pIB ) ) )
            return E_FAIL;

        VOID* pIndices;
        if( FAILED( m_pIB->Lock( 0, m_dwSizeofIndices, (BYTE**)&pIndices, 0 ) ) )
            return E_FAIL;
        memcpy( pIndices, m_pwIndices, m_dwSizeofIndices );
        m_pIB->Unlock();
    }
    return S_OK;
}
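
The index arithmetic in LoadHieghtMap is easy to get wrong by one, so it can help to run it on the CPU with no Direct3D involved. The sketch below mirrors the index loop from the post, using std::vector in place of the member arrays; the function names are hypothetical and the code is a stand-alone check, not part of the poster's class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same index layout as the loop in the post: for each grid square, a left
// triangle (bottom left, top right, top left) followed by a right triangle
// (bottom left, bottom right, top right). width/height count vertices.
std::vector<std::uint32_t> buildGridIndices(int width, int height)
{
    assert(width >= 2 && height >= 2);
    std::vector<std::uint32_t> indices(
        static_cast<std::size_t>(width - 1) * (height - 1) * 6);

    for (int y = 0; y < height - 1; ++y)
        for (int x = 0; x < width - 1; ++x) {
            const std::size_t base =
                (static_cast<std::size_t>(y) * (width - 1) + x) * 6;
            const std::uint32_t topLeft     = static_cast<std::uint32_t>(x + y * width);
            const std::uint32_t topRight    = topLeft + 1;
            const std::uint32_t bottomLeft  = static_cast<std::uint32_t>(x + (y + 1) * width);
            const std::uint32_t bottomRight = bottomLeft + 1;

            indices[base + 0] = bottomLeft;
            indices[base + 1] = topRight;
            indices[base + 2] = topLeft;
            indices[base + 3] = bottomLeft;
            indices[base + 4] = bottomRight;
            indices[base + 5] = topRight;
        }
    return indices;
}

// Largest index referenced; must stay below width * height.
std::uint32_t largestIndex(const std::vector<std::uint32_t>& indices)
{
    std::uint32_t largest = 0;
    for (std::uint32_t i : indices)
        if (i > largest)
            largest = i;
    return largest;
}
```

For a 21x21 vertex grid (one 20x20 terrain block) this yields 2,400 indices, and the largest index is 440, the last of the 441 vertices. At draw time in DirectX 8 the two buffers would be bound with SetStreamSource and SetIndices and drawn with DrawIndexedPrimitive; the exact arguments depend on the rest of the class, which is not shown in the thread.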


Edited by - invective on November 10, 2001 1:57:58 PM
