blue-ice2002

drawprimitive 10000 times

blue-ice2002    122
Hello, I have a question, or maybe a problem. For some reason, I need to draw triangle strips:

s=-512;
D3DXMatrixTranslation (&matTranslation, s,384.0f, 0.0f);
s--;
for(int v=0;v<10000;v++)
{
g_pd3dDevice->DrawPrimitive(D3DPT_TRIANGLESTRIP, x, 2);
x+=4;
}

I do this in the main game loop, every iteration. I think calling this function so many times makes the program run slowly. Is that the problem? I also apply a translation to scroll my tiles. Please help!

jollyjeffers    1570
Quote:
Original post by blue-ice2002
s=-512;
D3DXMatrixTranslation (&matTranslation, s,384.0f, 0.0f);
s--;
for(int v=0;v<10000;v++)
{
g_pd3dDevice->DrawPrimitive(D3DPT_TRIANGLESTRIP, x, 2);
x+=4;
}

From looking at the code, you don't actually change the matrix for each pair of triangles drawn. Unless you intentionally *don't* want them to be connected (is each pair a separate entity?), you can reduce that down to a single draw call, unless I'm much mistaken:
g_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLESTRIP, 0, 20000 );

If they are unconnected, and rendering them all together screws things up, convert them to triangle lists instead.

Bottom line - get rid of that loop! it is what is killing you, and while you have it there you're gonna get sucky performance [smile].
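
For anyone following along, here is a rough sketch of the strip-to-list conversion mentioned above. It assumes each tile is stored as 4 vertices in strip order (which is what the x+=4 in the loop suggests). The Vertex struct and the function name are made up for illustration; the thread does not show the real vertex format.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout; the real one is not shown in the thread.
struct Vertex { float x, y, z, u, v; };

// Expands quads stored as 4 strip-ordered vertices each into a triangle list
// with 6 vertices per quad, so the whole map can be drawn with one call.
std::vector<Vertex> stripQuadsToTriangleList(const std::vector<Vertex>& strips)
{
    assert(strips.size() % 4 == 0);
    std::vector<Vertex> list;
    list.reserve(strips.size() / 4 * 6);
    for (std::size_t q = 0; q < strips.size(); q += 4) {
        const Vertex* s = &strips[q];
        // First strip triangle: v0, v1, v2.
        list.push_back(s[0]); list.push_back(s[1]); list.push_back(s[2]);
        // Second strip triangle, reordered to keep the same winding: v2, v1, v3.
        list.push_back(s[2]); list.push_back(s[1]); list.push_back(s[3]);
    }
    return list;
}
```

With 10000 quads this gives 60000 vertices, and since a triangle list uses 3 vertices per primitive, the PrimitiveCount for a single D3DPT_TRIANGLELIST call would be 20000.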

hth
Jack

Codemonger    217
If you use PureDevice during hardware device creation then DrawPrimitive Calls have generally little or no overhead. Anyway I don't think that's your direct problem at all (although you should keep your draw primitive calls to a minimum), it could be a number of things such as overdraw, but you said they are tiles. Anyway, I don't understand the need for a loop that does absolutely nothing except increase the start index by 4. I mean, you are not transforming inside the loop or applying different textures? Maybe I'm missing something. If you have more code, please post more.

jollyjeffers    1570
Quote:
Original post by Codemonger
If you use PureDevice during hardware device creation then DrawPrimitive Calls have generally little or no overhead. Anyway I don't think that's your direct problem at all (although you should keep your draw primitive calls to a minimum), it could be a number of things such as overdraw

Hmm, gotta say I disagree... rendering 2 triangles per call (whatever the type of device) will probably make ATI and Nvidia developers cry [smile]. Read up on some of their technote presentations on how to get the most from their respective hardware. I don't have time to search through them all right now, but numbers like 1000 triangles per call, no more than 10,000 calls per frame etc. pop up. I don't think I've ever read anywhere that, under any circumstances, a DP call is "generally no overhead" [sad]. Sorry!

hth
Jack

jollyjeffers    1570
Quote:
In the one I posted, it says "DrawPrimitive() permissible if warranted."

Yeah, having read on a bit further, that's a more relevant paper than the one I quoted. It's got a lot of good stuff, although the numerous batch/triangle graphs are too cluttered to be easily readable.

I like Slide #14 - "Small batches murder your performance" [smile]

Cheers,
Jack

blue-ice2002    122

Thanks, people.

I eliminated the loop.

I changed to a triangle list and made it
g_pd3dDevice->DrawPrimitive(D3DPT_TRIANGLELIST,0, 20000);

That is, I added 2 more vertices for each pair of triangles, but in exchange I make only a single draw call, like you guys suggested. Thanks :) I also read a little of the .pdfs you gave, and they say every draw call has a cost :)

My optimization problem isn't finished yet, so I'm continuing.
Now the problem is this: I have a map[100][100] holding the tile textures, and at startup I create a vertex buffer of 6*100*100=60000 vertices.
Then I set the textures, and then render and scroll the map. It's 100x100 tiles.
Because I call draw for 20000 triangles and then scroll, only 1568 triangles are visible at any time, but I don't know how to select just those from the vertex buffer and render them in the correct order.

I mean, I call draw for 20000 triangles, but only 1568 triangles of the tilemap fit on the screen at a time.
So does it clip internally, or what? Is submitting more than I can see a problem?
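
One possible way to pick only the on-screen tiles, sketched under some assumptions that are not from the thread: the vertex buffer stores tiles row by row with 6 vertices each, tiles sit on a plain rectangular grid (diamond tiles would need a different scroll-to-tile mapping), and the scroll offset is in pixels. All names here are hypothetical. The idea is to compute a StartVertex and PrimitiveCount per visible row.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Parameters for one DrawPrimitive(D3DPT_TRIANGLELIST, startVertex, primitiveCount) call.
struct DrawRange { unsigned startVertex; unsigned primitiveCount; };

// Returns one range per visible map row, assuming the vertex buffer stores
// tiles row by row with 6 vertices (2 triangles) per tile.
std::vector<DrawRange> visibleRowRanges(int mapCols, int mapRows,
                                        int tileW, int tileH,
                                        int scrollX, int scrollY,
                                        int screenW, int screenH)
{
    assert(mapCols > 0 && mapRows > 0 && tileW > 0 && tileH > 0);
    assert(screenW > 0 && screenH > 0 && scrollX >= 0 && scrollY >= 0);
    // First and last tile touched by the screen rectangle, clamped to the map.
    const int firstCol = std::min(scrollX / tileW, mapCols - 1);
    const int firstRow = std::min(scrollY / tileH, mapRows - 1);
    const int lastCol = std::min((scrollX + screenW - 1) / tileW, mapCols - 1);
    const int lastRow = std::min((scrollY + screenH - 1) / tileH, mapRows - 1);

    std::vector<DrawRange> ranges;
    for (int row = firstRow; row <= lastRow; ++row) {
        DrawRange r;
        r.startVertex = static_cast<unsigned>((row * mapCols + firstCol) * 6);
        r.primitiveCount = static_cast<unsigned>((lastCol - firstCol + 1) * 2);
        ranges.push_back(r);
    }
    return ranges;
}
```

This trades the single call for one call per visible row, which is still a few dozen calls rather than 10000. Whether that beats simply drawing all 20000 triangles would have to be measured.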

Codemonger    217
Yes, you are better off sending the whole thing and letting the GPU vertex pipeline handle the clipping or tossing of unused vertices. If you find that you are having a performance problem, you could create patches of index buffers. In your case, as it is now, brute force is probably the best bet, if it's working for ya. If you need to do lots of texture switches for different triangles or quads, use a texture atlas. So for your tiles you will have one Vertex Buffer, and you will use one texture. Anyway, those are just my quick thoughts.
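
To make the texture atlas idea concrete, here is a small sketch of how a tile type could be turned into texture coordinates. It assumes the atlas is a grid of equally sized cells numbered row by row; the struct and function names are invented for the example.

```cpp
#include <cassert>

struct UvRect { float u0, v0, u1, v1; };

// UV rectangle of cell `tileType` in an atlas laid out as a grid of
// atlasCols x atlasRows equally sized cells, numbered row by row.
UvRect atlasUv(int tileType, int atlasCols, int atlasRows)
{
    assert(atlasCols > 0 && atlasRows > 0);
    assert(tileType >= 0 && tileType < atlasCols * atlasRows);
    const float cellU = 1.0f / static_cast<float>(atlasCols);
    const float cellV = 1.0f / static_cast<float>(atlasRows);
    const int col = tileType % atlasCols;
    const int row = tileType / atlasCols;
    UvRect r;
    r.u0 = col * cellU;
    r.v0 = row * cellV;
    r.u1 = r.u0 + cellU;
    r.v1 = r.v0 + cellV;
    return r;
}
```

For a two-cell atlas (say grass on the left, dirt on the right), atlasUv(1, 2, 1) covers the right half of the texture. With texture filtering enabled, neighbouring cells can bleed into each other at the edges, so in practice the rectangle may need a small inset.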

KrazeIke    256
If you can divide your map into sections and somehow cull large groups of vertices at a time, you won't have to pass as much data to the GPU, which is good.

circlesoft    1178
This slideset says that you only get 10,000 - 40,000 batches per second. So, your performance is definitely going to be terrible [wink]. Take a look at that slideset for some more good info on batching.
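
To put those numbers next to the original loop, here is the back-of-the-envelope arithmetic as a tiny helper. The function name is made up, and it assumes draw-call submission is the only cost, which is a simplification.

```cpp
#include <cassert>

// Upper bound on frames per second if submitting batches were the only cost.
double maxFps(double batchesPerSecond, double batchesPerFrame)
{
    assert(batchesPerFrame > 0.0);
    return batchesPerSecond / batchesPerFrame;
}
```

With 10000 DrawPrimitive calls per frame, the quoted range of 10,000 to 40,000 batches per second works out to roughly 1 to 4 frames per second.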

blue-ice2002    122

Yeah, I made 1 picture for 2 textures, and I map it via texels.

About the tiles:
I made a very big array for scrolling and for holding the vertex data, but all the tiles have the same shape. They look like this: <>, I mean diamonds, made of 2 triangles each. The reason I create so many vertices is to hold different texture data for different tiles, I mean one is grass, one is dirt.

But I also want to give heights to some tiles, I mean I will distort the y components of the quads to create the illusion of hills.

So do I need to hold only the vertex data for rendering 1 screen,
and then at run time lock it and add height and texture data from another array that holds only those two?

Codemonger    217
Quote:
Original post by blue-ice2002

Yeah, I made 1 picture for 2 textures, and I map it via texels.

About the tiles:
I made a very big array for scrolling and for holding the vertex data, but all the tiles have the same shape. They look like this: <>, I mean diamonds, made of 2 triangles each. The reason I create so many vertices is to hold different texture data for different tiles, I mean one is grass, one is dirt.

But I also want to give heights to some tiles, I mean I will distort the y components of the quads to create the illusion of hills.

So do I need to hold only the vertex data for rendering 1 screen,
and then at run time lock it and add height and texture data from another array that holds only those two?



The vertices in your Vertex Buffer should have their texture coords already preset based on your texture atlas of terrain (grass, dirt, etc.). Your Y axis should also be preset to whatever the heightmap requires for that tile. Because you are using a vertex buffer, you would only want to use one VB for the map. If you are using a top-down view (tiles) then you should look into using a quadtree and Index Buffer patches to maximize the efficiency of what actually gets passed to the pipeline.

Remember the GPU uses virtually no overhead for discarding vertices; it's sending them to the pipeline that requires lots of time.

Hope I understand what you are trying to do.
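
A sketch of what "preset" could look like when filling the vertex data once at load time. Everything here is an assumption for illustration: the TileVertex layout, an atlas that is a single row of cells, a rectangular grid, screen y growing downwards, and one height value per tile corner.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TileVertex { float x, y, z, u, v; };

// Builds 4 vertices per tile (not shared with neighbours) so every tile can
// carry its own atlas coordinates. `types` holds one atlas cell index per tile,
// `heights` one height per tile corner, i.e. (rows + 1) * (cols + 1) values.
std::vector<TileVertex> buildTileVertices(const std::vector<int>& types,
                                          const std::vector<float>& heights,
                                          int cols, int rows,
                                          float tileSize, int atlasCols)
{
    assert(cols > 0 && rows > 0 && atlasCols > 0);
    assert(types.size() == static_cast<std::size_t>(cols) * rows);
    assert(heights.size() == static_cast<std::size_t>(cols + 1) * (rows + 1));
    const float cell = 1.0f / static_cast<float>(atlasCols); // atlas is one row of cells
    std::vector<TileVertex> out;
    out.reserve(types.size() * 4);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const float u0 = types[r * cols + c] * cell;
            for (int corner = 0; corner < 4; ++corner) {
                const int dx = corner % 2;   // 0 = left, 1 = right
                const int dy = corner / 2;   // 0 = top,  1 = bottom
                TileVertex v;
                v.x = (c + dx) * tileSize;
                // Height shifts the corner up the screen to fake a hill.
                v.y = (r + dy) * tileSize - heights[(r + dy) * (cols + 1) + (c + dx)];
                v.z = 0.0f;
                v.u = u0 + dx * cell;
                v.v = static_cast<float>(dy);
                out.push_back(v);
            }
        }
    }
    return out;
}
```

Because heights are stored per corner, neighbouring tiles get the same position at a shared corner while still carrying different texture coordinates.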

jollyjeffers    1570
Quote:
Original post by KrazeIke
If you can divide your map into sections and somehow cull large groups of vertices at a time, you won't have to pass as much data to the GPU, which is good.

As KrazeIke says, it's all about culling large groups of vertices.

There's an emphasis on eliminating groups that are as large as you can, since you can (in theory) remove 1000's of triangles with one "cull". The potentially more precise method of checking every single triangle for culling goes full circle and gives you a nasty fine-grained loop again [smile].

You might well want to have a look into the "Quadtree" algorithm; it's quite well suited to efficiently culling large, regular, tile-like grids. This article from this very site is a pretty good introduction to the topic.
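
As a rough illustration of the idea (not the code from the linked article), here is a minimal implicit quadtree over a tile grid. It assumes rectangles are measured in whole tiles and that a node becomes a leaf once it is no larger than leafSize tiles on each side; the names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Half-open rectangle in tile units: [x0, x1) x [y0, y1).
struct TileRect { int x0, y0, x1, y1; };

inline bool overlaps(const TileRect& a, const TileRect& b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

// Recursively splits `node` into four quadrants and appends every leaf that
// overlaps `view`. A whole subtree is rejected with a single test.
void collectVisible(const TileRect& node, const TileRect& view,
                    int leafSize, std::vector<TileRect>& out)
{
    assert(leafSize > 0);
    if (node.x0 >= node.x1 || node.y0 >= node.y1) return; // empty quadrant
    if (!overlaps(node, view)) return;                    // culls everything below
    const int w = node.x1 - node.x0;
    const int h = node.y1 - node.y0;
    if (w <= leafSize && h <= leafSize) {                 // small enough: a leaf patch
        out.push_back(node);
        return;
    }
    const int mx = node.x0 + (w + 1) / 2;
    const int my = node.y0 + (h + 1) / 2;
    const TileRect kids[4] = {
        { node.x0, node.y0, mx, my }, { mx, node.y0, node.x1, my },
        { node.x0, my, mx, node.y1 }, { mx, my, node.x1, node.y1 }
    };
    for (int i = 0; i < 4; ++i) collectVisible(kids[i], view, leafSize, out);
}
```

Each leaf returned would correspond to one patch of tiles to draw. A view that sits entirely inside one quadrant rejects the other three quadrants, and everything under them, with three rectangle tests.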

hth
Jack

Armadon    1091
blue-ice2002, you might want to consider using dynamic vertex buffers for your problem, or buffering your data and then drawing the data/buffer as needed.

Codemonger    217
Quote:
Original post by blue-ice2002
Hello,

my tilemap is this.

It's 200*200 quads.
I want to render and scroll it efficiently, but as you said I should draw only the ones that are on screen, not the whole map.
But how? The quadtree is a good example, thanks, but mine has no camera and no z component.
It's top-down and uses orthogonal projection.


Like I said, the best thing for you to do is to use one Vertex Buffer. Don't use a dynamic vertex buffer, because the vertex information does not and should not change. Now, using the quadtree algorithm, you can divide the vertex buffer into smaller chunks (the smallest being the size of the screen). These smallest patches will be pre-computed Index Buffers, and the quadtree algorithm will tell you which index buffer patches to render. So you need:

1) One giant Vertex Buffer (for terrain - tiles)

2) One giant Texture - with all types of terrain in small patches

3) Quadtree algorithm breaking down into smallest size (# verts on screen)

4) Index Buffers making up small chunks on screen

With this, only 4 possible Index Buffer chunks can be rendered on the screen at once, using multiple patches of terrain texture. Hope this helps.
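
The "only 4 chunks" claim can be checked with a little arithmetic. This is a sketch assuming every patch is exactly one screen in size, patches are numbered row by row, and the scroll position is in pixels; the function and parameter names are invented.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the indices (row-major) of the screen-sized patches that the
// screen rectangle at (scrollX, scrollY) touches.
std::vector<int> visiblePatches(int scrollX, int scrollY,
                                int screenW, int screenH,
                                int patchCols, int patchRows)
{
    assert(screenW > 0 && screenH > 0 && patchCols > 0 && patchRows > 0);
    assert(scrollX >= 0 && scrollY >= 0);
    // Patches are exactly one screen in size, so the screen can straddle
    // at most two patch columns and two patch rows.
    const int c0 = std::min(scrollX / screenW, patchCols - 1);
    const int r0 = std::min(scrollY / screenH, patchRows - 1);
    const int c1 = std::min((scrollX + screenW - 1) / screenW, patchCols - 1);
    const int r1 = std::min((scrollY + screenH - 1) / screenH, patchRows - 1);
    std::vector<int> ids;
    for (int r = r0; r <= r1; ++r)
        for (int c = c0; c <= c1; ++c)
            ids.push_back(r * patchCols + c);
    return ids;
}
```

For each returned id, the render loop would bind that patch's index buffer with SetIndices and issue one DrawIndexedPrimitive call, so at most 4 calls per frame for the terrain. On a regular grid like this the patch lookup can be done directly; a quadtree gives the same answer and generalizes to less regular layouts.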

blue-ice2002    122


Thanks, that helped some. I mean, I understood what you are getting at,
but of course I still have something on my mind. That is:

if I use an INDEX buffer and reference the vertices by their numbers,
we run into a problem with texturing. It is this:

<><><><><><><>
<><><><><><><>
We have the tiles; for example, we are at tile 1.
We matched the indices with the vertices in the correct order.
But in the texture, the last vertex of the first tile is also the first vertex of the second tile, so how will I match the texture to each quad?

Codemonger    217
Well, number one, you should not use triangle strips... If you use straight triangles then the GPU will cache them much, much better; caching vertices is half the battle for speed. Triangle strips are used to conserve the bandwidth of sending vertices down the pipeline, and I don't think modern GPUs focus on triangle strips as much as they used to. So change your plans to work with plain old triangles. This then means the edges of Index Buffers can point to overlapped triangles without a problem, so your texture coords can be somewhat independent if needed. I wouldn't worry about triangle count too much. When people talk about culling they are actually talking about gross culling, but we don't care, because you are working with a lump of vertices that don't change: one Vertex Buffer call. Occlusion is the real performance enhancer, because it reduces overdraw, not just sending stuff down the pipeline, but we don't have to worry about that either.

Anyway, I wouldn't worry about the # of triangles you send; I would focus on getting those triangles that are in video memory to the screen as fast as possible. Using Index Buffers works well with the GPU cache and, like I said, because of the style of camera view etc. you limit yourself to sending 4 index buffers max, which is excellent. Using overlapping vertices, you can get away with independent texture splattering without overhead. And because you are re-using Index Buffer patches (scrolling at a steady pace) the GPU will love you.
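
Here is one way the pre-computed index patches with overlapping vertices could be generated. It is only a sketch: it assumes the vertex buffer holds 4 vertices per tile, stored row by row in the order top-left, top-right, bottom-left, bottom-right, and the function name is hypothetical. Since no vertex is shared between tiles, each tile keeps its own texture coordinates, which sidesteps the shared-vertex texturing question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds triangle-list indices for the tiles in columns [c0, c1) and rows
// [r0, r1), assuming 4 vertices per tile stored row by row in the order
// top-left, top-right, bottom-left, bottom-right.
std::vector<std::uint32_t> buildPatchIndices(int mapCols, int c0, int r0,
                                             int c1, int r1)
{
    assert(mapCols > 0 && c0 >= 0 && r0 >= 0);
    assert(c0 <= c1 && c1 <= mapCols && r0 <= r1);
    std::vector<std::uint32_t> idx;
    idx.reserve(static_cast<std::size_t>(c1 - c0) * (r1 - r0) * 6);
    for (int r = r0; r < r1; ++r) {
        for (int c = c0; c < c1; ++c) {
            const std::uint32_t base =
                static_cast<std::uint32_t>(r * mapCols + c) * 4u;
            // Two triangles per tile with a consistent winding.
            const std::uint32_t quad[6] = { base, base + 1, base + 2,
                                            base + 2, base + 1, base + 3 };
            idx.insert(idx.end(), quad, quad + 6);
        }
    }
    return idx;
}
```

A 200x200 map at 4 vertices per tile has 160000 vertices, which does not fit in 16-bit indices, so this sketch uses 32-bit values (D3DFMT_INDEX32). Using 16-bit indices together with a per-patch BaseVertexIndex would be an alternative.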

Hope that helps. These are just my quick thoughts anyways.

elpool    122
Of course you could always use dynamic vertex buffers, which would be slower, but depending on your game it might not matter. I'm using a dynamic vertex buffer for a similar top-down, tile-based game, and I'm pushing 800 tiles at about 600 fps on 2-year-old hardware. For my simple game, the inefficiency of dynamic vertex buffers doesn't become a problem. This allows me to use more than one texture on the map: the tiles are sorted by texture before going into the buffer, and then I use one DrawPrimitive call for each texture. It also allows me to dynamically change the map quite easily.

If you're having framerate problems though, dynamic vertex buffers could be out of the question. Just something to consider.
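
The sort-by-texture step could look something like the following. This is a guess at the general shape, not elpool's actual code: the Tile and Batch structs are invented, and it assumes each tile is written into the dynamic buffer as 6 vertices in sorted order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Tile { int textureId; int col; int row; };
struct Batch { int textureId; unsigned startVertex; unsigned primitiveCount; };

inline bool byTexture(const Tile& a, const Tile& b) { return a.textureId < b.textureId; }

// Sorts the visible tiles by texture and returns one batch per texture,
// assuming each tile is then written to the dynamic buffer as 6 vertices
// in the sorted order.
std::vector<Batch> batchByTexture(std::vector<Tile>& tiles)
{
    std::stable_sort(tiles.begin(), tiles.end(), byTexture);
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        if (batches.empty() || batches.back().textureId != tiles[i].textureId) {
            Batch b;
            b.textureId = tiles[i].textureId;
            b.startVertex = static_cast<unsigned>(i * 6);
            b.primitiveCount = 0;
            batches.push_back(b);
        }
        batches.back().primitiveCount += 2;   // two triangles per tile
    }
    assert(batches.size() <= tiles.size());
    return batches;
}
```

Each batch then maps to one SetTexture plus one DrawPrimitive(D3DPT_TRIANGLELIST, startVertex, primitiveCount), so the number of draw calls equals the number of distinct textures on screen.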

blue-ice2002    122


Hello, thanks for the explanations about frame rate. Maybe this could be the reason for the speed issue.
I mean, in the main game loop
I do this:

gamemain()
{
drawprimitive(20000)
}
Should I give it a rate limit? I mean, instead of calling draw on every iteration, should I set the frame rate to a constant limit, for example
30 frames per second, which means calling draw 30 times a second, because for a human being that's enough. If I don't limit this, it may call draw millions of times, and there is no need for that much computation every millisecond?
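
If a fixed rate limit is wanted, one simple approach is to sleep away whatever is left of each frame's time slot. The sketch below uses standard C++ timing and hypothetical function names; it is not tied to Direct3D. In Direct3D 9, setting PresentationInterval to D3DPRESENT_INTERVAL_ONE in D3DPRESENT_PARAMETERS caps presentation to the monitor refresh, which may already be enough.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Time left in the current frame slot, or zero if the frame already overran.
std::chrono::microseconds remainingFrameTime(std::chrono::microseconds elapsed,
                                             int targetFps)
{
    assert(targetFps > 0);
    const std::chrono::microseconds slot(1000000 / targetFps);
    return elapsed < slot ? slot - elapsed : std::chrono::microseconds(0);
}

// Calls renderFrame(), then sleeps away the rest of the frame slot.
template <typename RenderFn>
void runOneLimitedFrame(RenderFn renderFrame, int targetFps)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    renderFrame();
    const std::chrono::microseconds elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start);
    std::this_thread::sleep_for(remainingFrameTime(elapsed, targetFps));
}
```

The main loop would call runOneLimitedFrame once per iteration with the existing draw code as renderFrame. Sleep granularity depends on the operating system, so the actual rate will only be approximately the target.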

DrunkenHyena    805
Quote:
Original post by blue-ice2002
call draw 30 times a second, because for a human being that's enough.


That depends on the game and the human being.

Quote:

If I don't limit this, it may call draw millions of times, and there is no need for that much computation every millisecond?

Unless it's causing a problem, well...what's the problem?
