yckx

D3D10 Index buffer sizes

yckx    1298
[b]Edit:[/b] I meant vertex buffers. As is typical I posted this well past my bedtime, and mental wires got crossed.

I was looking at my rendering code, and looking at ways to improve it, and thought that if I filled one [s]index[/s] vertex buffer that was six times as large, I could use a single DrawIndexedInstanced call instead of looping through a fill-[s]index[/s]-vertex-buffer-then-draw loop six times. I figured fewer draw calls are preferable. But when I tested it, my framerate went down. Not much, but it's not what I expected. Then I remembered that in the Draw call you can index into the buffer, which made me realize that I was probably shoving too much data into one draw call, and should send it in batches.

But I haven't seen any guidelines on how much to send to the GPU in a single Draw call. Which makes me think it's completely dependent on scene and hardware, which means however I optimize it on my machine, it may all be for naught on other systems.

So, is there any guideline, or should I just find a batch size that works for me on my machine and be happy with it?
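
To make the batching idea concrete, here is a minimal sketch of splitting one large instance buffer into sub-ranges. The names `InstanceBatch` and `MakeBatches` are hypothetical, and `maxPerBatch` is a tuning value to be found by profiling, not a Direct3D limit. Each resulting range corresponds to the `InstanceCount` and `StartInstanceLocation` arguments of `ID3D10Device::DrawIndexedInstanced`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One sub-range of the instance buffer. The two fields line up with the
// InstanceCount and StartInstanceLocation arguments of DrawIndexedInstanced.
struct InstanceBatch {
    std::uint32_t startInstance;
    std::uint32_t instanceCount;
};

// Split totalInstances into consecutive batches of at most maxPerBatch.
// maxPerBatch is an assumption to be tuned by measurement.
std::vector<InstanceBatch> MakeBatches(std::uint32_t totalInstances,
                                       std::uint32_t maxPerBatch)
{
    assert(maxPerBatch > 0);
    std::vector<InstanceBatch> batches;
    std::uint32_t start = 0;
    while (start < totalInstances) {
        // The last batch may be smaller than maxPerBatch.
        const std::uint32_t count = std::min(maxPerBatch, totalInstances - start);
        batches.push_back({start, count});
        start += count;
    }
    return batches;
}
```

With six groups of 900 instances in one buffer, `MakeBatches(5400, 900)` yields six ranges, and raising `maxPerBatch` to 5400 collapses them into a single draw, so the same code can be used to measure both extremes.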

I've gone a little off subject--I just thought of batching as I began typing this. What I really wanted to ask was: When creating a buffer and specifying the size of the buffer in D3D10_BUFFER_DESC.ByteWidth, can a buffer be too big? Right now I'm creating one that's sizeof(D3DMATRIX) * 900, and if I put all the pellets into a single buffer I'm going to end up needing almost twice that amount. It feels like I may be approaching a limit past which performance will degrade, but I haven't come across any material actually saying that. It's just a feeling. So, is it kosher, or should I amend my ways?

Thanks.

e3d_ALiVE    209
There are two limitations on creating buffers (whether they are index, vertex, texture, or plain buffers):
[list=1][*]It MUST be aligned to 16 bytes; if it's not, creation will fail.[*]For D3D10 the buffer ByteWidth, or total size, must be less than 128*1024*1024 bytes, aka 128 MB.[/list]P.S. I've tested the idea of putting an entire mesh into one big buffer; it will fit in most cases.
You can improve performance by using a Texture2DArray, much like a texture atlas in D3D9; that will significantly reduce draw calls.
You will also send your world matrices much less often if you put them into a buffer and index into it from the shader, or you can feed them in directly through a vertex stream flagged as per-instance data.

You should do performance tests on your app, but in general reducing the complexity of shaders and reducing draw calls is a good idea.
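
As a quick sanity check of the two rules against the sizes from the question, here is a small sketch. The helper names (`RoundUpTo16`, `FitsLimits`) are hypothetical, and the limit is treated as strictly less than 128 MB, following the wording above rather than a verified reading of the runtime's behavior.

```cpp
#include <cassert>
#include <cstdint>

// Limit quoted in this thread for D3D10: 128 MB = 134,217,728 bytes.
constexpr std::uint64_t kMaxBufferBytes = 128ull * 1024 * 1024;

// A 4x4 float matrix (the same layout as D3DMATRIX) is 16 floats = 64 bytes.
constexpr std::uint64_t kMatrixBytes = 16 * sizeof(float);

// Round a byte count up to the next multiple of 16.
constexpr std::uint64_t RoundUpTo16(std::uint64_t bytes)
{
    return (bytes + 15) & ~std::uint64_t{15};
}

// True if byteWidth satisfies both rules as stated in the thread:
// a multiple of 16 and strictly below the 128 MB limit.
constexpr bool FitsLimits(std::uint64_t byteWidth)
{
    return byteWidth % 16 == 0 && byteWidth < kMaxBufferBytes;
}
```

For the numbers in the question, 900 matrices come to 57,600 bytes and 1,800 matrices to 115,200 bytes. Both are multiples of 16 and more than a thousand times smaller than the limit, so `FitsLimits` returns true for either size.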

yckx    1298
[quote name='e3d_ALiVE' timestamp='1302246496' post='4795860']
[list=1][*]It MUST be aligned to 16 bytes; if it's not, creation will fail.[*]For D3D10 the buffer ByteWidth, or total size, must be less than 128*1024*1024 bytes, aka 128 MB.[/list][/quote]
I knew about the first rule. I was unaware of the size limit. Thank you for that.

[quote]P.S. I've tested the idea of putting an entire mesh into one big buffer; it will fit in most cases.
You can improve performance by using a Texture2DArray, much like a texture atlas in D3D9; that will significantly reduce draw calls.
You will also send your world matrices much less often if you put them into a buffer and index into it from the shader, or you can feed them in directly through a vertex stream flagged as per-instance data.

You should do performance tests on your app, but in general reducing the complexity of shaders and reducing draw calls is a good idea.
[/quote]
Well, I'm only using one texture at the moment, but I'll keep that texture2darray tip in mind. I'm not sure I follow what you're saying with the world matrix. My view-projection matrix is in a cbuffer in the FX file, and is set with ID3D10EffectMatrixVariable::SetMatrix() per frame (or per shader per frame). Each object's world matrix I'm already sending in a separate vertex buffer as per-instance data, since typically I have several hundred of most objects to draw. (They're simple objects. I think the most complex that's instanced like this has twenty-four vertices.) Are you recommending a different approach?

e3d_ALiVE    209
The main performance problem is the draw call plus the state-setting calls that precede it:
->set matrix/texture
->draw
repeat
It might be much more efficient to do this:

Select a group of objects that have the same texture size.
Create a Texture2DArray from their textures.
Create a matrix array for them. (For D3D10 you can see a sample in the NVIDIA blend shapes example, or look on the forum; I posted the code a long time ago. For D3D11 at feature level 10 it's called StructuredBuffer; you only need read-only access, no need for a UAV. I think you're already doing something similar anyway.)
Somehow put them into a single vertex buffer and index buffer; this can be done with d3dxmeshconcatenate, I think.
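
The last step above, merging several meshes into one vertex buffer and one index buffer, can be sketched on the CPU side as follows. This is a hand-rolled illustration, not the D3DX routine mentioned in the post; `Vertex`, `Mesh`, `MeshRange`, `CombinedMesh`, and `Concatenate` are all hypothetical names. Indices are rebased while copying so that the whole combined buffer can be drawn in one call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical CPU-side mesh: vertices plus indices local to this mesh.
struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Where one source mesh ended up inside the combined buffers.
struct MeshRange {
    std::uint32_t startIndex;   // first index in the combined index buffer
    std::uint32_t indexCount;   // number of indices belonging to this mesh
    std::uint32_t firstVertex;  // position of its first vertex in the combined vertex buffer
};

struct CombinedMesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
    std::vector<MeshRange> ranges;
};

// Append every mesh to one vertex array and one index array.
CombinedMesh Concatenate(const std::vector<Mesh>& meshes)
{
    CombinedMesh out;
    for (const Mesh& m : meshes) {
        const std::uint32_t firstVertex = static_cast<std::uint32_t>(out.vertices.size());
        const std::uint32_t startIndex = static_cast<std::uint32_t>(out.indices.size());
        out.ranges.push_back({startIndex,
                              static_cast<std::uint32_t>(m.indices.size()),
                              firstVertex});
        out.vertices.insert(out.vertices.end(), m.vertices.begin(), m.vertices.end());
        // Rebase each local index so it points into the combined vertex array.
        for (std::uint32_t idx : m.indices) {
            out.indices.push_back(idx + firstVertex);
        }
    }
    return out;
}
```

The recorded `ranges` also make it possible to draw a single mesh out of the combined buffers later by passing its `startIndex` and `indexCount` to the draw call.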

Then in render:
Set the texture array.
Set the matrix array (you can set it as a shader buffer or as a second stream).
*Set an array of indices saying which meshes in that big array are visible (frustum culling). You can kill the rest in the geometry shader, with the possibility of integrating occlusion culling via an HZB.
Draw the batch.
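
The "array of indices of which mesh is visible" step can be illustrated with a CPU-side cull. Note that this swaps in a simpler technique than the geometry-shader culling mentioned above: a plain bounding-sphere versus plane test. The types `Plane` and `Sphere` and the functions `IsVisible` and `VisibleIndices` are hypothetical, and the planes are assumed to have unit-length normals pointing into the frustum.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Plane with an inward-facing unit normal: a point is inside when
// nx*x + ny*y + nz*z + d >= 0.
struct Plane { float nx, ny, nz, d; };

// Bounding sphere of one object.
struct Sphere { float x, y, z, radius; };

// A sphere is rejected only if it lies entirely behind at least one plane.
bool IsVisible(const Sphere& s, const std::vector<Plane>& frustum)
{
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) {
            return false;
        }
    }
    return true;
}

// Collect the indices of all objects whose bounds touch the frustum.
// The result is what would be uploaded as the visible-index array.
std::vector<std::uint32_t> VisibleIndices(const std::vector<Sphere>& bounds,
                                          const std::vector<Plane>& frustum)
{
    std::vector<std::uint32_t> visible;
    for (std::uint32_t i = 0; i < bounds.size(); ++i) {
        if (IsVisible(bounds[i], frustum)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The test is conservative: a sphere near a frustum corner can pass even though it is outside, which costs a little extra drawing but never drops a visible object.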

That method will significantly reduce draw calls for static geometry, since it can be preprocessed.
For dynamic geometry you're going to need ->UpdateSubresource or, better, ->Map with the discard flag, but it might throttle bandwidth.

Generally it's better to have fewer draw calls.

P.S. With that frustum killing you might restructure your code so that your VS is a pass-through and the GS does the actual VS work only when the object wasn't frustum culled, but it's up to you. If you don't have lots of polygons per frame, you don't need it.

yckx    1298
Thanks for the advice. I'll have to take another look at my render code when I get back home on Monday. I'm without my machine for the weekend.
