Endemoniada

DrawPrimitive() Dilemma

Endemoniada    430
Hi guys,

I'm rendering 128 objects made up of 124 triangles each. Should I be calling DrawPrimitive() for each object? The only alternative I can come up with is to call LockBuffer() and fill the buffer, doing all the transformations (position and normal) on the CPU, followed by a single call to DrawPrimitive().

I'm locking and filling the buffer at about 800 Hz, which isn't bad, but I have an i7 860 and I plan on doing more processing, like sound and particles, so it might be slow on a not-so-fast computer.

Maybe there is another way?

Thanks.

rouncer    294
As the PCI Express bus gets faster you can afford more draw calls and more memory throughput (more write than read), something like ten thousand calls. I'd have to try it out, but it might be able to handle it.

Krypt0n    4721
You can stuff all the objects into one buffer holding 15872 triangles (128 × 124). Add a per-vertex index that identifies the object (an ID, maybe stored in the alpha channel of the color, if you provide one in your vertex stream).

Then you set your 128 transformations as constants and draw all of it in one go, indexing into the constants area in the vertex shader with the per-vertex ID.

- I think that should be the fastest way.

- Make sure you can fit enough constants in your vertex shader/vertex program version; otherwise you'd need to split it into several draws.
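
The buffer-building step could be sketched roughly as follows in plain C++. The `Vertex` and `Mesh` types and the `mergeMeshes` name are hypothetical, not from any API, and the sketch assumes at most 256 objects so the ID fits in one byte (such as a color's alpha channel).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: position, normal, and the ID of the owning object.
struct Vertex {
    float position[3];
    float normal[3];
    std::uint8_t objectId;  // could instead be packed into the color's alpha channel
};

// Hypothetical source mesh: object-space vertices, three per triangle.
struct Mesh {
    std::vector<Vertex> vertices;
};

// Concatenate all meshes into one vertex array, tagging every vertex with the
// index of the object it came from. A vertex shader would use that index to
// pick the object's transform out of the constant registers.
std::vector<Vertex> mergeMeshes(const std::vector<Mesh>& meshes) {
    assert(meshes.size() <= 256);  // the ID must fit in one byte

    std::size_t total = 0;
    for (const Mesh& m : meshes) total += m.vertices.size();

    std::vector<Vertex> merged;
    merged.reserve(total);
    for (std::size_t id = 0; id < meshes.size(); ++id) {
        for (Vertex v : meshes[id].vertices) {  // copy, then stamp the ID
            v.objectId = static_cast<std::uint8_t>(id);
            merged.push_back(v);
        }
    }
    return merged;
}
```

The vertices stay in object space, so the merged array only needs to be uploaded once; per frame, only the transform constants change.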

mhagain    13430
[quote name='Krypt0n' timestamp='1316008767' post='4861549']
You can stuff all the objects into one buffer holding 15872 triangles (128 × 124). Add a per-vertex index that identifies the object (an ID, maybe stored in the alpha channel of the color, if you provide one in your vertex stream).

Then you set your 128 transformations as constants and draw all of it in one go, indexing into the constants area in the vertex shader with the per-vertex ID.

- I think that should be the fastest way.

- Make sure you can fit enough constants in your vertex shader/vertex program version; otherwise you'd need to split it into several draws.
[/quote]

I use this approach for particles and I've found it faster than both instancing and drawing directly. It also works on a wider range of hardware than instancing, which is a nice bonus. Like you said though, limited constants space is something to be dealt with.

Don't even attempt to index constants this way in a pixel shader, by the way.
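
To get a feel for the constants limit, here is a small back-of-the-envelope helper. It assumes each object's transform is stored as a 4x3 matrix (3 float4 registers) or a 4x4 matrix (4 registers). The `batchesNeeded` name and the figure of 16 registers set aside for other shader data are made up for illustration. As far as I recall, Direct3D 9 guarantees at least 96 vertex shader constant registers for vs_1_1 and 256 for vs_2_0 and vs_3_0; the actual limit is reported in `MaxVertexShaderConst` of `D3DCAPS9`, so check that rather than trusting these numbers.

```cpp
#include <cassert>

// Number of draw calls needed when each object occupies `registersPerObject`
// float4 constant registers and `availableRegisters` are free for transforms.
int batchesNeeded(int objectCount, int registersPerObject, int availableRegisters) {
    assert(objectCount >= 0 && registersPerObject > 0);
    assert(availableRegisters >= registersPerObject);
    const int objectsPerBatch = availableRegisters / registersPerObject;
    return (objectCount + objectsPerBatch - 1) / objectsPerBatch;  // round up
}
```

With 240 free registers (256 minus the assumed 16), 128 objects at 3 registers each fit 80 per draw, so `batchesNeeded(128, 3, 240)` gives 2 draws. With 4 registers each it is 3 draws, and with only 80 free registers (96 minus 16) and 3 registers each it is 5.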

Endemoniada    430
That's a really nice technique but I can only see it working with a fixed number of objects that doesn't change.

How do you deal with objects that no longer exist (say they got blown up by the player)?

Thanks again.

pcmaster    982
[quote name='Endemoniada' timestamp='1316154245' post='4862335']
That's a really nice technique but I can only see it working with a fixed number of objects that doesn't change.

How do you deal with objects that no longer exist (say they got blown up by the player)?

Thanks again.
[/quote]
You can always discard any geometry in a geometry shader, simply by not appending anything to the output stream(s). Obviously, though, it's always better to skip dead objects at the CPU level so they are never submitted in the first place :-)
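
One way the CPU-level route might look, as a sketch only: keep the merged vertex buffer static and rebuild a small index list that references just the objects still alive. `ObjectRange` and `buildLiveIndices` are hypothetical names, and the sketch assumes each object's vertices sit contiguously in the merged buffer as a triangle list.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping: where each object's vertices sit in the merged buffer.
struct ObjectRange {
    std::uint32_t firstVertex;
    std::uint32_t vertexCount;  // multiple of 3 for a triangle list
    bool alive;
};

// Build an index list that references only the vertices of live objects.
// Dead objects stay in the vertex buffer but are never drawn.
std::vector<std::uint32_t> buildLiveIndices(const std::vector<ObjectRange>& objects) {
    std::vector<std::uint32_t> indices;
    for (const ObjectRange& o : objects) {
        assert(o.vertexCount % 3 == 0);
        if (!o.alive) continue;  // skip destroyed objects entirely
        for (std::uint32_t i = 0; i < o.vertexCount; ++i) {
            indices.push_back(o.firstVertex + i);
        }
    }
    return indices;
}
```

The result would be copied into a dynamic index buffer and drawn with an indexed call such as DrawIndexedPrimitive(), and it only needs rebuilding when an object is created or destroyed. For 128 objects of 124 triangles there are 47616 vertices, so 16-bit indices would also be enough.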

Numsgil    501
An interesting rule of thumb, the source of which I forget:

25000 * CPU speed in GHz * fraction of frame time to spend on draw calls * time per frame in seconds = draw calls per frame

So on a 2 GHz machine running at 60 FPS and aiming at 40% of the time spent on graphics, you're looking at about 330 draw calls per frame (25000 * 2 * 0.4 / 60). Useful way to theorycraft how much simplification you'll need to be doing to meet your target framerate.

Not sure where the magic number 25000 comes from. I'm sure it's batches per second for a given card, but I can never seem to find numbers on that anywhere.
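
The rule of thumb is easy to wrap in a helper for trying out different budgets. This is just the formula above written as code; the function name and the default of 25000 batches per second per GHz are taken as assumptions, not measured values.

```cpp
#include <cassert>

// Estimated draw calls per frame from the rule of thumb:
// batches/s per GHz * CPU GHz * fraction of the frame spent on draw calls / FPS.
double drawCallsPerFrame(double cpuGHz, double frameFraction, double framesPerSecond,
                         double batchesPerSecondPerGHz = 25000.0) {
    assert(framesPerSecond > 0.0);
    assert(frameFraction >= 0.0 && frameFraction <= 1.0);
    return batchesPerSecondPerGHz * cpuGHz * frameFraction / framesPerSecond;
}
```

For example, `drawCallsPerFrame(2.0, 0.4, 60.0)` works out to roughly 333, so by this estimate the 128 per-object draw calls from the original question would fit within that budget on such a machine.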

Numsgil    501
Found the answer: [url="http://origin-developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf"]http://origin-develo...hBatchBatch.pdf[/url].

Basically: the number of batches per frame is CPU-limited (the work is all done in the driver), so it is independent of the graphics card and its speed. From some empirical testing, 25K is a mid-to-low estimate of how many draw calls a 1 GHz processor can issue per second. The exact number depends on drivers and other factors.
