Primitive Performance

4 comments, last by MJP 12 years, 2 months ago
Hey everyone,
I'm quite new to DirectX programming and have read quite a lot of tutorials.
They mostly left me with this question, though:

I think I understand the procedure for rendering ONE triangle:

Create the triangle vertices on init and store them in a field of my DirectX manager class
On each Draw call, do
- Map the vertex buffer
- Put the vertices in
- Unmap
- Set the world transform for this triangle
- Present
- Start over

But then I get tired of this single triangle and want to render an arbitrary number of them.
To do so, I assume that instead of one field, I'd use a list of a custom class CTriangle, which holds the position in the world, a simple triangle vertex array, and the world rotation.
To draw those, I'd loop over this list and do the aforementioned steps for each CTriangle.

Needless to say, this works for a list size of 10 to 100, but if I want to render 10,000 triangles, it becomes very laggy.
So the question is, how should I approach this in a better way?
How should I render a long list of custom-class objects?
Instead of 1 draw per triangle, put many triangles into a static vertex buffer. Then you can draw all of those triangles with one draw call, and you can still translate/rotate/scale all of those triangles as one object by using a world matrix. This will be much faster and also easier to manage.
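As a rough illustration of the batching idea above, here is a minimal CPU-side sketch in plain C++ with no Direct3D calls. The `Vertex` and `Triangle` structs and the `buildBatch` function are hypothetical names chosen for this example. It only shows how the triangles could be flattened into one contiguous array, which would then be uploaded once to a static vertex buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position only.
struct Vertex { float x, y, z; };

// Hypothetical triangle: three corners.
struct Triangle { Vertex v[3]; };

// Flatten every triangle into one contiguous vertex array.
// This array is what would be copied into a single static vertex buffer at init time.
std::vector<Vertex> buildBatch(const std::vector<Triangle>& triangles) {
    std::vector<Vertex> batch;
    batch.reserve(triangles.size() * 3);
    for (const Triangle& t : triangles) {
        batch.insert(batch.end(), t.v, t.v + 3);
    }
    // Sanity check: three vertices per triangle.
    assert(batch.size() == triangles.size() * 3);
    return batch;
}
```

With the data laid out this way, the whole list can be submitted with one draw call whose vertex count is `batch.size()` (in Direct3D 11, for instance, a single `Draw` on the device context), instead of one call per triangle.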
Thank you very much,
but that's not quite what I intended to do.
I suppose there should also be a way to rotate and translate each triangle individually.
Does anyone know some more suitable approaches to this requirement?

Over night I came up with a different idea:
What about loading the vertex data into the buffer once on init, and then storing the offset and length of the vertex buffer data with my objects, instead of loading them in each draw call?
Would that make my program noticeably faster, if I only needed to apply a world matrix for each triangle per draw call, without loading the triangle into the buffer each time?
You still really want to use a small number of draw calls. You might want to investigate the techniques used for particle system rendering as that is effectively what you have. I'd suggest one of:

1. Use a single draw call by doing all the transforms on the CPU instead of the vertex shader. This is probably the simplest option.
2. Set the triangle data up once, and use a second vertex data stream to store the matrices.
3. If possible compute the matrices on the GPU, with either the Vertex Shader, Geometry Shader, or DirectCompute.
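Option 1 from the list above might look roughly like the following sketch. It is plain C++ with no Direct3D calls, and it makes several assumptions: the `CTriangle` layout is a hypothetical stand-in for the poster's class, the rotation is reduced to a single angle about the Z axis to keep the math short, and `dst` stands in for the pointer you would get back from mapping a dynamic vertex buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical stand-in for the poster's CTriangle.
struct CTriangle {
    Vertex local[3];   // corners in the triangle's own space
    float angleZ;      // rotation about the Z axis, in radians (simplifying assumption)
    Vertex position;   // world-space translation
};

// Transform every triangle on the CPU and write the world-space vertices
// into one destination buffer. 'dst' stands in for mapped vertex buffer memory.
void fillDynamicBuffer(const std::vector<CTriangle>& tris,
                       Vertex* dst, std::size_t capacity) {
    assert(capacity >= tris.size() * 3);
    for (const CTriangle& t : tris) {
        const float c = std::cos(t.angleZ);
        const float s = std::sin(t.angleZ);
        for (const Vertex& v : t.local) {
            // Rotate about Z, then translate.
            dst->x = c * v.x - s * v.y + t.position.x;
            dst->y = s * v.x + c * v.y + t.position.y;
            dst->z = v.z + t.position.z;
            ++dst;
        }
    }
}
```

After this runs once per frame, all triangles sit in one buffer in world space, so a single draw call with an identity world matrix would cover the whole list.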
I'm going to try option one as soon as I get home.
I hadn't thought about this option so far.

Correct me if I am wrong, but it seems to me that option 2) would not be practical in my case either, since I want to update each matrix on every update pass? So I would go through the list of CTriangles, which store some yaw, pitch and roll data, generate a new matrix for each, and put it into the second vertex buffer.
But then what? I have not seen matrices stored in vertex buffers before.
Why would I do this? Wouldn't that mean I have to get data back from my vertex buffer, which I have also not seen before?
A matrix per triangle is pretty silly. Think about it: you're using at least 12 floats so that you can calculate the value of 9 floats. You might as well just transform the vertex positions directly and set all of the vertex positions into a dynamic vertex buffer so that you can draw them with a single draw call.
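To make the 12-versus-9 arithmetic concrete, here is a small sketch. The struct names are hypothetical, and it assumes the smallest useful matrix is a 3x4 (rotation plus translation) and that a vertex carries only a position.

```cpp
#include <cassert>
#include <cstddef>

struct Float3 { float x, y, z; };

// Smallest matrix that can hold a rotation and a translation: 3 rows of 4 floats.
struct Matrix3x4 { float m[3][4]; };

// The data that matrix would be used to produce: three transformed positions.
struct TrianglePositions { Float3 corner[3]; };

constexpr std::size_t floatsIn(std::size_t bytes) { return bytes / sizeof(float); }

// Floats uploaded per frame if one matrix is sent per triangle.
std::size_t matrixFloatsPerFrame(std::size_t triangleCount) {
    return triangleCount * floatsIn(sizeof(Matrix3x4));
}

// Floats uploaded per frame if the already transformed positions are sent instead.
std::size_t positionFloatsPerFrame(std::size_t triangleCount) {
    return triangleCount * floatsIn(sizeof(TrianglePositions));
}
```

Under these assumptions, 10,000 triangles cost 120,000 floats per frame as matrices but only 90,000 floats as pre-transformed positions.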

Either way, a draw call per triangle is horribly inefficient and will never scale to anything reasonable. You need to batch your triangles into draw calls containing hundreds or thousands of primitives.

This topic is closed to new replies.
