[MDX-C#] Vertices transformation (from Object to World space) slow

12 comments, last by sirob 17 years, 8 months ago
Hello, I'm implementing a dynamic vertex buffering system for my engine. It seems that for batching purposes I will have to transform all the vertices from object to world space (they will all use the same vertex buffer). Up to there, everything is fine. I keep the original positions of every scene object in one array and apply the modification to another one:

// Copy the original value to the temp vertices array
_verts.CopyTo(_vertsTemp, 0);

// For every vertex in the array, apply the transformation (multiply by the object's world matrix)
for (i = 0; i < _vertsTemp.Length; i++)
{
  _vertsTemp[i].UpdatePosition(Vector3.Transform(_vertsTemp[i].Position3, _ObjectWorldMatrix));
} 


My problem is that the Vector3.Transform(MyVertice, MyMatrix) function is really slow! Running the application with nothing else to do, my computer outputs 3000+ frames/sec. If this function is called 1000+ times, my framerate drops to 1000 or less! (Nothing else is done, only the recomputation of the vertices!) Is there a better way to do these transformations? Thank you!
IDirect3DDevice9::ProcessVertices() might well be quicker than the D3DX based method (which should have a faster array-based overload).

But ultimately your big problem is that you're doing CPU-based transformation. Modern GPUs are absolute monsters when it comes to vector/matrix mathematics and will easily run circles around even the best CPUs. You really should try to structure your code/engine so as to take advantage of the GPU wherever possible.

Why are you having to transform everything? Give us a bit more of an explanation as to what you're trying to achieve and we might be able to suggest a more GPU-friendly method [smile]

hth
Jack


First off, FPS is a very bad way to measure speed. Framerate vs Frametime covers that subject.

Now, 1000 means 1ms per frame. 3000 is 0.333ms per frame. That doesn't make Vector3.Transform slow.
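To put rough numbers on that, here is a minimal sketch of the arithmetic. The class and method names are made up for illustration, and the inputs are just the round figures from the original post (3000 FPS before, 1000 FPS after, about 1000 calls), so treat the result as an order-of-magnitude estimate only.

```csharp
using System;

// Hypothetical helper, not part of MDX: converts frame rates to frame times
// so that the cost of extra per-frame work can be compared directly.
public static class FrameTiming
{
    // Milliseconds spent on one frame at the given frame rate.
    public static double FrameTimeMs(double fps)
    {
        return 1000.0 / fps;
    }

    // Extra time per call, in microseconds, assuming the whole slowdown
    // between the two frame rates is caused by 'calls' identical calls.
    public static double CostPerCallMicroseconds(double fpsBefore, double fpsAfter, int calls)
    {
        double addedMs = FrameTimeMs(fpsAfter) - FrameTimeMs(fpsBefore);
        return addedMs * 1000.0 / calls;
    }
}
```

With those inputs, CostPerCallMicroseconds(3000, 1000, 1000) works out to roughly 0.67 microseconds per call (copying the array included), which is why the drop looks dramatic in FPS but is small in frame time.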

Also, unmanaged DX has a TransformArray method, which takes an array of many vectors and transforms those, but has its own limitations. MDX probably has these somewhere, not sure where it is in MDX1.1.

Lastly, why do you want to do the transformation on the CPU? Of all solutions, this is likely to be the slowest. Unless you have many, many objects with very low poly counts, even a SetTransform + DrawPrim would likely work faster. If you really need good performance drawing many objects, look into instancing. CPU transformation seems like the worst possibility here.

Hope this helps.
[EDIT] Slowpoke.

[Edited by - sirob on August 11, 2006 11:48:43 AM]
Sirob Yes.» - status: Work-O-Rama.
I can only agree with the above... But if you really want to do this on the CPU I'd suggest using

Vector3.TransformCoordinate(Vector3[] source, Matrix transform)
(see also DirectX Documentation for Managed Languages: Vector3)

This is still a hell of a lot faster than transforming vertex by vertex.
--I love deadlines. I like the whooshing sound they make as they fly by. (Douglas Adams)
Hello !

Thank you for your answers!

First, I'm not that far along with my engine.
I'm trying to make it as dynamic as possible. For the moment I'm trying to put together a 3D card game (a card game like Magic: The Gathering, for those who know it).

The game table with the cards on it will be stored inside a static buffer (cards don't change that often there).

The dynamic part will be used for the hands and the cards held by the player. A slow movement should be perceptible there, as well as cards moving, the action of putting a card on the play table, ...

Basically, my engine currently does:

Static buffering: works very nicely, with sorting, batching, ...

Dynamic buffering: the aim of the dynamic buffer is to batch as many cards together as possible (with a different world matrix per card object) to render them at the same time.
My question is quite simple: how would you render, let's say, a hundred cards at the same time, but with a different world matrix for each of them (and changing frequently: movement of the hand, ...)? My solution is to fill a dynamic vertex buffer, but to be able to render it with as few draw calls as possible, the vertices must be in view space (that's why I'm doing the world -> view transform before I fill the vertex buffer).

As for hardware instancing, does it work with ATI cards? I thought it was only for the latest video cards, and NVIDIA only?
Another thing:

The array of points whose coordinates I have to change/refresh every frame is in fact an array of VertexFormat (which basically contains a Vector3, the texture mapping u and v coordinates, ...).

I keep a table of these, and not only of Vector3.
That makes it impossible to run TransformCoordinate on all the Vector3 points at the same time ...
<quote>
As for hardware instancing, does it work with ATI cards? I thought it was only for the latest video cards, and NVIDIA only?
</quote>


Well, I think you'll have to look this up at ATI. But as far as I know, instancing has been supported by NVIDIA since the 6800 model, which is pretty old.
I don't think that ATI is so far behind concerning that feature.
Alas, I'm not so well acquainted with using instancing in an application.

But: I'd say for your purpose it would be just fine to transform your dynamic data on the GPU/in your shader. Batching is nice, and that it works quite well for your static geometry is even better, but in my opinion the overhead of locking, transforming etc. is not worth it for your dynamic data/purpose.



--I love deadlines. I like the whooshing sound they make as they fly by. (Douglas Adams)
Hello !

So clearly for Dynamic processing I should go :

foreach DynamicObject
{
    Device.SetWorldMatrix = CurrentObject.WorldMatrix
    Draw
}
=> No batching possible here

Instead of:

foreach DynamicObject
{
    UpdateCurrentObjectVector3 with CurrentObject.WorldMatrix
}
draw batched objects
=> Batched draws
I'd like to offer a couple suggestions for possible ways of doing this [smile]:

- You could always just forget about batching. Depending on how many actual cards you need to show, it's possible that just calling SetTransform + DrawPrim for each one might work out well enough to be worth avoiding the hassle of coding something more difficult. This also has the benefit of being simple, which means it has fewer places to break on specific hardware/systems.

- Assuming the "cards" are just flat rectangles, there's always ID3DXSprite with the OBJECTSPACE flag. This would work quite fast, and be very simple to do. On the other hand, there's little to be learnt from using a pre-made interface, so if you're interested in this as a learning experience, this might not be the way to go.

- Using CPU processing is also a valid possibility for this. That would actually be quite similar to the way ID3DXSprite does things. Frankly, I feel this would be pretty difficult to get working exactly right. The transformation is bound to be a bit slow, though there are a couple of ways you can speed it up*.

- Lastly, there's the option of using shaders. This would include instancing (either pure hardware, or shader constant based). If you're interested in either of these, have a look at the Instancing sample in the SDK Sample Browser, which features both methods.
Keep in mind, however, that both methods require rather newish cards (SM3 for pure hardware, ~SM1.4/2 for shader constant), and are quite complex, which means they might break under different setups, or in different cases.
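For the shader constant flavour, the rough idea as I understand it is: the vertex buffer holds several copies of the card mesh, each vertex carries the index of its copy, and the vertex shader picks that copy's world matrix out of an array of shader constants. The number of matrices per draw call is limited by the available constant registers, so the cards have to be drawn in groups. Below is a minimal sketch of just the CPU-side grouping. SplitIntoBatches and maxPerBatch are hypothetical names, the real limit depends on your shader, and System.Numerics.Matrix4x4 is only a stand-in for the MDX Matrix type so the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper: groups per-card world matrices into chunks small enough
// to be uploaded as one array of shader constants and drawn with one call.
public static class ConstantInstancing
{
    public static List<Matrix4x4[]> SplitIntoBatches(IReadOnlyList<Matrix4x4> worlds, int maxPerBatch)
    {
        if (maxPerBatch <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxPerBatch));

        var batches = new List<Matrix4x4[]>();
        for (int start = 0; start < worlds.Count; start += maxPerBatch)
        {
            // The last batch may be smaller than maxPerBatch.
            int count = Math.Min(maxPerBatch, worlds.Count - start);
            var batch = new Matrix4x4[count];
            for (int i = 0; i < count; i++)
                batch[i] = worlds[start + i];
            batches.Add(batch);
        }
        return batches;
    }
}
```

Per frame you would then loop over the batches, upload each matrix array to the shader, and issue one draw call covering that many copies of the card mesh. With 100 cards and a budget of 16 matrices per draw, that is 7 draw calls instead of 100.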

What I'd recommend you do is use ID3DXSprite, if you're not interested in actually writing this yourself, or if you are, use an Array of Vector3s as a temporary buffer for the positions when you transform them.

Hope this helps.

* Namely, using D3DXVec3TransformArray would do quite a bit of good, but unfortunately, the MDX equivalent does not have a stride parameter, and thus can only use an array of Vector3s, and not any custom struct. I can't find any substitute for that, which is a bit weird.
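A minimal sketch of that temporary-buffer idea: copy the positions out of the vertex structs into a plain Vector3 array, transform that array, then write the results into the batched vertex array. CardVertex, CpuBatcher and TransformInto are hypothetical names, and System.Numerics stands in for the MDX types so the snippet compiles on its own. In MDX the middle loop is where the array overload mentioned earlier in the thread, Vector3.TransformCoordinate(Vector3[] source, Matrix transform), would go.

```csharp
using System.Numerics;

// Hypothetical vertex layout: a position plus texture coordinates.
public struct CardVertex
{
    public Vector3 Position;
    public float U;
    public float V;
}

public static class CpuBatcher
{
    // Transforms one object's vertices by its world matrix and writes them into
    // the shared batch array starting at destinationOffset.
    // 'scratch' is a reusable buffer with at least source.Length elements.
    public static void TransformInto(CardVertex[] source, CardVertex[] destination,
                                     int destinationOffset, Matrix4x4 world, Vector3[] scratch)
    {
        // Gather: pull the positions out of the interleaved vertex structs.
        for (int i = 0; i < source.Length; i++)
            scratch[i] = source[i].Position;

        // Transform: stand-in for a single array-based call.
        for (int i = 0; i < source.Length; i++)
            scratch[i] = Vector3.Transform(scratch[i], world);

        // Scatter: copy each vertex and overwrite only its position.
        for (int i = 0; i < source.Length; i++)
        {
            CardVertex v = source[i];
            v.Position = scratch[i];
            destination[destinationOffset + i] = v;
        }
    }
}
```

The source array is never modified, so the original object-space positions stay available for the next frame, and the scratch buffer can be allocated once and reused for every card.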

[Edited by - sirob on August 11, 2006 9:03:04 AM]
Sirob Yes.» - status: Work-O-Rama.
Well, I don't have much to add, except that I (for the sake of gaining experience - and it's still quite simple) would tend to use shader-constant based instancing.
You can pm me if you like and I'll send you a small vertex/pixel shader that uses some phong-like lighting -and some c# sample code showing you how to use it- via email.

Martin
--I love deadlines. I like the whooshing sound they make as they fly by. (Douglas Adams)

This topic is closed to new replies.