#### Archived

This topic is now archived and is closed to further replies.

# Very fast & dirty rotations?

This topic is 5249 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

I''m still strugling with the physics of my particle system. This time I want to make the particles rotate around a vector. I read a lot today about rotating points about arbitrary vectors but most explanations are eighter too mathematical for me to comprehend (mathworld, etcetera) or just give the general solution: convert to quaternion -> apply rotation -> convert to matrix3x3 -> apply to vector. That sounds so awefully slow. I need a really quick and dirty way because I can have up to 4096 particles per frame (and loads more when I get myself a real video card instead of a TNT2), each with a unique position and a unique rotation vector. In short: I''m looking for a function that takes a point, a (unit)vector and an angle and returns the rotated point. Accucary isn''t important, speed is. Any idea''s? Sander Maréchal

##### Share on other sites
i know this isnt the answer you want, but if you dont mind losing some accuracy and you want some speed, why don''t you group bunches of particles together at various positions and rotate all of them around an arbitary axis, and have like 32 of these groups, each containing 128 polygons. if you do it this way, you might be able to make up for some of the speed done with the calculations

##### Share on other sites
Don''t you need center of rotation too?

Anyhow, one way to do rotations is to calculate the cross product of angular velocity vector and the vector from the center of rotation to the point you''re trying to rotate. Then add the resultant vector to the point being rotated.

Hope that helps..

##### Share on other sites

// Somewhere Global or in a Class
int *sin_tab = new int[360];
int *cos_tab = new int[360];

// in the Init Function
for (int i = 0; i<360;i++)
sin_tab = (int)sin(i);
for (i = 0; i<360;i++)
cos_tab[i] = (int)cos(i);

Fast & Dirty...

##### Share on other sites
I''m not sure if you''ve already read this, but this has a good explanation of rotation about an aribrary axis using matrices.

##### Share on other sites
Thanks all.

@Tooot: That article should be renamed "rotation around an arbitrary axis that is parallel to one of the main axes.

@Jankey: I knew that trick already. Been using it since my BASIC days some ehhh... 14(?) years ago...

@bpj1138: That sounds more promising. I''ll take a look into angular velocities.

@Uthman: That would be a goodsolution if the stream of particles is pretty steady (and the general velocity somewhat the same). robjem is, I want the particle system attached to the payer (which moves a lot). So almost every particle is going to have a unique axis of rotation. I could group at most 8 or 16 together, but that would mean tracking and clustering 4096 particles in 256 - 512 groups. I think that the increased time of tracking and clustering all of these will offset the speed gain by doing multiple rotations at once.

Sander Maréchal

##### Share on other sites
@Sander: ok, no newbie.
Had the same problem as you a few months ago:

In short: I''m looking for a function that takes a point, a (unit)vector and an angle and returns the rotated point. Accucary isn''t important, speed is. Any idea''s?

Look for a Advanced Quaterion Lib, to Convert the Vector into a Quation, multyply both and then, Quaterion.RotateGL();

Dont Learn to Understand a Library, just use it

##### Share on other sites
You should have asked to Dr Maths

OK each particule has a unique axis of rotation.
But in the end how do you want to render such particules ? Quads that may appear flat in front of the viewer ? Or always as impostors ?

In anyway if you want to transform quickly. The problem is matrix rotation. That's slow and this denormalizes the matrices. So you have to use quaternions instead. W (Omega) = rotationnal speed as a quaternion. Now you can transform your quad vertices directly as P'=W*P, it's as fast as a matrix. But beware that precision errors after each rotation may scale your quads. Another slution is to maintain Q (Position) as a quaternion. Make a quat mul each time. then same thing but accelerated since you quad verts are box (+-r,+-r,0) corners. That's the best solution.

BTW it's nearly equivalent to the cross prod trick of bpj. ^ seems more seducing tho but beware that it scales your quads continuously.

            b            I  cross prod added (bpj)o____________a ob>oa necessarilly (Phytagoros). thus you need to rescale ob each time by a constant .... which gives the exact quat*point formula ....

There can be different solutions depending on if you want to maintain you quad vertices or recompute them on the fly by maintaining a position instead. Both solutions cost nearly the same ( around 16 floats all counted ).

Else maybe it could be worse thinking 2D and impostors, in the z planes of the impostors. Then simple cached trigonometry as someone mentionned here.

Well you know where we can continue this discussion in details.

[edited by - Charles B on September 3, 2003 2:42:08 PM]

##### Share on other sites
Woah.. This is getting interesting

@Jankey: Quaternions sound good, especially if I don''t neet to convert them to full matrices before I can apply them to my points. I assume that my "Quaterion.RotateGL();" you mean calling glRotatef() and let the video card handle it. If that is what you mean than it won''t be possible. (If you mean something else, please let me know.

First, it would use up 4096 glPushMatrix(), glRotaef(), glPopMatrix sequences (or 8192 glRotatef()''s if you invert the rotations). That''s going to be a major slowdown on the graphics card, which already suffers from the sheer amount of particles. Second, it would force me back into OpenGL Immediate mode, calling 4096 glBegin/glEnd pairs for every particle (or a 4096 glDrawArrays() call with only 4 elements in it). At the moment, I use regular arrays and only one call to glDrawArrays, which is much faster.

@Charles variation of BPJ: That looks very good. I indeed want to rotate only the particle position (one point), since I split the update function into two parts. One part particle physics update, one part buffer filling (where the position get exploded to 4 vertices). I might go with BPJ''s original though. I actually *want* my particles to scale. Now I build in a separate scaling factor to do this for me using randomized vectors perpendicular to the axis of rotation (which is slow at initialization and marginally fast at update time). Looks like I can actually get an added rotation for the cost of one dotproduct and a vector-to-quaternion per particle.

Things are looking up for me!

Sander Maréchal

##### Share on other sites
I dont know opengl but in Directx there is a vertex format called
transformed lit vertices (D3DFVF_XYZRHW ). In this format no world , view and projection transformation is applied to vertices. This is very useful to drawing quads to screen without calculating appropriate matrices.

In a partical system I think you can calculate the positions of the
I think there will be a matrix vector multiply to find x,y,z in camera space and the two projection calculation for each quad.

I did not try this but it seem very logical to me. But I dont know is there exists such a vertex format.

##### Share on other sites
Nope, in OpenGL you have to do everything yourself. No predefined vertex format here. I calculate the particle veretex positions by adding and/or substracting the camera''s up and right vector from the particle position. The camera''s vectors are ofcourse provided by OpenGL.

Sander Maréchal

##### Share on other sites
Oh and BTW, I wrote that quickly. But of course in pure maths conventions to rotate a point by a quad it's Q*P*Q^-1. Don't fear this formula it's speedy. In fact it nearly gives the cross prod formula with the right scaling to maintain a pure rotation. What I meant by Q*P is a matrix analogy in fact a function like :

QuatRotatePoint( Pout, Q, Pin );

Now I understand your problem better. I thought you wanted the quads to rotate around their center, not around the player. Which is totally different. BTW do you wish all your axes pass by a given point attached to the player ? Or may they be different centers for each rotation ?

Then Uthman's idea was great. You could use group transfos with hardware caps. Sadly you also want to keep your quads facing the camera, which makes it all impossible like this. Still you can use the same idea in software. Then you can speed it up by coding an optimized loop (SIMD?) for the same transfo. Then you have 3 possibilities remaining : cross prod, matrix or quat.

I think the most convenient here would be a 3*3 or 4*4 matrix since this enables you to shape much more transfos and make your code more general. In perfs terms, all the transfos are nearly equivalent specially if you want to add a scaling to the cross prod. Same number of ops roughly. You can also implement a version for vertex shaders to speed it up.

There is an old style (Amiga like) trick that could speed it up tho. You could used cached transfos in fixed point. Imagine all your particule centers start on discrete local coords (3 bytes).
Then accumulate the matrix transfo each frame, then get the new center like this :

uint xyz; // 10 10 10 coords. (for 1 meter radius 2mm precision)
xyz = TransfoX[(uint)p.x];
xyz += TransfoY[(uint)p.y];
xyz += TransfoZ[(uint)p.z];
unpack( QuadCorners, xyz, RCamX, RCamY ); // Can be hugely speeded up, specially if you work in camera space.

This may give all in all around 30 clock cycles to form your 4quad vertices. Also, you don't need to update your particules, only the group matrices.

You'll need 256*3 additions to create the tables each frame. But that's very speed, you can even use smaller luts (32 coords ?). Specially if you use the transfo for let's say numGroup > 64 particules, it really pays. Note that like this you can also use far less mem for the particules.

I think that's what you wanted, fast and dirty ? That cant' be beaten I think

Another track could be a physics like model, central forces, gravity like. Only the initial speed will condition the trajectory : circle, ellipse or spiral.

Acc = -Pos*1/norm(Pos);
Vel+= Acc*dt;
Pos+= Vel*dt;
You can use quick 1/sqrt(dist2) tricks with Taylor series and movemement continuity. I can explain this to you later. What's better here is you can handle collsions. That's probably the most elegant and speedy solution of all.

[edited by - Charles B on September 4, 2003 5:28:16 AM]

##### Share on other sites
Hmm, this problem also could be solved with a Tree, my idea is to sort, the Tree in that direction that Particels, which have the Same roations are rotated together...

An other way could be to Calculate the Hole vertex with the Intel IPL, which is designed to Calculate( Multiply, Add, Subtract, Divide, usw. ) image Array''s and is really fast, because using internal Codes & Structures of the Intel CPU, but the Disadvantage of that is that on an AMD PC you have to do it per Software or find a Good Speed Up Lib for AMD.

##### Share on other sites
The vector that I want the particles to rotate around is the orientation vector of the player at the moment the particle was emitted. Ofcourse, a few milliseconds later, the player will have moved and rotated, emitting new particles in it''s wake, with a new orientation vector. To be more graphical here''s a pic of the particle system I want to apply it to:

This rocket is going straight forward, but when attached to the player (or a guided missile for example) it can go in many directions. What I try to accomplish is to make the smoketrail swirl around the center of the trail at that position.

The tree sounds like a good solution to sort the particles by their general orientation. I think I could group the particles together that have been emitted in roughly the same timeframe (say 0.1 or 0.2 seconds). Those particles will roughly all have the same rotation and scale (since the dot-product trick will also scale the particles for me, I can''t use bigger timeframes if the rocket goes straight ahead). I can then rotate each timeframe group with the optimizations that Charles provided. This sounds like a speedy solution Wow!

Sander Maréchal

##### Share on other sites
When your ready with that, and this works, can i see the code?

Would be interessting if this works for my application: Using the Smoke from a Rocket in a Tunnel... Particles with Collision Detection...

##### Share on other sites
Could also be Faster if a Particle is "Far Far" away, you can Draw the Particle as "Point" with the Mean Value Color or the Inherited Texture... just an idea...

##### Share on other sites
Just code and optimize Navier-Stokes equations ion vortices then No thanks.
Well I had more te picture of whirlwinds or willowisps around a character.
However, what you want then is a kind of spiral movement.

Not sure it won''t be noticeable if you group near transfos during 0.2 seconds. but then you could add linear interp (like bones) to your transfos :
VPacked P; // Will be a weird kind of hacked double (all platforms)
P = float3( r0*(T0X[p.x] + T0Y[p.y]) + r1*(T1X[p.x] + T1Y[p.y]) ) + C;

r0=1-r1, C being the center of the translating circle. That''s symbolic code, not the exact implementation but it''s possible to use a fast scaling trick (r0*) in 64bits precision with the FPU. That''s what I used to do bilinear filtering a long time ago. If you use 6 bits for r0 [0..63] you''ll still have 10 bits available for the x,y,z components (2mm as I already said for 1 meter radius). I use the 51 lower bits of the doubles for my REALLY dirty (but solid) tricks.

I assume I can get rid of the initial z component since your particules will describe a circle perp to the initial axis. So that''s still much faster than matrices or anything else and you need only to keep r0, p and C (12bytes) for each particule and one matrix per group (memory) !

Else the physics point of view could still also compete to describe a spiral. The fact that you add a linear velocity just transforms the circle into a spiral. It''s still probably the best and most elegant solution, but you need to maintain the center, center velocity and particule relative velocity (9 floats) + a float for 1/R (*). Beware numeric precision and discrete time steps issues. That should be ez solve however.

(*) Stupid me just cache that variable, no need for any Taylor computation of
1/sqrt(CP*CP)