rotation using quaternions in a shader

Started by
5 comments, last by Guoshima 17 years, 9 months ago
Hello, I want to use quaternions to represent my rotation of an object which is rendered using instancing, because a quaternion only takes 4 floats, while a rotation matrix uses 9. But what's the fastest way of multiplying a float3 (position) with a quaternion which represent the model rotation. Regards, Kenzo
Advertisement
Whatever it is it'll probably be slower than doing the vector-matrix multiply, GPUs are *really* good at that [smile]
"Voilà! In view, a humble vaudevillian veteran, cast vicariously as both victim and villain by the vicissitudes of Fate. This visage, no mere veneer of vanity, is a vestige of the vox populi, now vacant, vanished. However, this valorous visitation of a bygone vexation stands vivified, and has vowed to vanquish these venal and virulent vermin vanguarding vice and vouchsafing the violently vicious and voracious violation of volition. The only verdict is vengeance; a vendetta held as a votive, not in vain, for the value and veracity of such shall one day vindicate the vigilant and the virtuous. Verily, this vichyssoise of verbiage veers most verbose, so let me simply add that it's my very good honor to meet you and you may call me V.".....V
yip .. but for the moment I am setting my per instance data as const values for the vertex shader and copying upto 21 floats per instance seems rather slow to me. (and I think I only have a 1024 constants so when I use 21 floats I can only render 48 instances with 1 drawcall, and if I use quaternions I "only have to use" 16 floats which gives me 64+ instances)

Regards,
Kenzo
Rotating using Matrices is 6 additions and 9 multiplications. Rotating using quaternions has the following costs:

Generic quaternion multiplies: 24 add, 32 mul
specialized quaternion multiplies (best case): 17 add, 24 mul
convert to matrix, then transform: 18 add, 21 mul (including conversion cost of 12 add and 12 mul)

As you can see, the best solution is probably to convert the quaternion to a matrix and use that to rotate the vector. It's obviously slower than using a matrix directly, but as you say 4 floats is less than 9 (or 16 as the case may be)


Edit: Note: be sure to check out the cost of converting from matrix to quaternion in the pdf below.. It could get pretty expensive esp. since you have to do it on the CPU.

source: site: Geometric tools. link: Rotation Representations and Performance Issues

[Edited by - frostburn on July 11, 2006 7:12:20 AM]
He's wanting to send the quaternions to reduce the number of constants sent to the GPU, so converting to a matrix on the CPU won't save him anything. Converting to a matrix on the GPU would be more expensive than just doing the quaternion*vector*inv(quaternion) required for the rotation. I think the cheapest way would be to do it like this:
let q = rotation quaternion = (w,x) where x is a 3d-vector of the axis componentslet v = 3d-vector to rotatelet a = cross(x,v) + w*vv' = (cross(a,-x) + dot(x,v)*x + w*a) / dot(q,q)

So you've got 2 cross-products, 2 dot-products, 3 vector-scaler multiplies, 3 vector additions and 1 vector-scaler division.

EDIT: I *think* I did the working correctly for the q*v*inv(q), but no guarenttees [smile]
EDIT2: dot(q,q) should be 1

[Edited by - joanusdmentia on July 11, 2006 8:55:00 AM]
"Voilà! In view, a humble vaudevillian veteran, cast vicariously as both victim and villain by the vicissitudes of Fate. This visage, no mere veneer of vanity, is a vestige of the vox populi, now vacant, vanished. However, this valorous visitation of a bygone vexation stands vivified, and has vowed to vanquish these venal and virulent vermin vanguarding vice and vouchsafing the violently vicious and voracious violation of volition. The only verdict is vengeance; a vendetta held as a votive, not in vain, for the value and veracity of such shall one day vindicate the vigilant and the virtuous. Verily, this vichyssoise of verbiage veers most verbose, so let me simply add that it's my very good honor to meet you and you may call me V.".....V
AFAIK a single quaternion-vector product could be done on a CPU sequentially w/ 16 ADDs and 15 MULs, due to an algorithm using 2 cross products, a scalar addition, a vector scale, and 3 vector additions. On a GPU the vector scale and vector additions are single OPs, and I assume the cross product is provided by the GPU as well. So it may happen that a total of 8 OPs are sufficient on a GPU (but I'm not sure).

However, a quaternion-matrix conversion is needed to be done once for all vertices but only if the same matrix could be applied to all vertices. AFAIK this prohibits the conversion to be done on the GPU, since it isn't possible to pass parameters from one shader run to the next.

So, if the memory consumption and bandwidth wouldn't be a limitation, in fact this would become a question of where the break-through is w.r.t. the count of vertices to be transformed.


EDIT: joanusdmentia came to the same conclusion but a bit faster :)
thanx for the help. So in the end it isn't so expensive to do the rotation then using a quaternion (a little bit more than 2 times more). It's for object of maximum 24 vertices so this shouldn't be the bottle neck then.

Gonna try it out later!

Regards,
Kenzo

This topic is closed to new replies.

Advertisement