Best way to add a wind sway via shaders?

9 comments, last by foreignkid 17 years, 11 months ago
I’ve always just done a simple vertex position translation based on height, using the good old sin function. I don’t recall where, but I *think* I remember reading that sin/cos only takes a single cycle on the GPU. If this is true, I would imagine this would be the fastest way to achieve a sway in the wind. Can anyone confirm this, or does anyone know of a faster way? Using sines to offset every vertex makes me nervous (as I’m used to them being fairly slow on the CPU).
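For reference, the height-based sway described above can be sketched on the CPU like this. It is only an illustration of the math, written in Python rather than shader code; the names `sway_offset` and `sway_vertex` and the `amplitude` and `frequency` defaults are made up for the example.

```python
import math

def sway_offset(height, time, amplitude=0.1, frequency=1.5):
    """Horizontal offset for a vertex at the given height above the base.

    amplitude and frequency are hypothetical tuning values, not taken
    from any particular engine.
    """
    # Scaling by height keeps the base (height 0) anchored to the ground.
    return amplitude * height * math.sin(frequency * time)

def sway_vertex(position, time):
    """Return the position with the sway applied along x; y is the height."""
    x, y, z = position
    return (x + sway_offset(y, time), y, z)
```

In a vertex shader the same expression would run once per vertex, with `time` passed in as a constant.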
I don't think that's right, but I could be totally wrong. You could use the first couple terms of the Taylor series to approximate sin, and it should be fast.
Quote:Original post by okonomiyaki
I don't think that's right, but I could be totally wrong. You could use the first couple terms of the Taylor series to approximate sin, and it should be fast.


Well, the GPU uses a Taylor series to approximate sin, and iirc it takes 15 instruction slots for that. You can try fewer terms if you wish, but I don't know how good it would look.
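To get a feel for how many terms are needed, here is a rough Python sketch of a truncated Taylor series for sin, plus a helper that measures the worst error over [-pi, pi]. It is not what any GPU or compiler actually does; `taylor_sin` and `max_error` are names invented for this example.

```python
import math

def taylor_sin(x, terms=3):
    """Approximate sin(x) with the first `terms` terms of its Taylor series about 0."""
    result = 0.0
    term = x  # first term: x^1 / 1!
    for n in range(terms):
        result += term
        # Next term: multiply by -x^2 / ((2n + 2) * (2n + 3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return result

def max_error(terms, samples=1000):
    """Largest absolute error against math.sin, sampled evenly over [-pi, pi]."""
    worst = 0.0
    for i in range(samples + 1):
        x = -math.pi + 2.0 * math.pi * i / samples
        worst = max(worst, abs(taylor_sin(x, terms) - math.sin(x)))
    return worst
```

Comparing `max_error(2)`, `max_error(3)` and `max_error(5)` shows how quickly the approximation degrades near the ends of the range when terms are dropped, which is why the input range matters so much for a short series.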

skow, I don't think it's that bad; that's only one sin in the vertex shader. As always, profile it and judge after that.

If you come to the conclusion that it's too slow: if you have several identical objects that need the displacement, you could try to do it on the CPU side, or create a custom (non-orthogonal, unfortunately) transformation for the world matrix (but note that you will need the inverse transpose of that matrix to transform covariant vectors such as normals).

And as a final note... most applications aren't vertex shader-limited; it's the pixel shaders and fill rate that give them a hard time, so it shouldn't be that bad.
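One possible reading of the custom world matrix idea is a shear that pushes x sideways in proportion to height. The Python sketch below assumes that reading (the post does not say which matrix to use) and shows why normals need the inverse transpose; `shear_matrices`, `mat_vec` and `dot` are helper names invented here.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shear_matrices(k):
    """Shear x by k per unit of height (y); returns (M, inverse transpose of M)."""
    m = ((1.0, k, 0.0),
         (0.0, 1.0, 0.0),
         (0.0, 0.0, 1.0))
    # For this shear the inverse just negates k; transposing moves it below the diagonal.
    m_inv_t = ((1.0, 0.0, 0.0),
               (-k, 1.0, 0.0),
               (0.0, 0.0, 1.0))
    return m, m_inv_t
```

With `k = 0.5`, a vertical stem direction `(0.0, 1.0, 0.0)` and a side-facing normal `(1.0, 0.0, 0.0)` stay perpendicular only when the normal goes through the inverse transpose. The sine is then evaluated once per object on the CPU (to pick `k` each frame) instead of once per vertex.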
Quote:Original post by clb
most applications aren't vertex shader-limited; it's the pixel shaders and fill rate that give them a hard time, so it shouldn't be that bad.


A very good point.
You can always calculate the sine on the CPU and just set it as a shader constant. The disadvantage is that everything sways in exactly the same way. You can use randomized weighting so that some stuff sways more than other stuff, but everything is still in phase. A nicer effect comes from passing in a time value, adding a random offset to it (each swaying object gets its own fixed offset, calculated randomly when you generate the vertex data), taking the sine of the result, and combining that with the random weights.
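A small Python sketch of the offset-plus-weight idea, assuming one phase and one weight per swaying object. The function names, the seed, and the 0.5 to 1.0 weight range are arbitrary choices for illustration.

```python
import math
import random

def make_sway_params(count, seed=0):
    """Generate a fixed (phase, weight) pair per object, once, at build time."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.5, 1.0))
            for _ in range(count)]

def sway_amount(time, phase, weight):
    """Per-object sway: the shared time plus that object's own phase offset."""
    return weight * math.sin(time + phase)
```

The phase and weight would be stored with the vertex data, so only `time` changes per frame, while the sine itself is evaluated in the shader.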
According to this page, the X800 series is capable of single cycle trig instructions, and NV's GF6 and GF7 series are almost certainly capable of the same thing.
And according to several performance documents on the NVIDIA developer page, GeForce FX and newer cards support trig assembly instructions. A single instruction does not necessarily imply a single cycle, but if you're worried about instruction count, a single instruction is nice. You also can't be sure whether the compiler turns your call to sin() into the SIN instruction or into a series of instructions that evaluate a Taylor series approximation, but given that the compilers are written as part of the drivers, you can be fairly confident that the compiler uses the SIN instruction.

I would expect the trig instructions on newer NVIDIA hardware (GeForce 6/7) to be single-cycle instructions, simply because ATI has supported single-cycle trig instructions since the X800, as the above poster pointed out.
According to MSDN, the ASM sincos instruction takes 8 instruction slots. You get both the sine and cosine values from your input. It does, however, require at least vertex shader model 2.
It seems reasonable to assume that the high-level shader language sin intrinsic translates to the same opcode as the ASM sincos instruction.

Link

ATI has a sample for vertex shader model 1.1
Link (.PDF)



[Edited by - Anthenor on May 1, 2006 4:47:35 PM]
Quote:Original post by Anthenor
According to MSDN, the ASM sincos instruction takes 8 instruction slots. You get both the sine and cosine values from your input. It does, however, require at least vertex shader model 2.
It seems reasonable to assume that the high-level shader language sin intrinsic translates to the same opcode as the ASM sincos instruction.

Link

ATI has a sample for vertex shader model 1.1
Link (.PDF)



I looked a little deeper into the MSDN documentation that you posted and found that guaranteeing your input value is in the range [-pi, pi] requires four more instructions, so that would put sincos at 12 instruction slots if the compiler inserts the range-reduction assembly :(
It seems 2.0 cards added the sincos instruction to both vertex and pixel shaders. The only restriction is that the angle needs to be within [-pi, +pi].
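For illustration, wrapping an angle into that range can be done with a multiply-add, a fractional part, and another multiply-add. The Python sketch below shows that common pattern; it is an assumption about the general approach, not the exact instruction sequence the compiler emits, and `wrap_to_pi` is a name made up for the example.

```python
import math

def wrap_to_pi(angle):
    """Map any angle to an equivalent one in [-pi, pi)."""
    two_pi = 2.0 * math.pi
    # Shift by pi and express the angle as a number of full turns.
    turns = (angle + math.pi) / two_pi
    # Keep only the fractional turn, then scale and shift back.
    return (turns - math.floor(turns)) * two_pi - math.pi
```

If the application already keeps its time value wrapped before sending it to the shader, the extra instructions could in principle be skipped.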

So, on 2.0 and newer cards: one instruction. On a 1.x card it will expand to lots of instructions.

