So, now that I've read the textbook, I thought I'd actually get on with writing my first shader from scratch (about bloody time).
I wanted to make the simplest possible program that used vertex shaders (skipping pixel shaders for now), so no effect files and nothing too adventurous. I started off with the HLSLwithoutEffects sample included in the DirectX SDK.
A little over 3 years ago I wrote a cubic patch renderer in Visual Basic (Original Article) that worked by manually computing the cubic function for each vertex on the CPU and then performing a lock/write/unlock operation on the vertex buffer. The update operation was most certainly not real-time, but it didn't matter.
So I decided I'd implement my cubic expansion as a vertex shader and do the same thing in real-time. The mathematics is dead simple - basically expand out (A+B)³ · (C+D)³ and put in coefficients to control the curve. See the aforementioned article for the actual formulae.
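For anyone who doesn't want to dig out the old article, here's a rough sketch of what that expansion looks like on the CPU. It is not the shader itself: I'm assuming the usual substitution A = 1 - t, B = t (which gives the standard cubic blending weights), and `cubic_weights`, `eval_patch` and the 4x4 `control` grid are made-up names for illustration only.

```python
def cubic_weights(t):
    """The four terms of (A + B)^3, assuming A = 1 - t and B = t."""
    a, b = 1.0 - t, t
    return (a * a * a, 3.0 * a * a * b, 3.0 * a * b * b, b * b * b)


def eval_patch(control, u, v):
    """Evaluate a hypothetical 4x4 grid of coefficients at (u, v).

    Multiplying the two expansions together gives 16 weights,
    one for each coefficient in the grid.
    """
    wu = cubic_weights(u)
    wv = cubic_weights(v)
    return sum(wu[i] * wv[j] * control[i][j]
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    # A flat grid should evaluate to the same height everywhere,
    # because the 16 weights always sum to 1.
    flat = [[2.0] * 4 for _ in range(4)]
    print(eval_patch(flat, 0.25, 0.75))
```

In a vertex shader the same sum would be evaluated once per vertex, with (u, v) coming from the vertex data and the coefficients from shader constants, which is where most of the instruction count goes.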
My only concern was that the length of the shader might be prohibitive for a vs_2_0 profile - but with optimizations enabled it worked out as a reasonable 64 instructions.
I've uploaded the full assembly listing and the original HLSL Shader if you're interested.
Any veteran shader programmers looking at this will probably be able to spot some dodgy bits, but I plan to use the above shader as an optimization test bed. I reckon it'll be a classic case that the D3DX compiler can write far better code than I can, but it'll be interesting to see if I can shave any instructions off [grin]
I've not got the code in a clean enough state to put up publicly, but I'll see about cleaning it up and posting it in the next couple of days. Might well be of interest to some people...
I thought I'd make a brief comment on the performance of this. Whilst I've not done any extensive profiling, the last screenshot above is definitely doing a good job of stressing out my Radeon 9800 Pro's vertex pipelines [grin]
As the screenshot shows, there are 65,536 vertices in the mesh, each of which is being thrown into that 64-instruction shader. The frame rate is a respectable 147.42 fps.
That means that, as a crude statistic, the vertex shaders are pumping out a decent 9.7 million vertices every second and churning through the best part of 620 million instructions each second.
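For the record, the back-of-envelope sums look like this. It's only a sketch of the arithmetic: it assumes every vertex goes through the shader exactly once per frame and that all 64 instructions execute for each vertex, which is a simplification. The variable names are just for illustration.

```python
vertices_per_frame = 65_536       # mesh size from the screenshot
frames_per_second = 147.42        # measured frame rate
instructions_per_vertex = 64      # compiled shader length (assumed all executed)

vertices_per_second = vertices_per_frame * frames_per_second
instructions_per_second = vertices_per_second * instructions_per_vertex

print(f"{vertices_per_second / 1e6:.2f} million vertices/s")          # 9.66
print(f"{instructions_per_second / 1e6:.1f} million instructions/s")  # 618.3
```

So "9.7 million" and "the best part of 620 million" are both just those figures rounded up a touch.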
I wonder what it'd pull off with a GeForce 7800 [wow].