This of course reduces vertex cost and increases pixel cost, which is usually the opposite of what you want [smile], but it is still a pretty cool trick to have at your disposal.
I decided to get the parallax portion of the shader up and running first. My first go at it was alright; the second and third were much better. Finally, I compared my implementation with Natalya Tatarchuk's from ATI (a reference implementation is in the DirectX SDK). There were a surprising number of similarities, but I guess if you are trying to solve the same problem, the possible solutions are bound to look alike.
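As a rough illustration of what an iterative parallax search generally looks like, here is a minimal CPU-side sketch in Python. It is not the shader from this entry and not the SDK sample. It assumes a linear march with a fixed step count followed by one linear interpolation between the last two samples. The names `parallax_offset`, `height_at`, `height_scale` and `num_steps` are made up for the example, and the sign of the UV shift depends on your tangent-space convention.

```python
def parallax_offset(height_at, uv, view_ts, height_scale=0.05, num_steps=32):
    """March a view ray through a height field and return the displaced UV.

    height_at    -- hypothetical callable (u, v) -> height in [0, 1], 1 = top surface
    uv           -- texture coordinate where the ray enters the top surface
    view_ts      -- tangent-space vector from the surface toward the eye (z > 0)
    height_scale -- assumed depth of the height field in UV units
    """
    vx, vy, vz = view_ts
    # UV shift the ray would accumulate if it travelled down to height 0.
    # The ray runs opposite to the view vector, hence the minus sign.
    max_du = -vx / vz * height_scale
    max_dv = -vy / vz * height_scale

    step = 1.0 / num_steps
    prev_t = 0.0
    prev_ray_h = 1.0
    prev_surf_h = height_at(uv[0], uv[1])

    for i in range(1, num_steps + 1):
        t = i * step
        ray_h = 1.0 - t  # the ray descends linearly from 1 to 0
        surf_h = height_at(uv[0] + max_du * t, uv[1] + max_dv * t)
        if ray_h <= surf_h:
            # The ray dipped below the surface between the previous sample
            # and this one; estimate the crossing by linear interpolation.
            d_prev = prev_ray_h - prev_surf_h  # gap above the surface
            d_cur = surf_h - ray_h             # gap below the surface
            denom = d_prev + d_cur
            w = d_prev / denom if denom > 0.0 else 0.0
            t_hit = prev_t + (t - prev_t) * w
            return (uv[0] + max_du * t_hit, uv[1] + max_dv * t_hit)
        prev_t, prev_ray_h, prev_surf_h = t, ray_h, surf_h

    # No crossing found (only possible through rounding): use the full shift.
    return (uv[0] + max_du, uv[1] + max_dv)
```

In a pixel shader the same loop would sample a height texture instead of calling `height_at`, and the step count is what drives the per-pixel cost.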
Here is a screenshot of the current iterative implementation:
Awesomeness, nonetheless. What kind of FPS are you getting at this point?