Take a look at these two screenshots. Each screenshot shows the exact same view of a 32,768 x 32,768 heightmap.
First, here's Land-o-Rama with per-vertex normals in action, using a patch size of 33 x 33 vertices:
There are 623k triangles being rendered at a depressing 21 fps.
Now, here's Land-o-Rama with per-patch normal maps in action, using a patch size of only 17 x 17 vertices:
w00t, a huge difference in detail! Each quadtree patch has a 64 x 64 normal map generated from the same heightmap as the terrain mesh. Now there are only 302k triangles being rendered and the frame rate has almost doubled to 40 fps.
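Generating a patch's normal map straight from the heightmap samples can be sketched like this. This is a minimal Python/numpy illustration using central differences; the function name `normal_map` and the array-based representation are mine, not Land-o-Rama's actual code:

```python
import numpy as np

def normal_map(height, spacing=1.0):
    """Build a normal map from a square heightmap patch.

    `height` is a 2-D array of elevations and `spacing` is the
    horizontal distance between samples. Both names are
    illustrative; the real engine's API may differ.
    """
    # Central differences give the slope along each axis;
    # np.gradient falls back to one-sided differences at the edges.
    dy, dx = np.gradient(height, spacing)
    # The surface normal at each texel is (-dh/dx, -dh/dy, 1), normalized.
    n = np.dstack((-dx, -dy, np.ones_like(height)))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n

# A flat patch should yield straight-up normals everywhere.
flat = normal_map(np.zeros((17, 17)))
```

Because the texels come from the same grid as the mesh vertices, a map built this way naturally has an odd, power-of-two-plus-one size, which is exactly the problem described below.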
I'm not happy with how I got it to work though...
Land-o-Rama uses a patch size of 2^n+1 x 2^n+1 vertices. Because the terrain mesh and the normal-map texels are generated from the same heightmap data, the texture size must also be of the form 2^m+1 x 2^m+1 (although it doesn't need to equal the patch size). This is a real problem because my graphics card doesn't support non-power-of-two textures in hardware (it's a four-year-old NVIDIA GeForce FX 5950 Ultra).
What I ended up doing was this: first, Land-o-Rama generates the 2^m+1 x 2^m+1 normal map for the patch. But before uploading it to the graphics card, it removes the center row and column. I feel that this solution is a big steaming pile of kludge. There's got to be a better way.
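The trim step amounts to deleting one row and one column from an odd-sized array. A small numpy sketch of the idea (the variable names are mine):

```python
import numpy as np

# A (2^m + 1)-sized normal map, e.g. 65 x 65 for m = 6.
nmap = np.zeros((65, 65, 3))

# The workaround: drop the center row and column so the result
# is a hardware-friendly 64 x 64 power-of-two texture.
center = nmap.shape[0] // 2
trimmed = np.delete(np.delete(nmap, center, axis=0), center, axis=1)
```

The trimmed texture is what actually gets uploaded, which is why the screenshots above use 64 x 64 normal maps even though the source data is 65 x 65.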
What I may end up doing is rewriting the quadtree code so that the patch size is 2^n-1 x 2^n-1 instead. Then the normal-map size will also be of the form 2^m-1 x 2^m-1, which fits inside a 2^m x 2^m texture. No removing the center row and column! It will be quite a rewrite, though.
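Under that scheme the map would simply be padded into a power-of-two texture rather than trimmed. A toy sketch of the size arithmetic (again, illustrative names only):

```python
import numpy as np

# A (2^m - 1)-sized normal map, e.g. 63 x 63 for m = 6, fits inside a
# 64 x 64 power-of-two texture with one spare row and column to spare.
nmap = np.ones((63, 63, 3))
tex = np.zeros((64, 64, 3))
tex[:63, :63] = nmap  # copy the whole map; nothing gets deleted
```

Padding preserves every texel, whereas the current trim throws the center row and column away.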
Here's a screenshot with some terrain texturing added to the mix:
And here's another one with snow and cliffs in the foreground: