I'm still using forward rendering, largely because I'm targeting mobile GPUs. I keep my permutations down mostly by making assumptions in each of my shaders. For example, my skinning shader currently has versions for 20 or 30 max bones, for 1, 2, 3 or 4 weights, for 0, 1, 2, 3 or 4 lights, and for fog on or off, making 2 x 4 x 5 x 2 = 80 permutations. The different permutations are created by compiling at load time with different #defines, and I have a class which manages selecting the appropriate shader program and making sure it has all its uniforms set up.
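A rough sketch of how that kind of load-time permutation selection could look in C++. The struct, the function names and the #define names are all made up for illustration and are not taken from the actual engine. It assumes the generated block gets prepended to the GLSL source (after any #version line, which has to come first).

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one skinning-shader variant.
struct SkinPermutation {
    int maxBones;    // 20 or 30
    int numWeights;  // 1..4
    int numLights;   // 0..4
    bool fog;        // fog on or off
};

// 2 bone counts x 4 weight counts x 5 light counts x 2 fog states.
constexpr int kPermutationCount = 2 * 4 * 5 * 2;  // 80

// Maps a variant to a unique slot in [0, kPermutationCount), which could
// index a cache of compiled shader programs.
int permutationIndex(const SkinPermutation& p) {
    assert(p.maxBones == 20 || p.maxBones == 30);
    assert(p.numWeights >= 1 && p.numWeights <= 4);
    assert(p.numLights >= 0 && p.numLights <= 4);
    int index = (p.maxBones == 30) ? 1 : 0;
    index = index * 4 + (p.numWeights - 1);
    index = index * 5 + p.numLights;
    index = index * 2 + (p.fog ? 1 : 0);
    return index;
}

// Builds the #define block for one variant (macro names are assumptions).
std::string buildDefines(const SkinPermutation& p) {
    std::string s;
    s += "#define MAX_BONES " + std::to_string(p.maxBones) + "\n";
    s += "#define NUM_WEIGHTS " + std::to_string(p.numWeights) + "\n";
    s += "#define NUM_LIGHTS " + std::to_string(p.numLights) + "\n";
    if (p.fog) {
        s += "#define USE_FOG\n";
    }
    return s;
}
```

The managing class would then only need an array of kPermutationCount program handles, compiling a slot the first time it's requested or all of them up front at load time.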
I assume that all the lights are point lights except the first, which I assume is either directional or point. I assume that my skins will have normals but not vertex colours, that lighting is done per vertex, and that I always want specular lighting. It's these assumptions and limitations that make my game require hundreds of shader permutations instead of thousands or millions. That's fine for my little engine tailored around a couple of projects, but not a great solution for a larger endeavour.
In theory, I should avoid repeating lighting code in my skin shader and my terrain shader by having a set of utility functions which they can both call. In practice I haven't got around to that yet, and there's some copy-pasting going on.
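One way the shared utility functions could work, sketched in C++ with the GLSL held as strings. The helper name pointLightDiffuse, the chunk contents and assembleVertexShader are hypothetical, and this is only one possible approach: the lighting helpers live in a single source chunk that gets pasted in front of each shader body.

```cpp
#include <string>

// Hypothetical shared GLSL chunk: one point-light diffuse helper that both
// the skin and the terrain vertex shaders could call.
const std::string kLightingChunk = R"GLSL(
vec3 pointLightDiffuse(vec3 position, vec3 normal, vec3 lightPos, vec3 lightColour) {
    vec3 toLight = normalize(lightPos - position);
    return lightColour * max(dot(normal, toLight), 0.0);
}
)GLSL";

// Concatenates the pieces in the order the GLSL compiler needs to see them:
// per-permutation #defines, then the shared helpers, then the shader body.
std::string assembleVertexShader(const std::string& defines,
                                 const std::string& body) {
    return defines + kLightingChunk + body;
}
```

Since glShaderSource accepts an array of strings, the three pieces could also be passed separately instead of being concatenated first.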