You are talking about OGL/GLSL, right? Then try to support at least the minimum requirements of your chosen OGL/GLSL version. You can find the supported features (e.g. the number of indirect texture accesses) in the specification.
# What's the best practice for making shaders cross-platform? At the moment I'm developing on Mac only, so hard to know.
texture => bandwidth
# So far my shaders have had no significant effect on frame-rate. What tends to have the most effect? Textures, number of shaders, number of instructions per shader, branching?
number of shaders => not that important, as long as you don't switch shaders all the time. Try to batch API calls by material/shader.
number of instructions => very important for pixel shaders, less is always better
branching => a certain amount of branching will not hurt. It can even improve performance if it lets you skip expensive calculations that would otherwise affect large parts of the screen. Shaders are executed in groups (tiles), and the slowest shader in a group slows down the rest, so you only benefit if all shaders in a group take the same path most of the time.
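To illustrate the "bundle API calls by material/shader" point, here is a minimal host-side sketch in C++. The `DrawCall` struct and its id fields are hypothetical stand-ins for whatever handles your renderer uses; the idea is only that sorting the draw list makes calls with the same shader adjacent, so the shader is bound less often.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw record; the ids stand in for real program/material handles.
struct DrawCall {
    unsigned shaderId;
    unsigned materialId;
    unsigned meshId;
};

// Sort so that calls sharing a shader (and then a material) are adjacent.
void sortByShaderThenMaterial(std::vector<DrawCall>& calls) {
    std::stable_sort(calls.begin(), calls.end(),
                     [](const DrawCall& a, const DrawCall& b) {
                         return std::tie(a.shaderId, a.materialId) <
                                std::tie(b.shaderId, b.materialId);
                     });
}

// Count how often the bound shader would change when drawing in list order
// (the first bind counts as one switch).
std::size_t countShaderSwitches(const std::vector<DrawCall>& calls) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (i == 0 || calls[i].shaderId != calls[i - 1].shaderId) {
            ++switches;
        }
    }
    return switches;
}
```

With four calls alternating between two shaders, the unsorted list needs four binds and the sorted one needs two. Transparent geometry usually still has to be drawn back to front, so this kind of sorting mostly applies to opaque objects.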
Use pre-processor statements to support multiple versions.
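One way this could look in practice, as a rough sketch: keep a single shader body guarded by `#ifdef` blocks and let the application prepend the `#version` line and the feature defines before compiling. The names `buildShaderSource`, `kFragmentBody` and `USE_BUMP_MAPPING` are made up for this example, and the shader body assumes GLSL 1.10/1.20-style built-ins (`varying`, `texture2D`, `gl_FragColor`).

```cpp
#include <string>
#include <vector>

// Shader body shared by all variants. It must not contain its own #version
// line, because GLSL requires #version to appear before any other code.
// USE_BUMP_MAPPING is a hypothetical feature flag chosen for this sketch.
const char* const kFragmentBody = R"(
varying vec2 uv;
uniform sampler2D diffuseMap;
#ifdef USE_BUMP_MAPPING
uniform sampler2D normalMap;
#endif
void main() {
    vec4 color = texture2D(diffuseMap, uv);
#ifdef USE_BUMP_MAPPING
    // Placeholder lighting: darken by the z component of the decoded normal.
    vec3 n = texture2D(normalMap, uv).xyz * 2.0 - 1.0;
    color.rgb *= max(n.z, 0.0);
#endif
    gl_FragColor = color;
}
)";

// Prepend "#version N" and one "#define NAME 1" per enabled feature.
std::string buildShaderSource(int glslVersion,
                              const std::vector<std::string>& defines,
                              const std::string& body) {
    std::string source = "#version " + std::to_string(glslVersion) + "\n";
    for (const std::string& name : defines) {
        source += "#define " + name + " 1\n";
    }
    source += body;
    return source;
}
```

The application would then build, say, a cut-down variant with no defines and a full variant with `{"USE_BUMP_MAPPING"}`, and pass the resulting strings to the usual shader compile call. Which variant to pick at runtime is up to the application, for example based on the limits the driver reports.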
# Do I need to offer cut-down versions for lower spec'd hardware, and if so how does it know which version to use?
Yes. See above (shader groups).
# Is it worth branching if most of the time a huge amount of work can be skipped, e.g. only do bump-mapping within distance x of the camera?
It depends on the overhead; texture access is not always cheap. But it could pay off if you have a small texture (cache friendly) and keep your shader code free of additional branches.
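As a small sketch of the "small texture" option: a flat normal map does not need to match the size of the other textures, a single texel is enough. The code below assumes the common `n * 0.5 + 0.5` encoding of tangent-space normals into 8-bit channels; `kFlatNormalTexel` and `decodeNormal` are names invented for this example.

```cpp
#include <array>
#include <cstdint>

// One RGBA8 texel encoding the tangent-space normal (0, 0, 1).
// Assumes the "n * 0.5 + 0.5" encoding; other encodings need other values.
const std::array<std::uint8_t, 4> kFlatNormalTexel = {128, 128, 255, 255};

struct Vec3 {
    float x, y, z;
};

// Mirrors what a shader typically does after sampling: map [0,1] to [-1,1].
Vec3 decodeNormal(const std::array<std::uint8_t, 4>& texel) {
    return Vec3{texel[0] / 255.0f * 2.0f - 1.0f,
                texel[1] / 255.0f * 2.0f - 1.0f,
                texel[2] / 255.0f * 2.0f - 1.0f};
}
```

Uploaded once as a 1x1 texture and bound for every material without a bump map, this lets the same shader run without a special case. The decoded x and y come out slightly above zero because 128 is not exactly half of 255, which normally disappears once the normal is normalized.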
# When blending bump-mapped textures would it pay off to waste texture space by using a completely flat normal texture rather than having a special case for textures with no bump-map?
Depends on the supported OGL/GLSL version and hardware. Most modern hardware no longer has an instruction limit, though there are other limitations (registers), and a large number of instructions will still reduce performance.
# What is the instruction count limit based on? Some shaders I've seen seem very "busy" but don't exceed the limits.