Most HDR sky textures don't contain the sun, since it's a separate analytic light source and you don't want to apply it to the scene twice. Therefore you usually don't have to deal with such extreme values, and ~32-64 samples are enough for diffuse pre-integration using importance sampling. It's also pretty cheap, as 6x8x8 destination pixels (six cube faces of 8x8 pixels each) are enough for low-frequency data like diffuse. This of course doesn't yield perfect results, but the errors are hard to notice in a real scene with textures.
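To make this more concrete, here is a minimal sketch of how one destination pixel could be pre-integrated. It assumes cosine-weighted importance sampling driven by a Hammersley point set, which is a common choice for diffuse but not necessarily the exact scheme used here. All names (`prefilter_diffuse`, `sky_radiance`, and so on) are hypothetical, and the sky is a plain Python function returning a scalar instead of a real HDR cubemap lookup.

```python
import math

def radical_inverse_base2(i):
    """Van der Corput sequence: mirror the bits of i around the binary point."""
    result, f = 0.0, 0.5
    while i > 0:
        if i & 1:
            result += f
        i >>= 1
        f *= 0.5
    return result

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] / length, v[1] / length, v[2] / length)

def prefilter_diffuse(normal, sky_radiance, num_samples=64):
    """Estimate (1/pi) * integral of L(w) * cos(theta) over the hemisphere.

    Samples are drawn with pdf = cos(theta) / pi, so the cosine and the
    1/pi factor cancel and the estimate is just the mean sampled radiance.
    """
    n = normalize(normal)
    # Build a tangent frame around the normal (assumed convention).
    up = (0.0, 0.0, 1.0) if abs(n[2]) < 0.999 else (1.0, 0.0, 0.0)
    t = normalize(cross(up, n))
    b = cross(n, t)

    total = 0.0
    for i in range(num_samples):
        u1 = i / num_samples
        u2 = radical_inverse_base2(i)
        # Cosine-weighted direction in tangent space.
        r = math.sqrt(u1)
        phi = 2.0 * math.pi * u2
        x, y, z = r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1)
        # Transform to world space and look up the sky.
        d = (x * t[0] + y * b[0] + z * n[0],
             x * t[1] + y * b[1] + z * n[1],
             x * t[2] + y * b[2] + z * n[2])
        total += sky_radiance(d)
    return total / num_samples

# Example: a hypothetical sky that gets brighter towards the zenith.
zenith_sky = lambda d: max(d[2], 0.0)
value = prefilter_diffuse((0.0, 0.0, 1.0), zenith_sky, num_samples=64)
```

For a 6x8x8 destination cubemap with 64 samples per pixel this works out to 6 * 8 * 8 * 64 = 24,576 sky lookups in total, which is why the whole pass stays cheap.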
If you are interested in diffuse pre-integration using spherical harmonics (SH), then check out this great post by Sébastien Lagarde (it also contains an example application with source code):