I'm giving up. Not because of a technical difficulty, but because there aren't enough current-generation video cards able to do it.
Take ATI cards, for example. Yes, you can find nice HDRI demos running on ATI cards... but they are just simple demos. In particular, my engine uses per-pixel lighting/shadowing and requires multiple additive passes (one per light). Correct HDRI requires floating point buffers, but ATI cards do not support blending AND floating point buffers at the same time.
I guess the R520 will support it, but i'm not gonna lose sleep now over something that will "potentially" appear in 6 months to 1 year (depending on how good the driver support turns out to be).
So, for the moment i will try to support a fake HDRI version. My plan is the following:
1. Some textures will be encoded as RGBE, and decoded to floating point in a pixel shader.
2. Some constants (like the light colors) will be passed unclamped to the pixel shader.
3. The pixel shader will perform the per-pixel operations and scale the final color depending on the camera exposure. The result is then clamped into [0-1] (so i'm forgetting tone-mapping here :() and written to a standard fixed point RGBA buffer.
4. Repeat this process with additive blending for all the remaining passes (note: at each step the result is clamped to [0-1], which is not good... but there's no choice).
5. Once all the passes are rendered, take the HDRI buffer, keep any pixel with R, G or B equal to 1, and set all the other ones to black (this avoids having bloom over the whole image, which is ugly in my opinion).
6. Blur the buffer, add it back to the color buffer.
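To make step 1 concrete, here is a small sketch of the RGBE idea in Python: a shared exponent stored in the fourth channel extends the range of an 8-bit RGB texture. The function names and the rounding details are my own illustration, not code from the engine; in practice the decode side would live in the pixel shader.

```python
import math

def rgbe_encode(r, g, b):
    """Pack an HDR color into four bytes (R, G, B, shared exponent E)."""
    m = max(r, g, b)
    if m < 1e-32:
        return (0, 0, 0, 0)
    e = math.ceil(math.log2(m))        # shared exponent for all 3 channels
    scale = 2.0 ** -e                  # brings the largest channel into [0,1]
    to_byte = lambda c: int(round(c * scale * 255.0))
    return (to_byte(r), to_byte(g), to_byte(b), e + 128)  # bias the exponent

def rgbe_decode(r8, g8, b8, e8):
    """Recover the floating point color (what the pixel shader would do)."""
    if e8 == 0:
        return (0.0, 0.0, 0.0)
    scale = 2.0 ** (e8 - 128)
    return (r8 / 255.0 * scale, g8 / 255.0 * scale, b8 / 255.0 * scale)
```

The round-trip is lossy (8-bit mantissas, one exponent shared by all three channels), but it stores values well above 1.0 in an ordinary RGBA8 texture, which is the whole point.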
The thing that worries me the most is step 5: because everything is clamped to [0-1], some colors which are logically different will end up with the same bloom color. But i'll see if the results are "good enough".
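A tiny Python illustration of that concern, per pixel (the "exposure" value and function names are hypothetical, just to show the effect of steps 3 and 5):

```python
def tone_and_clamp(color, exposure):
    """Step 3: scale by camera exposure, then clamp each channel into [0,1]."""
    return tuple(min(max(c * exposure, 0.0), 1.0) for c in color)

def bright_pass(pixel):
    """Step 5: keep the pixel if any channel is saturated, else set it to black."""
    return pixel if max(pixel) >= 1.0 else (0.0, 0.0, 0.0)

exposure = 0.5
dim_light    = tone_and_clamp((2.0, 2.0, 2.0), exposure)    # clamps to (1, 1, 1)
bright_light = tone_and_clamp((50.0, 50.0, 50.0), exposure) # also (1, 1, 1)
# After clamping, both lights pass the bright-pass with an identical
# bloom color, even though one was 25x brighter before step 3.
```

With real floating point buffers the 2.0 and 50.0 would survive until the bright-pass and produce different bloom intensities; here they can't, which is exactly the "good enough?" question.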