Videos are even better. Quite a few nice GPGPU samples you've got there. Congrats.
I just wanted to hear whether you've had any trouble with compute shaders at all. I did, half a year ago, so be warned: the compiler sometimes took minutes or crashed. When it did compile, the shader sometimes produced silly results. Then again, I probably did something blatantly wrong design-wise, and I was still using the June 2010 SDK compiler (which also has trouble with tessellation). If you go the compute shader route, make sure to use the newest compiler (the one that comes with the Windows 8 Kit).
I was so fed up that I gave OpenCL a shot. I played with Cloo (I'm using C#) and was positively surprised. Compilation took seconds at most (subsequent compilations even seem to be cached by the NVIDIA driver), and the results were fine.
Edit: That performance number for your particles is quite impressive. Smells like the interop isn't that bad of a staller. Or do you just have really beefy hardware?
If you do port that sample to a compute shader, please don't forget to post a comparison.