I'm currently working on a demo app for RTW shadows. Maybe my experience is helpful, even if I have only implemented backwards projection so far, still have to fix a few things, and will probably add some VSM variant on top.
From what I've gathered so far, using the forward method, a regular shadow map is colored brighter for regions of interest and darker for regions of no interest. I am not sure whether the depth channel or another channel is colored. Then this colored shadow map is collapsed to 1D textures. I think all pixels in the same row are summed up into one pixel in the 1D map, or the brightest color is picked. The same is done for all pixels in the same column, which go into a second 1D texture. These 1D textures are blurred and then used to construct a warping map. How this is done, I don't know. Then this warping map is used to render the shadow map once again, this time with regions of interest magnified. That is the final shadow map. Use it together with the warping map to transform each rendered pixel from view space into the shadow map and check whether the depth is greater or not.
The coloring is only used for easier visualization in the paper; you just need a standard low-res shadow map that you can use to detect depth discontinuities, which then get assigned higher importance values. This is of course no problem in a small scene and for a single light, but I'm worried about the performance hit in a real scene and with potentially more lights.
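To make the depth-discontinuity idea concrete, here is a minimal CPU-side sketch in plain Python. It is only an illustration: the function name, the 4-neighbour test and the `threshold` value are my own assumptions, not the criteria from the paper, and a real implementation would do this in a pixel shader.

```python
def depth_discontinuity_importance(depth, threshold=0.05):
    """Mark texels whose depth differs strongly from a direct neighbour.

    depth: 2D list (rows of floats) standing in for a low-res shadow map.
    threshold: hypothetical cutoff for what counts as a discontinuity.
    Returns a 2D list of importance values (1.0 at edges, 0.0 elsewhere).
    """
    height, width = len(depth), len(depth[0])
    importance = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            centre = depth[y][x]
            largest_step = 0.0
            # Compare against the four direct neighbours that exist.
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < height and 0 <= nx < width:
                    largest_step = max(largest_step, abs(depth[ny][nx] - centre))
            if largest_step > threshold:
                importance[y][x] = 1.0
    return importance


# A vertical depth edge between the second and third column.
demo_depth = [[0.2, 0.2, 0.8, 0.8] for _ in range(3)]
demo_importance = depth_discontinuity_importance(demo_depth)
```

Only the texels on either side of the edge end up with a non-zero importance, which is what the later collapse and warp steps would then magnify.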
The process is basically shadow map->2D importance map->2x 1D importance maps->blur->2x 1D warping maps. The warping maps are then used to render the full-res shadow map and again during final rendering.
The time-consuming part is doing CPU analysis on the original shadow map and coloring it, I think. Then maybe (I am guessing) it is possible to collapse the result on the GPU, by rendering the colored shadow map to a render target with 1px height for the columns, and then again to another render target with 1px width for the rows. A Gaussian blur on these 1D render targets might also be done, I guess. Now, render the warping map (is this a texture?) by sampling each of the 1D textures once for the corresponding pixel. I'm not sure how to use the warping map to mess around in the vertex and pixel shaders...
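A rough sketch of that data flow in plain Python, CPU-side and purely for illustration. Some assumptions on my part: the collapse uses max (sum would also fit the description), the blur is a simple box filter instead of whatever kernel you prefer, and the warp is built as a normalized cumulative sum of the blurred importance. I'm not claiming that last step matches the paper's exact formula, and all function names are made up.

```python
def collapse_max(importance):
    """Collapse a 2D importance map into one 1D map per axis using max."""
    per_row = [max(row) for row in importance]
    per_column = [max(column) for column in zip(*importance)]
    return per_row, per_column


def box_blur(values, radius=1):
    """Blur a 1D map with a box filter that shrinks at the borders."""
    blurred = []
    for i in range(len(values)):
        window = values[max(0, i - radius):i + radius + 1]
        blurred.append(sum(window) / len(window))
    return blurred


def build_warp(importance_1d, floor=1e-3):
    """Return warped positions of the n+1 texel boundaries in [0, 1].

    Assumption: each texel gets a share of the output proportional to its
    importance; `floor` keeps unimportant texels from collapsing to zero.
    """
    weights = [max(value, floor) for value in importance_1d]
    total = sum(weights)
    edges, running = [0.0], 0.0
    for weight in weights:
        running += weight
        edges.append(running / total)
    return edges


def apply_warp(edges, u):
    """Map an unwarped coordinate u in [0, 1] to its warped coordinate."""
    n = len(edges) - 1
    x = min(max(u, 0.0), 1.0) * n
    i = min(int(x), n - 1)
    return edges[i] + (x - i) * (edges[i + 1] - edges[i])


importance_2d = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
rows_1d, columns_1d = collapse_max(importance_2d)
warp_u = build_warp(box_blur(columns_1d))
warp_v = build_warp(box_blur(rows_1d))
```

The same `apply_warp` lookup would be applied to the light-space coordinates both when rendering the full-res shadow map and when sampling it in the final pass, so both passes agree on where a given point lands.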
There is no CPU analysis necessary; everything can be done in pixel shaders on the GPU. Also, with square SMs the two 1D maps can be combined into a single 2-pixel-wide texture.
A sample implementation of the whole thing is available on Paul Rosen's homepage, in the software section. Very helpful.
For the criteria used in the paper, the backwards projection also needs a view-space depthbuffer and normals, so you'd want to use these for your general lighting as well.
Then there's the tessellation thing. It's only mentioned as an afterthought or optional thing in the paper, but for good results you'll need to ensure decently tessellated geometry, either using HW tessellation or during content creation.
Another thing I still have to work out is that shadow-acne actually increased in some areas using RTW, but this should be easy to fix by scaling the z-bias depending on the local shadow resolution / importance values.
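Something like the following is what I have in mind for the bias, as an untested sketch. The assumption is that you can get a local magnification factor per axis from the warp (1.0 meaning the same resolution as an unwarped shadow map, below 1.0 meaning the region got squeezed), and that the bias should grow where resolution drops. The function and parameter names are hypothetical.

```python
def scaled_bias(base_bias, magnification_u, magnification_v, min_magnification=1e-3):
    """Scale a depth bias by the inverse of the local shadow map resolution.

    base_bias: bias tuned for an unwarped shadow map.
    magnification_u / magnification_v: assumed local stretch of the warp
        along each axis (1.0 = unchanged, < 1.0 = fewer texels than before).
    min_magnification: guard against division by very small values.
    """
    # The coarser axis determines how large a texel's footprint gets.
    worst = max(min(magnification_u, magnification_v), min_magnification)
    return base_bias / worst


unchanged = scaled_bias(0.002, 1.0, 1.0)
squeezed = scaled_bias(0.002, 0.5, 2.0)
magnified = scaled_bias(0.002, 2.0, 2.0)
```

Regions that lost resolution get a larger bias and magnified regions get a smaller one, so the hand-tuned base value only has to be right for the unwarped case.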
Also some WIP pictures:
Projective aliasing is still causing self-shadowing artifacts, even with hand-tuned bias values, so these areas still need some help.
Artifacts on an untessellated cube:
Fixed:
You can probably use RTW. It's one of the best shadow mapping algorithms.
Got any real-world experience with RTW you want to share? :)