If it were properly optimized, couldn't it be faster? I just hate the idea of rasterizing an entire scene for every patch. I was hoping some kind of clever face-to-face visibility determination could eliminate huge numbers of patches that would otherwise need to be checked against each other, and that it could be set up so you could pretty much just throw a list of patches that need to be radiated at a compute shader.
The original radiosity algorithm worked exactly that way, looking at all pairs of patches. There are many papers on how to accelerate it. One advantage of the hemicube is that you immediately get the correct form factor (without having to evaluate the cos θ terms explicitly), whereas for patch-to-patch interaction you need some analytical expression for it. There are a number of these, some based on contour integrals and some on simpler approximations (treating patches as disks, etc.). On the other hand, the hemicube gives you aliasing artifacts, so there's a tradeoff.
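To make the "simpler approximations" concrete, here is a minimal sketch of the common point-to-disk form factor, F ≈ cos θ_i · cos θ_j · A_j / (π r² + A_j). The function name `disk_form_factor` and its parameters are made up for illustration. It assumes unit-length normals and does no visibility test at all, so occlusion would still have to be handled separately (e.g. by ray casting).

```python
import math

def disk_form_factor(p_i, n_i, p_j, n_j, area_j):
    """Approximate form factor from a point on patch i to patch j,
    treating patch j as a disk of the same area.

    p_i, p_j: patch centers as (x, y, z) tuples.
    n_i, n_j: unit normals (assumed normalized by the caller).
    area_j:   area of the emitting patch j.
    Visibility is NOT tested here (assumption: handled elsewhere).
    """
    # Vector from patch i to patch j and its squared length.
    d = [b - a for a, b in zip(p_i, p_j)]
    r2 = sum(c * c for c in d)
    if r2 == 0.0:
        return 0.0

    r = math.sqrt(r2)
    w = [c / r for c in d]  # unit direction i -> j

    # Cosines of the angles between each normal and the connecting line.
    cos_i = sum(a * b for a, b in zip(n_i, w))
    cos_j = -sum(a * b for a, b in zip(n_j, w))

    # Patches facing away from each other exchange no energy.
    if cos_i <= 0.0 or cos_j <= 0.0:
        return 0.0

    # The extra area_j in the denominator keeps the value bounded
    # when the patches are close together.
    return cos_i * cos_j * area_j / (math.pi * r2 + area_j)
```

For two parallel, facing patches one unit apart, with patch j being a unit-radius disk centered on the axis, this matches the exact point-to-disk result R² / (R² + h²). Off-axis or strongly tilted configurations are only approximate, which is the price for something this cheap to evaluate per pair.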