    This may be a silly suggestion, but could you write an XML Schema describing what the sanitizer allows, and then simply validate the XHTML against it? Of course, this assumes you're happy for the sanitizer stage to reject bad input (invalid content -> error message) rather than repair it (invalid content -> valid content).
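
To make the idea concrete, here is a minimal sketch in Java using the standard `javax.xml.validation` API. The schema is a made-up whitelist (a `div` root holding `p` elements, which may contain text plus nested `b` and `i`), and the class name `SanitizerCheck` and method `check` are placeholders of mine, not anything from your code. A real whitelist would obviously be much larger.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SanitizerCheck {
    // Hypothetical whitelist schema: only div > p > (text | b | i)* is allowed.
    private static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + "    xmlns='http://www.w3.org/1999/xhtml'"
      + "    targetNamespace='http://www.w3.org/1999/xhtml'"
      + "    elementFormDefault='qualified'>"
      + "  <xs:complexType name='inline' mixed='true'>"
      + "    <xs:choice minOccurs='0' maxOccurs='unbounded'>"
      + "      <xs:element name='b' type='inline'/>"
      + "      <xs:element name='i' type='inline'/>"
      + "    </xs:choice>"
      + "  </xs:complexType>"
      + "  <xs:element name='div'>"
      + "    <xs:complexType>"
      + "      <xs:sequence>"
      + "        <xs:element name='p' type='inline' minOccurs='0' maxOccurs='unbounded'/>"
      + "      </xs:sequence>"
      + "    </xs:complexType>"
      + "  </xs:element>"
      + "</xs:schema>";

    // Schema objects are immutable and thread-safe, so compile once and reuse.
    private static final Schema SCHEMA = loadSchema();

    private static Schema loadSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("whitelist schema is itself invalid", e);
        }
    }

    /** Returns null if the input validates, otherwise the validator's error message. */
    static String check(String xhtml) {
        try {
            Validator validator = SCHEMA.newValidator();
            // The input is untrusted, so forbid external DTD and schema lookups.
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            validator.validate(new StreamSource(new StringReader(xhtml)));
            return null;
        } catch (SAXException e) {
            // Covers both malformed XML and content the schema does not allow.
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://www.w3.org/1999/xhtml";
        String ok = "<div xmlns='" + ns + "'><p>Hello <b>world</b></p></div>";
        String bad = "<div xmlns='" + ns + "'><p>Hi</p><script>alert(1)</script></div>";
        System.out.println("ok:  " + check(ok));
        System.out.println("bad: " + check(bad));
    }
}
```

The first input should pass and the second should come back with a message complaining about the `script` element. Note that this is exactly the "invalid content -> error message" behaviour: the validator tells you what is wrong but makes no attempt to fix it.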
