There is (almost) nothing wrong with xoring an existing file in place. The only possible issue is that your application crashes and leaves a half-encoded file behind, but admittedly there is not much that could crash in a simple for loop. (Note that I'm intentionally ignoring the possibility of a power failure or the user flipping the Big Switch, because these are things that you can never fully protect yourself against. Even when explicitly syncing, you have no guarantee that all writes are really on permanent storage.)
Probably I'll try to implement it next to see what the performance difference is like. Your example gives good execution times but it's simply xor'ing an existing (or newly created) file. I'd have to copy the file from the original to make that work with my imp. I'll have to fiddle around with how to get the best result from it. (copy first then mod vs read-to-map then mod, etc)
If you still feel better copying the file, the easiest way to do this is to create a new file and a mapping with the desired size. The documentation on MSDN is somewhat misleading, as it will make you believe that SetFileValidData is the most efficient way to allocate a new file, preventing stalls while new clusters are allocated and reducing fragmentation at the same time. Not only does this function require non-standard privileges because of its security implications, it is also entirely unnecessary.
Creating a mapping that is larger than the underlying file will instantly grow the file to that size (or fail, if there's not enough space). Reads from that mapping are satisfied with pages from the zero pool, and any writes create pages that will later be flushed to disk. There is no need to do anything more complicated than creating an empty second file, creating a mapping the same size as the first file, and doing something like a memcpy-with-xor from the first to the second (something like *two = (*one ^ xor_val); ++one; ++two; ). Of course this touches twice as many memory pages as doing the work directly on the original, it increases CPU cache pressure, and it may have associativity effects if you're very unlucky, so it will generally be somewhat slower (though the difference probably won't be very noticeable in comparison to disk I/O).
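To make the memcpy-with-xor idea concrete, here is a minimal sketch of just the copy loop. The name xor_copy and its parameters are made up for illustration. It assumes that you have already obtained two pointers, one into the mapped view of the original file and one into the mapped view of the new file; creating the files and mappings is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copy n bytes from src to dst, xoring each byte with
// key on the way. In the mapped-file case, src would point into the view of
// the original file and dst into the view of the new file (both assumed to
// be at least n bytes long and not overlapping).
void xor_copy(const std::uint8_t* src, std::uint8_t* dst,
              std::size_t n, std::uint8_t key)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<std::uint8_t>(src[i] ^ key);
}
```

Running the same function a second time with the same key, from the encoded data into a third buffer, gives back the original bytes, since xor is its own inverse. A byte at a time is the simplest form; a real implementation might well process larger words, but with disk I/O in the picture that is unlikely to matter much.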
Edited by samoth, 15 November 2012 - 03:57 AM.