This article is from a year ago but it matches all my past observations and things I've been told: https://software.intel.com/en-us/blogs/2014/07/15/an-investigation-of-fast-real-time-gpu-based-image-blur-algorithms
The constant-time compute shader blur is over twice as slow with a small kernel, and it stays slower until the kernel size gets huge. Also consider that when you're writing to a UAV that represents a render target, you're either doing non-coherent writes (because the destination texture is swizzled), or you're writing to linear memory, in which case you'll probably need to re-swizzle it later for it to be useful.
Things could certainly be different now, which would be great, but my general rule has always been: unless I'm doing a whole lot of work with shared memory, the pixel shader's gonna be faster.
That's an apples-to-horseshoes comparison, though, unless I'm misreading - they're comparing the timings of two different techniques (both of which, from the sound of it, run in compute). Yeah, the constant-time shader is slower, but it's solving a different problem. You wouldn't use it for a small kernel because of all the extra overhead involved.
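To make that tradeoff concrete, here's a rough 1-D sketch in Python (my own illustration, not from the article; the function names are mine): a direct box blur does O(r) taps per pixel, while a "constant-time" variant built on a prefix sum does a fixed number of taps per pixel regardless of radius - but it pays the overhead of building the prefix table first, which is why it only wins at large radii.

```python
def box_blur_direct(row, r):
    # Direct convolution: each output pixel sums its (up to 2r+1)-tap
    # window, so per-pixel cost grows linearly with the radius r.
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n - 1, i + r)
        out.append(sum(row[lo:hi + 1]) / (hi - lo + 1))
    return out

def box_blur_prefix(row, r):
    # "Constant-time" variant: build a prefix-sum table once (the extra
    # overhead), then each output pixel is two lookups no matter how big
    # the radius is.
    n = len(row)
    prefix = [0]
    for v in row:
        prefix.append(prefix[-1] + v)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n - 1, i + r)
        out.append((prefix[hi + 1] - prefix[lo]) / (hi - lo + 1))
    return out
```

Both produce the same result; the prefix version's fixed cost just isn't worth it until the direct version's per-pixel tap count gets large.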
There are other ways to make a compute blur faster, like using groupshared memory (which doesn't seem to be mentioned in that article), which can drastically reduce your memory accesses - but the article says nothing about a 1:1 pixel shader vs. compute shader comparison of the same operations.
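A back-of-the-envelope fetch-count model (my numbers, not the article's) shows why groupshared memory helps so much for a separable blur pass: instead of every thread fetching its whole kernel footprint from global memory, the group loads its footprint (plus the apron) once and all the taps read from shared memory.

```python
def global_fetches_naive(group_size, radius):
    # Every thread samples its full (2r+1)-tap footprint straight from
    # global memory.
    return group_size * (2 * radius + 1)

def global_fetches_shared(group_size, radius):
    # The group cooperatively loads group_size texels plus a 'radius'-wide
    # apron on each side into groupshared memory, once; the kernel taps
    # then hit shared memory instead of global memory.
    return group_size + 2 * radius

print(global_fetches_naive(64, 8))   # 1088 global fetches
print(global_fetches_shared(64, 8))  # 80 global fetches
```

For a 64-thread group and radius 8, that's a ~13x reduction in global memory traffic, and the gap widens as the radius grows.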
Also, how you decide to break your compute shader up can make a difference to performance (including cache coherency) - if your compute shader's thread groups load/store things in BLOCKS as opposed to in scanlines, you're likely to get better coherency group to group, because you'll be reading and writing locations that are close together in memory, as I understand it.
I don't think compute postprocessing is NECESSARILY slower than a pixel shader, but it likely depends at least partly on how you've got it set up.