It's also using three xors to swap two variables, instead of using std::swap(), which is much more readable (and probably faster too).
(For pointers and integer types, the performance of three xors would be the same as std::swap on most processors. The three-xor method avoids a temporary and may perform marginally better in some cases.)
True.
At least, true back in the 1980s and earlier.
These days, and for the past 25 years or so, it is very false.
Pipelined CPUs began to make it untrue for desktop computers around 1987 to 1989. A CPU pipeline means that when something is the destination of an operation still in progress, any later instruction that needs that value has to stop until the result is written and retired. The pattern "A XOR B, B XOR A, A XOR B" worked fine on non-pipelined processors, but each step reads the result of the step before it: the second XOR must wait for the first to complete and be retired, and the third must wait for the second in the same way. Nothing can overlap, so every step adds a pipeline bubble. It is the worst-case operation for a pipelined processor.
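To make the dependency chain concrete, here is a minimal C++ sketch of the two swaps side by side. The names xor_swap and temp_swap are made up for illustration, and the self-swap guard is an addition for safety, since XOR-ing a variable with itself zeroes it. This only shows the shape of the code; what a particular compiler and CPU actually do with it will vary.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// XOR swap (hypothetical helper name). Each statement reads the value
// written by the statement before it, so the three XORs form a strict
// dependency chain: 2 needs the result of 1, and 3 needs the result of 2.
inline void xor_swap(std::uint32_t& a, std::uint32_t& b) {
    if (&a == &b) return;  // guard: x ^= x would zero the value
    a ^= b;                // 1: a = a0 ^ b0
    b ^= a;                // 2: b = b0 ^ (a0 ^ b0) = a0
    a ^= b;                // 3: a = (a0 ^ b0) ^ a0 = b0
}

// Swap through a temporary (hypothetical helper name). The reads of a
// and b do not depend on each other, so there is no three-step chain.
inline void temp_swap(std::uint32_t& a, std::uint32_t& b) {
    std::uint32_t tmp = a;
    a = b;
    b = tmp;
}
```

In real code neither helper is needed: std::swap(x, y) from <utility> says what you mean and leaves the choice of instructions to the compiler.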
By the time 1995 and out-of-order processing came around, with a then-astonishingly-huge processing pipeline, the xor-swap was a bad enough problem that Intel had to write code for the Pentium Pro (the flagship at the time) to detect the pattern and rewrite it to use the temporary. When AMD followed suit they also detected it and made it a special case.
If you are lucky enough to be on a "big" processor like an x86, then even if your compiler doesn't recognize the pattern and fix it for you, the x86 processor will likely detect it and replace it with something that is at worst no slower than a swap with a temporary. So yes, it performs well on these chips, but only because the system knows the stupid pattern destroys performance and fixes it by invisibly using a temporary if one is available.
If you are unlucky and are on a "little" processor, maybe a low-powered ARM chip or something embedded, and your compiler doesn't recognize the pattern, you've just added the worst case for a pipeline stall.
The XOR Swap took advantage of a shortage of registers that made temporaries expensive, the fact that touching memory (even something residing in the precious few bytes of cache) caused a relatively long stall, and the results were available instantly as there was no instruction pipeline. These days CPU registers are more plentiful, on-die memory and cache hits are as fast as the clock, and pipelines are deep. It is the opposite of what made the XOR Swap fast.
Another of those fun little tidbits was "Duff's Device", basically using a switch statement as a way to reverse-unroll a loop, letting you jump into the middle of the unrolled body and finish out the job from any point. The jump table of a switch statement generates small and fast code ... up until CPU branch prediction tables were introduced and made sufficiently large. The branch misprediction penalty now typically exceeds the benefit the structure got from the jump table. Now a Duff's Device is an optimizing compiler's nightmare. The pattern still works functionally, but it has known-slow performance. Optimizing compilers recognize the pattern, but the difficult choice is to either leave it in and destroy performance with mispredictions, or take out the jump table and thrash the cache with a large number of processing options, more data and instructions. Better compilers evaluate both of the bad options and make an educated guess at which is less bad. Several newer languages expressly forbid the Duff's Device pattern, forcing you to have a break at the end of every case segment.
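For anyone who hasn't seen one, here is a rough sketch of what a Duff's Device looks like in C++, next to the plain loop it replaces. This is a memory-to-memory byte copy written for illustration; the names duff_copy and plain_copy are made up for this example. Take it as a picture of the pattern, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>

// Duff's Device style copy (hypothetical helper name). The switch jumps
// into the middle of an 8-way unrolled do/while body to handle the
// count % 8 leftover bytes first; after that, every pass copies 8 bytes.
inline void duff_copy(unsigned char* dst, const unsigned char* src,
                      std::size_t count) {
    if (count == 0) return;                // the do/while would copy 8 otherwise
    std::size_t passes = (count + 7) / 8;  // number of trips through the body
    switch (count % 8) {
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}

// The straightforward loop (hypothetical helper name). A compiler is
// free to unroll or vectorize this however suits the target.
inline void plain_copy(unsigned char* dst, const unsigned char* src,
                       std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}
```

Both functions produce the same result for any count. The first hard-codes one unrolling strategy into the source, while the second leaves that decision to the compiler.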
A problem with many of these little performance tidbits (and there are many) is that the reasoning behind them is frequently lost. When that reasoning stops holding because the hardware has evolved, the benefit goes away or even becomes a performance penalty, as the XOR swap shows. Even though more than two decades have passed since it was useful, people still call up that outdated 'trick'. Many of the now-wrong gems are slow to die, so please double-check that they are still valid, and explain the reasoning for the performance benefit when they are.