Lately I have noticed that some projects, for a simple splat, prefer
_mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128( v ), _MM_SHUFFLE(0,0,0,0)))
over
_mm_shuffle_ps( v, v, _MM_SHUFFLE(0,0,0,0) ).
Is there any reason to prefer the first variant?
SIMD Shuffle
There is only one reason: the author of the code read through some timing tables, noticed that the integer shuffle (shuffle_epi32) is listed at half the cycle count of the float SSE shuffle (shuffle_ps), and therefore (incorrectly) assumed that using shuffle_epi32 is twice as fast as shuffle_ps.
Of course, I'm probably now going to have egg on my face as someone describes some quirky member of the x86 family where that is indeed true... (although I highly doubt that's going to happen)
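For anyone who wants to convince themselves that the two forms from the question give the same result, here is a minimal sketch. The helper names (splat_x_ps, splat_x_epi32, splat_variants_agree) are made up for illustration and are not from any particular project. It only checks that the output bits match; it says nothing about which one is faster on a given CPU.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also brings in the SSE float ones */
#include <string.h>     /* memcmp */

/* Hypothetical helper: splat lane 0 with the float shuffle. */
static inline __m128 splat_x_ps(__m128 v)
{
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));
}

/* Hypothetical helper: splat lane 0 with the integer shuffle.
   The casts only reinterpret the register; they do not convert values. */
static inline __m128 splat_x_epi32(__m128 v)
{
    return _mm_castsi128_ps(
        _mm_shuffle_epi32(_mm_castps_si128(v), _MM_SHUFFLE(0, 0, 0, 0)));
}

/* Returns 1 if both variants produce bit-identical output for this input
   and every output lane equals in[0], otherwise 0. */
static inline int splat_variants_agree(const float in[4])
{
    __m128 v = _mm_loadu_ps(in);
    float a[4], b[4];
    int i;

    _mm_storeu_ps(a, splat_x_ps(v));
    _mm_storeu_ps(b, splat_x_epi32(v));

    /* Compare raw bytes so the check also holds for NaN payloads. */
    if (memcmp(a, b, sizeof a) != 0)
        return 0;
    for (i = 0; i < 4; ++i)
        if (memcmp(&a[i], &in[0], sizeof(float)) != 0)
            return 0;
    return 1;
}
```

Calling splat_variants_agree on any four floats should return 1. To compare the actual cost, you would have to look at the generated assembly and measure on the hardware you care about.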