SSE2 Horizontal Additions.

TerrorFLOP    100
Help! I'm in the process of writing a highly optimized math engine. I'm working on an inlined scalar product function which, on my machine, executes in around 0.0017 microseconds (about 1.7 ns per call). Is that good? However, I'm trying to locate the Intel intrinsic _mm_hadd_ps, in the hope that I can shave off some more cycles. Where the *%$^ is it? I've already scanned the xmmintrin.h and emmintrin.h include files, but to no avail. :-( I'd much appreciate it if someone could help. Thanks.
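
For context, here is a minimal sketch of a four-component scalar product whose horizontal sum is done with plain SSE shuffles from xmmintrin.h. That final sum is the step a horizontal-add intrinsic such as _mm_hadd_ps would be meant to shorten. The function name dot4_sse and the four-float vector layout are assumptions made up for illustration, not the actual engine code.

```c
#include <xmmintrin.h>  /* SSE intrinsics: loads, multiply, shuffles */

/* Hypothetical helper (name and layout are illustrative assumptions):
   scalar product of two 4-float vectors, unaligned pointers allowed. */
static inline float dot4_sse(const float *a, const float *b)
{
    __m128 va   = _mm_loadu_ps(a);
    __m128 vb   = _mm_loadu_ps(b);
    __m128 prod = _mm_mul_ps(va, vb);              /* p0, p1, p2, p3 */

    /* Swap neighbours within each pair: p1, p0, p3, p2 */
    __m128 shuf = _mm_shuffle_ps(prod, prod, _MM_SHUFFLE(2, 3, 0, 1));
    __m128 sums = _mm_add_ps(prod, shuf);          /* p0+p1, p0+p1, p2+p3, p2+p3 */

    /* Bring the upper pair sum (p2+p3) down to lane 0 */
    shuf = _mm_movehl_ps(shuf, sums);
    sums = _mm_add_ss(sums, shuf);                 /* lane 0: p0+p1+p2+p3 */

    return _mm_cvtss_f32(sums);                    /* extract lane 0 */
}
```

Calling it with a = {1, 2, 3, 4} and b = {5, 6, 7, 8} gives 5 + 12 + 21 + 32 = 70. The sketch sticks to xmmintrin.h so that it builds on any SSE-capable target without extra compiler flags.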

