Setting stride length with SSE intrinsics

1 comment, last by manfred_p 14 years ago
Hi there. Is there any way to set the stride length when writing data from an XMM register with SSE intrinsics? I have 16 8-bit values stored in an XMM register which are being written to a char array. I would like each 8-bit value in the XMM register to be written at a 16-byte stride (i.e. so that each value written to the char array is separated by 15 bytes). The only way I can think of doing it is serially: iterating through each value in the XMM register and storing it accordingly in the char array. However, it kinda defeats the object of me struggling with SIMD and working in parallel! Or am I asking for the impossible? Thanks
You are probably asking the impossible, as I don't know of such an instruction.
But the better question is why would you be doing that in the first place?
To get the whole benefit of SSE, all your data needs to be packed into the format SSE needs. Having it in some other format quickly defeats any gains you can make from the SSE math functions. Even having to shuffle data with the shuffle instructions can waste all your time if you aren't doing enough math alongside it.
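
For what it's worth, here is a minimal sketch of the serial fallback described in the question, assuming SSE2 and a C99 compiler. The helper name store_bytes_strided and its parameters are made up for illustration. It spills the register to a small temporary buffer with one unaligned store, then does the strided writes with ordinary scalar code, since I don't know of an SSE store that scatters bytes.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: writes the 16 bytes held in v to
 * dst[0], dst[stride], dst[2 * stride], ..., dst[15 * stride].
 * The caller must make sure dst has at least 15 * stride + 1 bytes. */
static void store_bytes_strided(unsigned char *dst, size_t stride, __m128i v)
{
    unsigned char tmp[16];

    assert(dst != NULL);

    /* One unaligned 16-byte store spills the whole register. */
    _mm_storeu_si128((__m128i *)tmp, v);

    /* The strided part is plain scalar code. */
    for (size_t i = 0; i < 16; ++i)
        dst[i * stride] = tmp[i];
}
```

With a stride of 16, as in the question, the destination needs at least 241 bytes. The scatter stays serial, so the parallel part is whatever happens before the store.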

I'm going to link this as it is a good read on what is optimal in SSE.

Thanks for the reply and link. It's for an int to ASCII converter function that sorts eight numbers at a time from a larger array. I'm trying to write each resulting ASCII value to a char array so that I can output it to a text file, so the only SIMD part of my program is the ASCII conversion function.
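
As a rough illustration of keeping the output packed, here is a sketch of the simplest piece of such a conversion, assuming SSE2 and assuming the numbers have already been split into 16 decimal digit values (0 to 9), one per byte. The helper name digits_to_ascii16 is hypothetical and this is not the poster's actual converter. It adds '0' to all 16 bytes at once and writes them contiguously with a single store.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: digits[0..15] each hold a value 0..9;
 * out[0..15] receives the matching ASCII characters '0'..'9'.
 * Neither pointer needs to be 16-byte aligned. */
static void digits_to_ascii16(const unsigned char *digits, char *out)
{
    assert(digits != NULL && out != NULL);

    __m128i d = _mm_loadu_si128((const __m128i *)digits);
    /* Add the character code of '0' to every byte lane. */
    __m128i a = _mm_add_epi8(d, _mm_set1_epi8('0'));
    /* One contiguous 16-byte store, no strides involved. */
    _mm_storeu_si128((__m128i *)out, a);
}
```

Any separators or padding the text file needs can then be added by scalar code when the buffer is written out.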

