polymorphed

SSE question


I'm new to SSE so if this question is stupid, you know why. [smile] The question: Is there an instruction that moves a float into all four elements of a SIMD register? With my limited knowledge, the only way I can see is this:
float my_float = 3.14f;

float temporary[4] = {my_float, my_float, my_float, my_float};

__asm
{
    movups xmm0, temporary
}


Thanks for your time.

You may want to take a look at intrinsics: the _mm_load_ps1 function (also spelled _mm_load1_ps) seems to do what you need. I'm not an expert, but I think this intrinsic performs a MOVSS followed by a shuffle: MOVSS loads the single float into the lowest element of the register, and the shuffle (SHUFPS) then copies it into the other three. :-)
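
Here is a minimal sketch of the broadcast using intrinsics from <xmmintrin.h>. The helper names splat4 and splat4_from_memory are made up for illustration; _mm_set1_ps, _mm_load1_ps and _mm_storeu_ps are the actual intrinsics. Which instructions the compiler emits for them depends on the compiler and target, so treat the MOVSS + shuffle description as the likely outcome rather than a guarantee.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set1_ps, _mm_load1_ps, _mm_storeu_ps */

/* Broadcast one float into all four elements of an SSE register and
   write the four elements to out[0..3].
   splat4 is a hypothetical helper name, not a library function. */
static void splat4(float x, float out[4])
{
    __m128 v = _mm_set1_ps(x);    /* same value in all four elements */
    _mm_storeu_ps(out, v);        /* unaligned store, so out needs no special alignment */
}

/* Same result, but reading the scalar through a pointer. */
static void splat4_from_memory(const float *px, float out[4])
{
    __m128 v = _mm_load1_ps(px);  /* _mm_load_ps1 is an alternate name for this intrinsic */
    _mm_storeu_ps(out, v);
}
```

Calling splat4(my_float, temporary) would fill the array the same way as the hand-written initializer in the question, without going through inline assembly.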


Hope this helps

You're probably looking for the SSE intrinsic _mm_load_ps() (load packed single), which takes a pointer to four floats aligned on a 16-byte boundary and returns a __m128 value.

Intel provides the datatype __m128 for a vector of 4 floats.

When I work with SSE, I usually do something like:


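#include <xmmintrin.h>  // SSE intrinsics and the __m128 type

typedef union
{
    float v[4];
    __m128 r;
} vector4f;

vector4f v;

v.r = _mm_load_ps(v.v);  // load the four floats into the register member

// use SSE intrinsics here

_mm_store_ps(v.v, v.r);  // write the result back to the float array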

One of the advantages of NerdInHisShoe's approach is that the compiler will align the union on a 128-bit (16-byte) boundary, because it contains an __m128 member. This means you can use the aligned SSE load and store functions, which are faster. You can achieve the same result yourself using __declspec(align(16)) in VS.
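
To make the aligned versus unaligned distinction concrete, here is a small sketch. It uses the C11 alignas specifier from <stdalign.h> as a portable stand-in for __declspec(align(16)); the helper names is_aligned16, add_one_aligned and add_one_unaligned are invented for this example. How much faster the aligned forms are depends on the CPU, so take the speed claim as a rule of thumb.

```c
#include <stdalign.h>   /* alignas (C11) */
#include <stdint.h>     /* uintptr_t */
#include <xmmintrin.h>  /* _mm_load_ps, _mm_loadu_ps, _mm_store_ps, _mm_storeu_ps */

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Adds 1.0f to four floats. data MUST be 16-byte aligned, because
   _mm_load_ps and _mm_store_ps can fault on unaligned addresses. */
static void add_one_aligned(float *data)
{
    __m128 v = _mm_load_ps(data);
    v = _mm_add_ps(v, _mm_set1_ps(1.0f));
    _mm_store_ps(data, v);
}

/* Same operation with the unaligned forms, which accept any address. */
static void add_one_unaligned(float *data)
{
    __m128 v = _mm_loadu_ps(data);
    v = _mm_add_ps(v, _mm_set1_ps(1.0f));
    _mm_storeu_ps(data, v);
}
```

A buffer declared as alignas(16) float a[4]; can be passed to add_one_aligned, while a pointer into the middle of an ordinary float array should go through add_one_unaligned.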

However, while you can use the __m128 and float members of the union interchangeably, note that they live in different registers, so you will probably cause the compiler to push data out to memory only to read it straight back. In other words, try to keep the sections of your code that work on the float variables separate from the sections that use the __m128 variables.
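
As a rough illustration of that advice, the sketch below loads once, does all of the arithmetic on __m128 values, and stores once, instead of touching the individual floats between SSE operations. The function name scale_add is hypothetical, and whether the compiler really keeps everything in registers is up to its optimizer.

```c
#include <xmmintrin.h>  /* _mm_loadu_ps, _mm_set1_ps, _mm_mul_ps, _mm_add_ps, _mm_storeu_ps */

/* Computes out[i] = a[i] * s + b[i] for i = 0..3.
   All intermediate values stay in __m128 variables; the float arrays
   are only touched by the loads at the start and the store at the end.
   scale_add is a hypothetical helper name. */
static void scale_add(const float *a, const float *b, float s, float *out)
{
    __m128 va = _mm_loadu_ps(a);             /* load inputs once */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vs = _mm_set1_ps(s);              /* broadcast the scale factor */
    __m128 r  = _mm_add_ps(_mm_mul_ps(va, vs), vb);
    _mm_storeu_ps(out, r);                   /* write the result once */
}
```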

Good luck!
