
packed multiplication of floats on 8087?

This topic is 4834 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


I came across a situation that would benefit incredibly by being able to do a packed mul of floats, similar to pmul on ints in MMX. Problem is my target platform only has MMX, no SIMD or SSE. So, is there any instruction on the standard 8087 that would allow me to do packed multiplication of floats?

What I will do is split up a value into its three component bytes, and multiply each byte by the same floating point value. (I'm doing bilinear interpolation.) I will have a 32-bit register looking like this:

XXXXXXXX.00000000.11111111.22222222

X is dummy; 0, 1 and 2 are the separate bytes. I also have a floating point value between 0.0 and 1.0. I can under no circumstances have overflow, so the instruction needs to work on bytes.

Is there anything at all like this in the standard 8086 instructions? Or is there a trick of some kind to achieve this? I've been looking around the net for a while but can't seem to find anything on the subject. I'd appreciate any help on the topic.

Quote:
Original post by Bad Maniac
I came across a situation that would benefit incredibly by being able to do a packed mul of floats, similar to pmul on ints in MMX. Problem is my target platform only has MMX, no SIMD or SSE.
Eh? So what's the problem?

And why do you refer to the 808[67] instruction set? That thing was 8-bit, and nobody in their right mind would actually have power going through one of those anymore.

PMUL only works on ints. I want to multiply FLOATS; you might want to read the title...

Also, you are thinking of the 8088, which was 8 bit. The 8087 is the math co-processor, which is still sitting in P4s and Athlons (which are 80x86 processors), and can do floating point operations all the way up to 128 bit on the latest chip. But that's on one value per register at a time. I'd like to do a packed FMUL, which is what I'm asking for.

The OP wants this... Take a 32-bit word representing 4 8-bit unsigned bytes, A B C D, a 32-bit floating-point value, F, and produce a 32-bit word containing A' B' C' D', such that A' = F*A, B' = F*B...

If such an instruction exists... cool. I'd think it should, but I know you don't need that specific instruction. I've seen bilinear interpolation done full-screen on about 500 perspective-correct on-screen polygons at pretty good speed using MMX acceleration (source: Tomb Raider 2, software rendering mode, Pentium II 350 MHz).
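
For illustration, here is one way the operation can be done without any packed float instruction: convert the float weight to a fixed-point integer once, then scale each byte with integer math. This is only a minimal sketch in C, not code from the thread. The function name `scale_packed_bytes` and the 8.8 fixed-point format are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): scale the three low bytes of a
 * packed 32-bit value by f, where f is assumed to lie in [0.0, 1.0].
 * The top (dummy) byte is passed through unchanged. */
uint32_t scale_packed_bytes(uint32_t pixel, float f)
{
    /* Convert the weight to 8.8 fixed point once: 0.0 -> 0, 1.0 -> 256. */
    uint32_t w = (uint32_t)(f * 256.0f + 0.5f);

    /* Unpack the three component bytes. */
    uint32_t c0 = (pixel >> 16) & 0xFFu;
    uint32_t c1 = (pixel >> 8) & 0xFFu;
    uint32_t c2 = pixel & 0xFFu;

    /* byte * w is at most 255 * 256, so after >> 8 each result fits in a byte. */
    c0 = (c0 * w) >> 8;
    c1 = (c1 * w) >> 8;
    c2 = (c2 * w) >> 8;

    /* Repack, keeping the dummy byte as it was. */
    return (pixel & 0xFF000000u) | (c0 << 16) | (c1 << 8) | c2;
}
```

Because the weight never exceeds 256, no component can overflow its byte. The same integer weight is what a packed 16-bit integer multiply would take after the bytes are unpacked to words.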

Well, that explicit instruction doesn't exist in MMX. MMX does only integer-integer arithmetic. I also highly doubt there is such an instruction in the SSE or SSE2 packages, and I'd recommend against it anyway: it takes quite some time to switch into and out of SSE 'mode', which can murder your performance if you can't do all your processing in large bursts.
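
Since only integer arithmetic is available, one possible trick is to pre-convert the weight to an integer in 0..256 and scale two of the bytes with a single ordinary 32-bit multiply. The sketch below is an illustration in C, not something proposed in the thread. The name `scale_packed_swar` and the choice of passing the weight as an integer are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): scale bytes 0..2 of a packed
 * 32-bit value by w/256, where w is assumed to be in 0..256.
 * The top (dummy) byte is passed through unchanged. */
uint32_t scale_packed_swar(uint32_t pixel, uint32_t w)
{
    /* Bytes 2 and 0 sit in separate 16-bit lanes. Each lane's product is at
     * most 255 * 256, which fits in 16 bits, so one multiply scales both
     * bytes without a carry crossing between lanes. */
    uint32_t hi_lo = ((pixel & 0x00FF00FFu) * w >> 8) & 0x00FF00FFu;

    /* Byte 1 gets its own multiply; the mask keeps the scaled byte in place. */
    uint32_t mid = ((pixel & 0x0000FF00u) * w >> 8) & 0x0000FF00u;

    return (pixel & 0xFF000000u) | hi_lo | mid;
}
```

For example, a weight of 0.5 becomes `w = 128`, and `scale_packed_swar(0xAA804020, 128)` halves the three low bytes while leaving the top byte alone. This uses two multiplies instead of three and needs no MMX or SSE state at all.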

