jenova

can this be optimized further?

MIPS IV processor (specifically the PS2, as it does NOT support 64-bit floating point). converting a 64-bit floating point value to 32-bit floating point in software, ignoring NaN, non-normalized values, and exponent over/underflow. do not worry about pipeline stalls; the assembler reorders as necessary. any response would be greatly appreciated.

entry - a0: 64-bit floating point value.
return - v0: 32-bit floating point value.
    
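  beq      a0, zero, 31f        // an all-zero input returns 0 immediately.
  addu     v0, zero, zero       // (branch delay slot) v0 = 0.

  // convert the 11-bit exponent to 8-bit.
  dsll     t0, a0, 1            // shift out the sign bit.
  dsrl32   t0, t0, 21           // shift right by 32 + 21 = 53: t0 = biased exponent.
  addiu    t0, t0, -896         // rebias: -1023 + 127.
  andi     t0, t0, 0xff
  sll      t0, t0, 23           // move into the float exponent field.

  // truncate the 52-bit stored fraction to 23 bits.
  dsrl     t1, a0, 29
  li       t2, 0x7fffff
  and      t1, t1, t2

  // retrieve the sign bit.
  dsrl32   a0, a0, 0            // shift right by 32.
  li       t3, 0x80000000
  and      v0, a0, t3

  // combine into the 32-bit floating point value.
  or       v0, v0, t0
  or       v0, v0, t1
31:
  // return.
  jr       ra
  nop                           // branch delay slot.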
    
Edit: dsrl32 t0, t0, 53 -> dsrl32 t0, t0, 21. gets truncated anyway, but still wrong.

To the vast majority of mankind, nothing is more agreeable than to escape the need for mental exertion... To most people, nothing is more troublesome than the effort of thinking.

Edited by - jenova on February 1, 2002 9:19:03 AM
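
For readers who want to check the routine off-target, here is a minimal C sketch that mirrors the assembly step by step. It is a reference model only, not the poster's code: the function names `double_bits_to_float_bits` and `convert` are hypothetical, and it assumes the host stores `double` and `float` in IEEE 754 binary64/binary32 layouts.

```c
#include <stdint.h>
#include <string.h>

/* Bit-for-bit model of the assembly routine (hypothetical name).
   Like the original, it truncates and is only meaningful for +0.0 and for
   values whose exponent fits a normal 32-bit float. */
static uint32_t double_bits_to_float_bits(uint64_t a0)
{
    if (a0 == 0)                                   /* beq a0, zero, 31f */
        return 0;

    /* dsll 1 / dsrl32 21: drop the sign bit, keep the 11-bit exponent. */
    uint32_t exp = (uint32_t)((a0 << 1) >> 53);
    /* addiu -896 / andi 0xff / sll 23: rebias, mask, move into place. */
    exp = ((exp - 896u) & 0xffu) << 23;

    /* dsrl 29 / and 0x7fffff: keep the top 23 fraction bits. */
    uint32_t frac = (uint32_t)(a0 >> 29) & 0x7fffffu;

    /* dsrl32 0 / and 0x80000000: isolate the sign bit. */
    uint32_t sign = (uint32_t)(a0 >> 32) & 0x80000000u;

    return sign | exp | frac;
}

/* Convenience wrapper (hypothetical) that works on C floating-point types. */
static float convert(double d)
{
    uint64_t in;
    uint32_t out;
    float f;
    memcpy(&in, &d, sizeof in);
    out = double_bits_to_float_bits(in);
    memcpy(&f, &out, sizeof f);
    return f;
}
```

In this model, `convert(1.0)` and `convert(-2.5)` round-trip exactly. One case worth noting: the zero test compares the whole 64-bit pattern, so -0.0 (bit pattern 0x8000000000000000) skips the early return, and the masked exponent then yields 0xC0000000, which is -2.0.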

I don't know MIPS that well; I just downloaded a MIPS64 reference off the web site. I wanted to try to cut a couple of cycles off your code, but I have one question: is the following correct?

addiu t0, t0, -896

I follow what needs to be done and what you are trying to do, except for that one line. Aren't you supposed to subtract 1023 to remove the double-precision exponent bias, clamp it to an 8-bit value first, and then add 127 as your new exponent bias? Or is what you did a bit-manipulation trick that I'm not seeing?

-potential energy is easily made kinetic-

yeah, the "andi t0, t0, 0xff" does this. i'm totally ignoring exponent over/underflow because handling it would further degrade performance w/o much in return. so if the 64-bit float's biased exponent is outside the range (896, 1151), exclusive at both ends, the result is invalid.

the "addiu t0, t0, -896" is the sub "1023" and add "127" in one step.
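
The arithmetic behind that single constant can be checked with a short C sketch. The helper names below are hypothetical; the only facts used are the IEEE 754 exponent biases (1023 for 64-bit, 127 for 32-bit) and that float exponent fields 0 and 255 are reserved for zero/denormals and infinity/NaN.

```c
#include <stdint.h>

/* IEEE 754 exponent biases. */
enum { DOUBLE_BIAS = 1023, FLOAT_BIAS = 127 };

/* Remove the double bias, then apply the float bias (two steps). */
static int32_t rebias_two_steps(int32_t e)
{
    return (e - DOUBLE_BIAS) + FLOAT_BIAS;
}

/* The same adjustment folded into one constant: -1023 + 127 = -896. */
static int32_t rebias_one_step(int32_t e)
{
    return e - 896;
}

/* A normal float needs an exponent field of 1..254, so the biased double
   exponent must lie in 897..1150 for the fast path to be valid. */
static int exponent_fits_float(int32_t e)
{
    int32_t r = rebias_one_step(e);
    return r >= 1 && r <= 254;
}
```

For example, a biased double exponent of 1023 (the value 1.0) maps to 127 either way, while 896 and 1151 are the first values on each side that no longer fit.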

now i could use the PS2-specific min/max instructions, but this potentially adds 3 instructions to this function.
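
As a rough illustration of what such a clamp would buy, here is a C sketch of saturating the rebiased exponent instead of masking it. This is an assumption about the intent, not the poster's code, and the names `clamp_i32` and `clamped_float_exponent` are hypothetical stand-ins for the min/max instructions.

```c
#include <stdint.h>

/* Stand-in for a max followed by a min (hypothetical helper). */
static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Float exponent field (0..255) for a biased 11-bit double exponent.
   Out-of-range inputs saturate instead of wrapping around. */
static uint32_t clamped_float_exponent(uint32_t double_exp)
{
    int32_t rebiased = (int32_t)double_exp - 896;
    return (uint32_t)clamp_i32(rebiased, 0, 255);
}
```

With the mask alone, a biased exponent of 1200 becomes (1200 - 896) & 0xff = 48, a small but valid-looking exponent; with the clamp it saturates to 255. Note that a field of 255 with a nonzero fraction reads as NaN, so a full fix would also have to clear the fraction bits in the saturated cases.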

To the vast majority of mankind, nothing is more agreeable than to escape the need for mental exertion... To most people, nothing is more troublesome than the effort of thinking.

Jenova, does your assembly code work (the one that's posted here)? The code I wrote was based off yours; I won't even post it if yours didn't work. Do you know where I can get docs on the ISA extensions introduced to the MIPS architecture in the EE? Exactly which of the MIPS64 instructions were excluded, and the 100+ new instructions... I had started writing another version based on some IEEE 754 docs I have, and I think I can make it run as fast as your code but support NaNs, +/-infinity, and rounding. The only thing is I need to know the following:
1. Does the EE support the CLO/CLZ/DCLO/DCLZ instructions, and how many cycles do they take to execute?
2. Does the EE support MOVN/MOVZ and doubleword (64-bit) versions of those instructions?
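
To make the goal concrete, the following C sketch shows one way a conversion with NaN, infinity, and round-to-nearest-even handling could look. It is not the version described in the post (which was never shown); the function name is hypothetical, and it assumes results too small for a normal float are flushed to signed zero rather than converted to denormals.

```c
#include <stdint.h>

/* Hypothetical sketch: binary64 bits -> binary32 bits with rounding.
   Assumption: underflow flushes to signed zero (no denormal results). */
static uint32_t double_to_float_bits_rn(uint64_t bits)
{
    uint32_t sign = (uint32_t)(bits >> 32) & 0x80000000u;
    uint32_t exp  = (uint32_t)(bits >> 52) & 0x7ffu;
    uint64_t frac = bits & 0x000fffffffffffffULL;

    if (exp == 0x7ffu)          /* infinity, or NaN (returned as a quiet NaN) */
        return sign | 0x7f800000u | (frac ? 0x00400000u : 0u);
    if (exp >= 1151u)           /* too large for a float: signed infinity */
        return sign | 0x7f800000u;
    if (exp <= 896u)            /* zero or too small: flush to signed zero */
        return sign;

    uint32_t mant = (uint32_t)(frac >> 29);        /* top 23 fraction bits */
    uint32_t rest = (uint32_t)frac & 0x1fffffffu;  /* 29 discarded bits */
    uint32_t out  = ((exp - 896u) << 23) | mant;

    /* Round to nearest, ties to even. A carry out of the fraction bumps
       the exponent, which is the correct result (possibly infinity). */
    if (rest > 0x10000000u || (rest == 0x10000000u && (mant & 1u)))
        out += 1;

    return sign | out;
}
```

For instance, the double nearest to 0.1 (0x3FB999999999999A) rounds up to 0x3DCCCCCD under this sketch, whereas plain truncation would give 0x3DCCCCCC.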

Also, here is a link you might find useful: http://http.cs.berkeley.edu/~jhauser/arithmetic/softfloat.html

EDIT - I almost forgot... is the processor set up in big- or little-endian mode?

-potential energy is easily made kinetic-

Edited by - Infinisearch on February 11, 2002 3:35:11 AM
