Auto clamping 'char'?

8 comments, last by Sander 20 years, 4 months ago
I have this little test code for my project's math lib:

ubyte u;

u = 254;
cout << int(u) << "\n";

u++;
cout << int(u) << "\n";

u++;
cout << int(u) << "\n";

u++;
cout << int(u) << "\n";
ubyte is just typedef'd as unsigned char. The output is of course "254", "255", "0", "1". However, I'd like it to output "254", "255", "255", "255" instead. I want clamping. Can this be done easily, or will I need if() statements?

Sander Maréchal [Lone Wolves Game Development][RoboBlast][Articles][GD Emporium][Webdesign][E-mail]

You need to use if statements.
How about this?
class ubyte
{
public:
  ubyte() : v_(0) {}
  explicit ubyte(unsigned char v) : v_(v) {}
  ubyte(int v) { assign(v); }

  ubyte& operator=(unsigned char v) { v_ = v; return *this; }
  ubyte& operator=(int v) { assign(v); return *this; }

  // pre-increment: stay at 255 instead of wrapping
  ubyte& operator++()
  {
    if (v_ != 255)
      ++v_;
    return *this;
  }

  // post-increment: returns the old value by value, not by reference
  ubyte operator++(int)
  {
    ubyte t = *this;
    ++(*this);
    return t;
  }

  // same thing for decrement...

  // so we can use it like an unsigned char
  operator unsigned char() const { return v_; }

private:
  // clamp an int into the 0..255 range
  void assign(int v)
  {
    if (v > 255)
      v_ = 255;
    else if (v < 0)
      v_ = 0;
    else
      v_ = static_cast<unsigned char>(v);
  }

  unsigned char v_;
};
There's an MMX instruction that'll do what you want: check out paddusb.
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3
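As a hedged illustration of the paddusb suggestion: on compilers that ship <emmintrin.h>, the SSE2 intrinsic _mm_adds_epu8 is the 128-bit form of the same saturating unsigned byte add. This sketch is not from the thread. The function name add_saturate_16 is made up for illustration, and the scalar fallback is an assumption for targets where SSE2 is not available.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics, including _mm_adds_epu8
#endif

// Hypothetical helper: out[i] = min(x[i] + y[i], 255) for 16 bytes.
void add_saturate_16(const std::uint8_t* x, const std::uint8_t* y,
                     std::uint8_t* out)
{
    assert(x && y && out);
#if defined(__SSE2__)
    // Unaligned loads, one saturating add over 16 lanes, unaligned store.
    __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(x));
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(y));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_adds_epu8(a, b));
#else
    // Scalar fallback with the same result, one byte at a time.
    for (int i = 0; i < 16; ++i) {
        unsigned s = unsigned(x[i]) + unsigned(y[i]);
        out[i] = static_cast<std::uint8_t>(s > 255u ? 255u : s);
    }
#endif
}
```

This only pays off when many bytes are processed at once (pixel rows, for example); for a single counter like the one in the question, a plain comparison is simpler.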
Thanks all

Conditions and chars... your CPU won't like it.

Sander, it's possible to simulate MMX-style operations with plain integer arithmetic and logical ops, without any if(). I did it in the past, before MMX existed. I remember some crazy asm code that could, for instance, compute the absolute Manhattan distance between two RGB pixels, in 9 cycles if I remember correctly. This was useful for speeding up the conversion of 16-bit textures to 256 colors.

Now here is how you can simulate RGB addition a=b+c with clamping. The trick is you need to detect carry propagation.

uint32 a, b, c; // b and c hold the two packed RGB values to add
uint32 d, e;

d = b ^ c; // xor: bitwise addition without carries
a = b + c; // addition: carries propagate
d ^= a; // detect differences, that is, where carry bits have been added
d &= 0x01010100UL; // keep only the carry bits at positions 8, 16, 24
a -= d; // subtract the carries that propagated between components
e = d >> 8; // shift so that 256 becomes 1
d -= e; // e.g. 0x01000100 gives the mask 0x00FF00FF
a |= d; // OR in the mask, which is the actual unsigned clamping

There are many other ways to do it, but they are all equally tedious. You can greatly speed things up if one of the operands is known to have only components between 0 and 127, that is, with bits 7, 15, and 23 unset.
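For comparison, here is a minimal sketch of one of those other ways, not the poster's exact method. The carry-detection snippet above can flag a false carry when a carry rippling in from the lower component pushes a component that sums to exactly 0xFF over the top. The variant below sidesteps that by adding the low 7 bits of every byte separately, so no carry ever crosses a byte boundary. The function name add_sat_u8x4 and the constant names are assumptions made up for this example.

```cpp
#include <cstdint>

// Saturating add of four unsigned 8-bit lanes packed into one 32-bit word.
constexpr std::uint32_t add_sat_u8x4(std::uint32_t b, std::uint32_t c)
{
    constexpr std::uint32_t kLow7 = 0x7F7F7F7Fu;  // low 7 bits of each byte
    constexpr std::uint32_t kHigh = 0x80808080u;  // top bit of each byte

    // Sum of the low 7 bits per byte: at most 0xFE, so it stays in its byte.
    const std::uint32_t low = (b & kLow7) + (c & kLow7);

    // Wrapped per-byte sum: fold the two top bits in with xor (no carry out).
    const std::uint32_t sum = low ^ ((b ^ c) & kHigh);

    // A byte overflows when at least two of (b7, c7, carry into bit 7) are set.
    const std::uint32_t ovf = ((b & c) | ((b | c) & low)) & kHigh;

    // Turn each overflow flag (0x80) into a full 0xFF byte mask.
    const std::uint32_t mask = (ovf >> 7) * 0xFFu;

    return sum | mask;  // overflowed bytes become 0xFF
}
```

For example, add_sat_u8x4(0x000000C8, 0x00000064) yields 0x000000FF (200 + 100 clamps to 255), while add_sat_u8x4(0x0000FFFF, 0x00000001) yields 0x0000FFFF and leaves the third byte at zero. Because nothing leaves the 32-bit word, the same code also covers a fourth (alpha) byte.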

Note that to handle RGBA you need "33 bits". In fact, you would need an sbb ecx,ecx to get the last carry bit, which can't easily be done in C. The asm code can only be efficient by processing at least two RGBA values in parallel, as MMX does. A rough count tells me it's around 5-8 cycles for 64 bits: much slower than MMX, of course, but also much better than if(r>255) etc.

So I am not sure it's worth the challenge of simulating such byte-vector SIMD, since any modern machine has MMX or an equivalent.
"Coding math tricks in asm is more fun than Java"
Well, if I replied about it even though it's deprecated, it's to show that, particularly today with MMX/3DNow or SSE/SSE2, someone experienced with hacks that combine arithmetic, logical operations, and floating-point format tricks can create some very original and impressive computations. What's very interesting with SIMD is that you can treat the same 64 or 128 bits as floats, bytes, or ints with little or no time lost converting or casting.

Cf. my AABox/Plane computation with 3DNow, Sander. It's done in 9 instructions. AABox/Frustum can thus be done in 6-30 cycles: roughly 100 million per second, or 1 million per frame at 100 FPS.

Note that an order of magnitude of 1 million collision checks per frame between AABoxes and convex hulls may give some ideas for collision detection... Do you want one million cube particles colliding on the screen? Use a quasi-linear-time space partition, then...
You can also do it with a one-liner like this:

a = (a < 0) ? 0 : ((a < 256) ? a : 255);
quote:Original post by Aldacron
You can also do it with a one-liner like this:

a = (a < 0) ? 0 : ((a < 256) ? a : 255);


I assume you mean

a = (a < 0) ? 0 : ((a < 256) ? a + 1 : 255);
--Riley
quote:Original post by rileyriley
I assume you mean

a = (a < 0) ? 0 : ((a < 256) ? a + 1 : 255);

Or even:
a = (a < 255) ? a + 1 : 255;

It's an unsigned char; it can't be less than 0. And incrementing it when it's equal to 255 will make it wrap.
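Wrapped up as a function, the last one-liner might look like the sketch below. The name saturating_inc is hypothetical and not something from the thread. The test happens before the increment, so the value never gets the chance to wrap.

```cpp
// Hypothetical helper: increment an unsigned char, sticking at 255.
constexpr unsigned char saturating_inc(unsigned char a)
{
    // a + 1 is computed as int, so cast back explicitly once we know it fits.
    return (a < 255) ? static_cast<unsigned char>(a + 1)
                     : static_cast<unsigned char>(255);
}
```

Starting from 254 and calling it three times gives 255, 255, 255, which is the output the original question asked for.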
