/* first, perform an endian switch on the buffer */
/* setup this byte */
char tempByte = 0;
char thisByte = Buffer;
tempByte |= (thisByte & 0x80) >> 7;
tempByte |= (thisByte & 0x40) >> 5;
tempByte |= (thisByte & 0x20) >> 3;
tempByte |= (thisByte & 0x10) >> 1;
tempByte |= (thisByte & 0x08) << 1;
tempByte |= (thisByte & 0x04) << 3;
tempByte |= (thisByte & 0x02) << 5;
tempByte |= (thisByte & 0x01) << 7;
/* Store the temp byte at the end of the temp buffer */
Transposing bits in a stream
Just a theoretical question here. The test I was given (don't worry, the test is long over with) was to exchange the first bit in a buffer with the last bit, the next bit with the next to last bit, and so on. It had to be written in C.
Basically, switching endian is pretty easy, but switching the bits is something a little more problematic (as far as I know).
I can up with this algorithm, but tell me if I'm missing something:
Anything more sly out there?
Couldn't you just XOR the correct bits pairwise and if the XOR returns true, flip the bits, and leave them alone if the XOR test returns false?
My initial thought is no. But maybe I'm not understanding what you mean. My brain is starting to get mushy. Doesn't that just get you your original byte back?!?
If you have two bits, bit1 and bit2, and you wish to exchange them you could (then generalize to more bit pairs):
since
x y XOR
0 0 0
0 1 1
1 0 1
1 1 0
Maybe I'm not understanding you; I was just throwing this out there.
edit: XOR Swap Algorithm?
Lemme think some more, I hear you.
[Edited by - signal_ on August 13, 2009 11:00:54 PM]
if( bit1 XOR bit2 == true ){ bit1 = !bit1 // since the XOR op is true, flip the bits bit2 = !bit2 // if the XOR is false, do nothing.}
since
x y XOR
0 0 0
0 1 1
1 0 1
1 1 0
Maybe I'm not understanding you; I was just throwing this out there.
edit: XOR Swap Algorithm?
Lemme think some more, I hear you.
[Edited by - signal_ on August 13, 2009 11:00:54 PM]
Ahh, OK. But I am talking about switching the bits in an entire byte. So, bit 7 of the byte becomes bit 0 (and visa verse).
EDIT: Keep in mind this must be done in C.
EDIT: Keep in mind this must be done in C.
Quote:Anything more sly out there?
If you could use the features of the C language to accomplish this task, then this is what I'd do.
#include <stdio.h>#include <stdlib.h>typedef struct{ unsigned char bit0 : 1; unsigned char bit1 : 1; unsigned char bit2 : 1; unsigned char bit3 : 1; unsigned char bit4 : 1; unsigned char bit5 : 1; unsigned char bit6 : 1; unsigned char bit7 : 1;} TBitStruct;int main(int argc, char * argv[]){ unsigned short ch; for(ch = 0; ch < 256; ++ch) { TBitStruct bitStruct1 = *(TBitStruct*)(&ch); TBitStruct bitStruct2 = *(TBitStruct*)(&ch); // XOR swap: http://en.wikipedia.org/wiki/XOR_swap_algorithm // You could use a regular swap, but this one is more fun. bitStruct2.bit0 ^= bitStruct2.bit7; bitStruct2.bit7 ^= bitStruct2.bit0; bitStruct2.bit0 ^= bitStruct2.bit7; bitStruct2.bit1 ^= bitStruct2.bit6; bitStruct2.bit6 ^= bitStruct2.bit1; bitStruct2.bit1 ^= bitStruct2.bit6; bitStruct2.bit2 ^= bitStruct2.bit5; bitStruct2.bit5 ^= bitStruct2.bit2; bitStruct2.bit2 ^= bitStruct2.bit5; bitStruct2.bit3 ^= bitStruct2.bit4; bitStruct2.bit4 ^= bitStruct2.bit3; bitStruct2.bit3 ^= bitStruct2.bit4; // Debug printf("%i -->\n", ch); printf("\t%.2X %.2X %.2X %.2X %.2X %.2X %.2X %.2X\n", bitStruct1.bit0, bitStruct1.bit1, bitStruct1.bit2, bitStruct1.bit3, bitStruct1.bit4, bitStruct1.bit5, bitStruct1.bit6, bitStruct1.bit7); printf("\t%.2X %.2X %.2X %.2X %.2X %.2X %.2X %.2X\n", bitStruct2.bit0, bitStruct2.bit1, bitStruct2.bit2, bitStruct2.bit3, bitStruct2.bit4, bitStruct2.bit5, bitStruct2.bit6, bitStruct2.bit7); printf("\n"); if((ch + 1) % 6 == 0) { system("pause"); system("cls"); } } return 0;}
First 6 numbers output:
0 --> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 001 --> 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 012 --> 00 01 00 00 00 00 00 00 00 00 00 00 00 00 01 003 --> 01 01 00 00 00 00 00 00 00 00 00 00 00 00 01 014 --> 00 00 01 00 00 00 00 00 00 00 00 00 00 01 00 005 --> 01 00 01 00 00 00 00 00 00 00 00 00 00 01 00 01Press any key to continue . . .
If you compiled that code and then looked at the generated assembly, you'd get the non-struct version that is similar in format to you code. I've not actually gone through and converted it into that format though, but as long as you understand assembly, it's pretty simple. This is the code I get in Visual Studio 2008:
; 23 : ; 24 : // XOR swap: http://en.wikipedia.org/wiki/XOR_swap_algorithm; 25 : ; 26 : bitStruct2.bit0 ^= bitStruct2.bit7; 0004d 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 00050 24 01 and al, 1 00052 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00055 c0 e9 07 shr cl, 7 00058 80 e1 01 and cl, 1 0005b 0f b6 d1 movzx edx, cl 0005e 0f b6 c0 movzx eax, al 00061 33 c2 xor eax, edx 00063 24 01 and al, 1 00065 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00068 80 e1 fe and cl, 254 ; 000000feH 0006b 0a c8 or cl, al 0006d 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 27 : bitStruct2.bit7 ^= bitStruct2.bit0; 00070 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 00073 c0 e8 07 shr al, 7 00076 24 01 and al, 1 00078 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0007b 80 e1 01 and cl, 1 0007e 0f b6 d1 movzx edx, cl 00081 0f b6 c0 movzx eax, al 00084 33 c2 xor eax, edx 00086 24 01 and al, 1 00088 c0 e0 07 shl al, 7 0008b 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0008e 80 e1 7f and cl, 127 ; 0000007fH 00091 0a c8 or cl, al 00093 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 28 : bitStruct2.bit0 ^= bitStruct2.bit7; 00096 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 00099 24 01 and al, 1 0009b 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0009e c0 e9 07 shr cl, 7 000a1 80 e1 01 and cl, 1 000a4 0f b6 d1 movzx edx, cl 000a7 0f b6 c0 movzx eax, al 000aa 33 c2 xor eax, edx 000ac 24 01 and al, 1 000ae 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 000b1 80 e1 fe and cl, 254 ; 000000feH 000b4 0a c8 or cl, al 000b6 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 29 : ; 30 : bitStruct2.bit1 ^= bitStruct2.bit6; 000b9 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 000bc d0 e8 shr al, 1 000be 24 01 and al, 1 000c0 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 000c3 c0 e9 06 shr cl, 6 000c6 80 e1 01 and cl, 1 000c9 0f b6 d1 movzx edx, cl 000cc 0f b6 c0 movzx eax, al 000cf 33 c2 xor eax, edx 000d1 24 01 and al, 1 000d3 d0 e0 shl al, 1 000d5 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 000d8 80 e1 fd and cl, 253 ; 000000fdH 000db 0a c8 or cl, al 000dd 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 31 : bitStruct2.bit6 ^= bitStruct2.bit1; 000e0 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 000e3 c0 e8 06 shr al, 6 000e6 24 01 and al, 1 000e8 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 000eb d0 e9 shr cl, 1 000ed 80 e1 01 and cl, 1 000f0 0f b6 d1 movzx edx, cl 000f3 0f b6 c0 movzx eax, al 000f6 33 c2 xor eax, edx 000f8 24 01 and al, 1 000fa c0 e0 06 shl al, 6 000fd 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00100 80 e1 bf and cl, 191 ; 000000bfH 00103 0a c8 or cl, al 00105 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 32 : bitStruct2.bit1 ^= bitStruct2.bit6; 00108 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 0010b d0 e8 shr al, 1 0010d 24 01 and al, 1 0010f 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00112 c0 e9 06 shr cl, 6 00115 80 e1 01 and cl, 1 00118 0f b6 d1 movzx edx, cl 0011b 0f b6 c0 movzx eax, al 0011e 33 c2 xor eax, edx 00120 24 01 and al, 1 00122 d0 e0 shl al, 1 00124 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00127 80 e1 fd and cl, 253 ; 000000fdH 0012a 0a c8 or cl, al 0012c 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 33 : ; 34 : bitStruct2.bit2 ^= bitStruct2.bit5; 0012f 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 00132 c0 e8 02 shr al, 2 00135 24 01 and al, 1 00137 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0013a c0 e9 05 shr cl, 5 0013d 80 e1 01 and cl, 1 00140 0f b6 d1 movzx edx, cl 00143 0f b6 c0 movzx eax, al 00146 33 c2 xor eax, edx 00148 24 01 and al, 1 0014a c0 e0 02 shl al, 2 0014d 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00150 80 e1 fb and cl, 251 ; 000000fbH 00153 0a c8 or cl, al 00155 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 35 : bitStruct2.bit5 ^= bitStruct2.bit2; 00158 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 0015b c0 e8 05 shr al, 5 0015e 24 01 and al, 1 00160 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00163 c0 e9 02 shr cl, 2 00166 80 e1 01 and cl, 1 00169 0f b6 d1 movzx edx, cl 0016c 0f b6 c0 movzx eax, al 0016f 33 c2 xor eax, edx 00171 24 01 and al, 1 00173 c0 e0 05 shl al, 5 00176 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00179 80 e1 df and cl, 223 ; 000000dfH 0017c 0a c8 or cl, al 0017e 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 36 : bitStruct2.bit2 ^= bitStruct2.bit5; 00181 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 00184 c0 e8 02 shr al, 2 00187 24 01 and al, 1 00189 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0018c c0 e9 05 shr cl, 5 0018f 80 e1 01 and cl, 1 00192 0f b6 d1 movzx edx, cl 00195 0f b6 c0 movzx eax, al 00198 33 c2 xor eax, edx 0019a 24 01 and al, 1 0019c c0 e0 02 shl al, 2 0019f 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 001a2 80 e1 fb and cl, 251 ; 000000fbH 001a5 0a c8 or cl, al 001a7 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 37 : ; 38 : bitStruct2.bit3 ^= bitStruct2.bit4; 001aa 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 001ad c0 e8 03 shr al, 3 001b0 24 01 and al, 1 001b2 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 001b5 c0 e9 04 shr cl, 4 001b8 80 e1 01 and cl, 1 001bb 0f b6 d1 movzx edx, cl 001be 0f b6 c0 movzx eax, al 001c1 33 c2 xor eax, edx 001c3 24 01 and al, 1 001c5 c0 e0 03 shl al, 3 001c8 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 001cb 80 e1 f7 and cl, 247 ; 000000f7H 001ce 0a c8 or cl, al 001d0 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 39 : bitStruct2.bit4 ^= bitStruct2.bit3; 001d3 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 001d6 c0 e8 04 shr al, 4 001d9 24 01 and al, 1 001db 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 001de c0 e9 03 shr cl, 3 001e1 80 e1 01 and cl, 1 001e4 0f b6 d1 movzx edx, cl 001e7 0f b6 c0 movzx eax, al 001ea 33 c2 xor eax, edx 001ec 24 01 and al, 1 001ee c0 e0 04 shl al, 4 001f1 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 001f4 80 e1 ef and cl, 239 ; 000000efH 001f7 0a c8 or cl, al 001f9 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl; 40 : bitStruct2.bit3 ^= bitStruct2.bit4; 001fc 8a 45 e3 mov al, BYTE PTR _bitStruct2$3721[ebp] 001ff c0 e8 03 shr al, 3 00202 24 01 and al, 1 00204 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 00207 c0 e9 04 shr cl, 4 0020a 80 e1 01 and cl, 1 0020d 0f b6 d1 movzx edx, cl 00210 0f b6 c0 movzx eax, al 00213 33 c2 xor eax, edx 00215 24 01 and al, 1 00217 c0 e0 03 shl al, 3 0021a 8a 4d e3 mov cl, BYTE PTR _bitStruct2$3721[ebp] 0021d 80 e1 f7 and cl, 247 ; 000000f7H 00220 0a c8 or cl, al 00222 88 4d e3 mov BYTE PTR _bitStruct2$3721[ebp], cl
Hey! Thanks for posting the ASM on that. I've not gone to the extent to look at the ASM myself. Generally, these days I'm lazy and let the compiler do it for me :)
One quick question in my mind though: I was working under the assumption that unrolling any sort of loop would be optimal, but I see you are using a loop for known values. I specifically avoided loops because of the comparison operations, so do you know something I don't about speed.
If I get to it this weekend, I will post the ASM results from my algorithm.
EDIT: I just installed Win 7 a couple of weeks ago, so it may take some time to get VC2008 up and running again. As soon as I do, I will post the flat ASM code.
One quick question in my mind though: I was working under the assumption that unrolling any sort of loop would be optimal, but I see you are using a loop for known values. I specifically avoided loops because of the comparison operations, so do you know something I don't about speed.
If I get to it this weekend, I will post the ASM results from my algorithm.
EDIT: I just installed Win 7 a couple of weeks ago, so it may take some time to get VC2008 up and running again. As soon as I do, I will post the flat ASM code.
The looping from [0, 256) is only for demonstrating the method actually works for all 256 values. [wink] You can actually look through the results of all the bytes to validate the method, that's all.
In your own program, you'd just cast the current stream byte into a structure object and operate on it, then continue down the stream. Rather than create a 256 byte stream with all the values of 0 to 255, I just used a loop, there's nothing special about it. Likewise the cls and pause were only added for debugging, you'd not really want to use those either.
Generally speaking, unless you have profile results that show hand unrolling loops does a batter job than what the compiler does for you, I'd not even give it a thought. Even the most trivial loops, I'd leave as loops and not try to micro-optimize, it's just not really worth the time and side effects of harder to manage code unless you have very strong reasons to believe it's causing performance issues in your program (to which if that's really the case, then you have bigger issues on your hand!).
In your own program, you'd just cast the current stream byte into a structure object and operate on it, then continue down the stream. Rather than create a 256 byte stream with all the values of 0 to 255, I just used a loop, there's nothing special about it. Likewise the cls and pause were only added for debugging, you'd not really want to use those either.
Generally speaking, unless you have profile results that show hand unrolling loops does a batter job than what the compiler does for you, I'd not even give it a thought. Even the most trivial loops, I'd leave as loops and not try to micro-optimize, it's just not really worth the time and side effects of harder to manage code unless you have very strong reasons to believe it's causing performance issues in your program (to which if that's really the case, then you have bigger issues on your hand!).
So as I understand you want to reverse the bit sequence.
Here is some information that might be of help:
http://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
Here is some information that might be of help:
http://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
It's interesting, to say the least. Through all of my years of schooling the professors emphasized to not make assumptions on what the compiler would do. So we would do all these crazy unrolling schemes and whatever. However, once I got into the real world, almost everyone says, "meh, the compiler will take care of that".
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement