shl, bitshifting by 32 bits, does nothing. nothing!

Started by
7 comments, last by Cornstalks 11 years, 4 months ago

003BDCFB  mov         eax,dword ptr [outword] 
003BDCFE  mov         ecx,dword ptr [out32] 
003BDD01  mov         edx,dword ptr [ecx+eax*4] 
003BDD04  mov         ecx,dword ptr [outarea] 
003BDD07  shl         edx,cl 

I'm slightly puzzled. I am shifting left by 32 bits, which I expected to clear the variable (edx) to 0x00000000, but shl actually does nothing at all: edx remains unchanged (EDX = 0xfdfdfdfd).

Just double-checking: is that the expected behaviour, or am I reading this wrong? Gonna have to change my logic :(

Everything is better with Metal.


I'm pretty sure that x86 processors bitwise AND the shift count with 31 before shifting; thus, shifting by 32 shifts by 0. To clear the value when the shift count is higher than 31, you'll need either some branching, some bit tricks, or two successive shifts (and hope the compiler doesn't optimize them into one), unless someone else has another solution.
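
For what it's worth, here is a minimal sketch of the branching and shift-twice options. The helper names shl32_clear and shl32_split are made up for illustration and are not from the original code. Each step of the split version is a well-defined shift on its own, so as far as the C standard is concerned a conforming compiler has to preserve the result.

```c
#include <stdint.h>

/* Hypothetical helper: branch on the count, so a shift by 32 or more
   yields 0 instead of relying on undefined behaviour. */
static uint32_t shl32_clear(uint32_t value, unsigned count)
{
    return (count < 32u) ? (value << count) : 0u;
}

/* Hypothetical helper: split the shift into two smaller shifts.
   Intended for counts 0..32; each partial shift stays below 32. */
static uint32_t shl32_split(uint32_t value, unsigned count)
{
    unsigned half = count / 2u;
    return (value << half) << (count - half);
}
```

With the value from the first post, both shl32_clear(0xfdfdfdfd, 32) and shl32_split(0xfdfdfdfd, 32) give 0.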

try shifting right instead of left.

I can't always do that (it depends on endianness), but yeah, it looks like shifting by 32 isn't defined behaviour. I've got around it using a 'special condition', which I hate. Here's the whole function; I need to work on the logic and try to remove the condition.


// ------------------------------------------------------
// copy memory bit stream from source to destination.
// ------------------------------------------------------
// u8* out : destination beginning of stream.
// u32 outptr : bit pointer into the destination bitstream.
// u32 outlimit : destination container size in bits.
// const u8* src : source beginning of stream.
// u32 srcptr : bit pointer into the source bitstream.
// u32 srclen : number of bits to copy from source to destination.
// return bool : success or failure (overflow).
//
// source bits :
//   byte 1   byte 2   byte 3   byte 4   byte 5
// +--------+--------+--------+--------+--------+
// |........|......DE|FGHIJKLM|NOPQRST.|........|
// +--------+--------+--------+--------+--------+
//
// destination bits :
//   byte 1   byte 2   byte 3   byte 4   byte 5
// +--------+--------+--------+--------+--------+
// |ABC.....|........|........|........|........|
// +--------+--------+--------+--------+--------+
//
// result :
//   byte 1   byte 2   byte 3   byte 4   byte 5
// +--------+--------+--------+--------+--------+
// |ABCDEFGH|IJKLMNOP|QRST....|........|........|
// +--------+--------+--------+--------+--------+
// ------------------------------------------------------
bool bit_copy_32(void* out, u32 outptr, u32 outlimit, const void* src, u32 srcptr, u32 srclen)
{
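	NE_ASSERT(((uintptr_t)out & 3) == 0);	// destination must be 4-byte aligned (uintptr_t from <stdint.h>, safe on 64-bit).
	NE_ASSERT(((uintptr_t)src & 3) == 0);	// source must be 4-byte aligned.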
	NE_ASSERT((outlimit & 3) == 0);
	
	const u32* src32 = (const u32*) src;
	u32* out32 = (u32*) out;

	if(outptr + srclen > outlimit)
		return false;

	// word-aligned copy. Use faster word / word copy mechanism.
	if(	(srcptr & 31) == 0 && (outptr & 31) == 0)
	{
		u32 srcword = (srcptr >> 5);							// source word address.
		u32 outword = (outptr >> 5);							// destination word address.
		u32 words	= (srclen >> 5);							// number of words to copy.
		memcpy(out32 + outword, src32 + srcword, words * 4);	// copy words.

		// move to the end of the words to copy remaining bits.
		u32 bits = (words  << 5);
		srclen -= bits;
		outptr += bits;
		srcptr += bits;
	}

	// bit copy.
	while(srclen > 0)
	{
		// extract portions of words of similar size from the source.
		u32 srcword = (srcptr >> 5); // source word address.
		u32 outword = (outptr >> 5); // destination word address.
		u32 srcbitp = (srcptr & 31); // source bit address.
		u32 outbitp = (outptr & 31); // destination bit address.
		u32 outarea = (32 - outbitp); // number of bits we need to clear at the destination to override with source bits.
		
		// copy bits from source word to destination word.
		#if(ENDIAN_ORDER == ENDIAN_LITTLE)
		{
			if(outarea == 32)
			{
				out32[outword] = ((src32[srcword] >> srcbitp) << outbitp);	// paste bits from source word.
			}
			else
			{
				out32[outword]  = ((out32[outword] << outarea) >> outarea);	// clear area in destination word.
				out32[outword] |= ((src32[srcword] >> srcbitp) << outbitp);	// paste bits from source word.
			}
		}
		#elif(ENDIAN_ORDER == ENDIAN_BIG)
		{
			if(outarea == 32)
			{
				out32[outword] = ((src32[srcword] << srcbitp) >> outbitp);	// paste bits from source word.
			}
			else
			{
				out32[outword]  = ((out32[outword] >> outarea) << outarea);	// clear area in destination word.
				out32[outword] |= ((src32[srcword] << srcbitp) >> outbitp);	// paste bits from source word.
			}
		}
		#else
		{
			NE_ERROR("unknown endianness");
			return false;
		}
		#endif

		// how many bits we copied from source to destination.
		u32 srcarea = (32 - srcbitp);					// number of bits we copied from source.
		u32 cpycount = min3(srcarea, outarea, srclen);	// smallest portion we copied.

		// move to next bits in streams.
		srclen -= cpycount;
		srcptr += cpycount;
		outptr += cpycount;
	}
	return true;
}


In C and C++, shifting an N-bit number by N or more bits (so a 32-bit number by 32 or more bits) is undefined behavior, so shifting by 32 can do anything, from crashing to having no effect. If you really need to shift by 32, you can shift by 16 and then by 16 again. In x86 assembly, there are only 5 physical wires used for the count when doing the shift, so you can say the count is "AND-ed with 31": only the low 5 bits of the count are used, and shifting by 32 is really shifting by (32 & 31), which is 0.

[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
In C and C++, shifting an N-bit number by N or more bits (so a 32-bit number by 32 or more bits) is undefined behavior....
That's not quite correct. With a shift expression the operands undergo integral promotions. The shift is undefined if the value of the right operand is greater than or equal to the number of bits in the promoted left operand (or negative). So shifting an unsigned 8-bit type by 8 bits will always be defined because integral promotion will always get it up to at least 16 bits before the shift is performed.
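
A small sketch of the promotion rule described above, assuming a platform where int is 32 bits wide (with a 16-bit int the result 0xFF00 would not fit in a signed int, which is a separate problem). The function names are invented for this example.

```c
#include <stdint.h>

/* x is promoted to int before the shift, so shifting an 8-bit value
   by 8 is a shift by 8 on a (here assumed 32-bit) int. */
static unsigned u8_shift_left_8(uint8_t x)
{
    return (unsigned)(x << 8);
}

/* Converting the promoted result back to 8 bits drops the high bits. */
static uint8_t u8_shift_left_8_truncated(uint8_t x)
{
    return (uint8_t)(x << 8);
}
```

So u8_shift_left_8(0xFF) is 0xFF00, and only the conversion back to uint8_t turns it into 0.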

If your compiler supports a 64-bit type, then you might be able to avoid the special case code by casting one or more of your operands to 64-bits before doing the shifts.
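
A rough sketch of that 64-bit idea, assuming uint64_t is available from <stdint.h>. The names shl32_via64 and shr32_via64 are hypothetical. Widening first means counts up to 63 are well defined, and a count of 32 simply pushes every bit out of the low word.

```c
#include <stdint.h>

/* Widen to 64 bits, shift, then keep the low 32 bits.
   Valid for counts 0..63. */
static uint32_t shl32_via64(uint32_t value, unsigned count)
{
    return (uint32_t)((uint64_t)value << count);
}

/* Same idea for right shifts: the upper half of the widened value is
   zero, so a count of 32 or more gives 0. Valid for counts 0..63. */
static uint32_t shr32_via64(uint32_t value, unsigned count)
{
    return (uint32_t)((uint64_t)value >> count);
}
```

Whether this beats the special-case branch depends on the target; on a 32-bit build the 64-bit shift may itself compile to several instructions.
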
Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand.

The destination operand can be a register or a memory location. The count operand can be an immediate value or register CL. The count is masked to 5 bits, which limits the count range to 0 to 31. A special opcode encoding is provided for a count of 1.

Exactly as specified by the x86 architecture.
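
To make the quoted behaviour concrete, here is a small C model of what the 32-bit shl does with its count. The function x86_shl32 is made up for illustration; it masks the count explicitly, so the C shift itself always stays in the defined range 0..31.

```c
#include <stdint.h>

/* Model of 32-bit x86 SHL with the count in CL:
   only the low 5 bits of the count are used. */
static uint32_t x86_shl32(uint32_t value, uint8_t cl)
{
    return value << (cl & 31u);
}
```

With the values from the first post, x86_shl32(0xfdfdfdfd, 32) returns 0xfdfdfdfd unchanged, because 32 & 31 is 0.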

Cool, thanks for the input :)

I might give 64-bit conversion a go. Looks like the endianness format is conserved (at least big / little).


Actually, I was being a bit slow.

Instead of building the mask with a shift by (32 - bitcount), for example

mask = (0xffffffff >> (32 - bitcount));

I can use

mask = ~(0xffffffff << bitcount);
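
A quick sketch of why the second form helps here, with hypothetical function names. The first form shifts by 32 when bitcount is 0, while the second shifts by 32 only when bitcount is 32. In bit_copy_32 the count would be outbitp, which is (outptr & 31) and therefore always 0..31, so the second form stays in range. The last helper is one possible way to cover the full 0..32 range, assuming a 64-bit type is available.

```c
#include <stdint.h>

/* Low-bits mask via right shift: defined only for bitcount 1..32
   (bitcount 0 would shift by 32). */
static uint32_t mask_from_shr(unsigned bitcount)
{
    return 0xffffffffu >> (32u - bitcount);
}

/* Low-bits mask via left shift: defined only for bitcount 0..31
   (bitcount 32 would shift by 32). */
static uint32_t mask_from_shl(unsigned bitcount)
{
    return (uint32_t)~(0xffffffffu << bitcount);
}

/* Assumed alternative: build the mask in 64 bits, valid for 0..32. */
static uint32_t low_mask32(unsigned bitcount)
{
    return (uint32_t)((UINT64_C(1) << bitcount) - 1u);
}
```

All three agree wherever they are defined, for example a bitcount of 5 gives 0x1f from each.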

D'uuurh.


In C and C++, shifting an N-bit number by N or more bits (so a 32-bit number by 32 or more bits) is undefined behavior....
That's not quite correct. With a shift expression the operands undergo integral promotions. The shift is undefined if the value of the right operand is greater than or equal to the number of bits in the promoted left operand (or negative). So shifting an unsigned 8-bit type by 8 bits will always be defined because integral promotion will always get it up to at least 16 bits before the shift is performed.

I guess it depends on how you read what I wrote. I intended "N-bit number" to be the actual number/type that gets shifted (that is, the promoted type). But it's worth clarifying, I suppose.


