# Packing two 8-bit colors into one 32-bit unsigned int for alpha blending, bad idea?



The idea comes from doing SIMD on a non-SIMD processor.

A traditional alpha blending calculation looks like this:

    result_R = (Source_R * (256 - alpha) + Dest_R * alpha) / 256
    result_G = (Source_G * (256 - alpha) + Dest_G * alpha) / 256
    result_B = (Source_B * (256 - alpha) + Dest_B * alpha) / 256

Assume that every result of an 8-bit alpha blending operation falls between 0 and 255 (field mod 256? :p). Apart from the constant 256, all operands are 8-bit. Multiplying two unsigned 8-bit values gives a result of at most 16 bits: 255 (dec) * 255 (dec) = FE01 (hex) in the maximum case. Because the two weights add up to 256, the sum of the two products is at most 255 * 256 = FF00 (hex), so it never needs a 17th bit. Therefore 16 bits should be enough for one 8-bit alpha blending operation.

Now I will pack 2 colors into a 32-bit unsigned integer for the calculation. The new calculation should look like this:

    long int RR00GG00, 00SR00SG, 00DR00DG   // assume long int is 32 bits
    byte alpha

    RR00GG00 = 00SR00SG * (256 - alpha) + 00DR00DG * alpha
    Result_R = RR00GG00 shr 24
    Result_G = (RR00GG00 shr 8) and FF

Register layout of 00SR00SG on an x86 CPU, from bit 31 down to bit 0 of EAX:

    bits 31..24   0 0 0 0 0 0 0 0                  (upper 8 bits of the upper half of EAX)
    bits 23..16   src_R_bit[7] .. src_R_bit[0]     (lower 8 bits of the upper half of EAX)
    bits 15..8    0 0 0 0 0 0 0 0                  (AH)
    bits  7..0    src_G_bit[7] .. src_G_bit[0]     (AL)

In RR00GG00, the first byte is the red result and the third byte is the green result.
In 00SR00SG, the second byte is the source red and the last byte is the source green.
In 00DR00DG, the second byte is the destination red and the last byte is the destination green.

The blue channel is the most annoying part...
I will process it like this:

    T100T200 = 00SB00DB * alpha
    T1 = T100T200 shr 24
    T2 = (T100T200 shr 8) and FF
    Result_B = Source_B - T1 + T2

T1 and T2 are the intermediate results Source_B * alpha / 256 and Dest_B * alpha / 256. Since Source_B * (256 - alpha) / 256 equals Source_B - Source_B * alpha / 256, the blended blue is Source_B - T1 + T2 (it can be off by one in the last bit, because the two halves are truncated separately).

Altogether, the pseudo-code should look like this:

    var
      RR00GG00, 00SR00SG, 00DR00DG, 00SB00DB, T100T200 : integer;  // assume integer is 32 bits
      alpha, T1, T2 : byte;

    00SR00SG := Source_R shl 16 + Source_G;
    00DR00DG := Dest_R shl 16 + Dest_G;
    00SB00DB := Source_B shl 16 + Dest_B;
    RR00GG00 := 00SR00SG * (256 - alpha) + 00DR00DG * alpha;
    T100T200 := 00SB00DB * alpha;
    T1 := T100T200 shr 24;
    T2 := (T100T200 shr 8) and $FF;
    Result_R := RR00GG00 shr 24;
    Result_G := (RR00GG00 shr 8) and $FF;
    Result_B := Source_B - T1 + T2;

The 6 multiplications are reduced to only 3, at the cost of more additions and shifts (5 additions, 3 multiplications, 4 shifts, 2 ANDs and 1 subtraction).

Additional note: when performing a 32-bit multiplication, the CPU produces a 64-bit result. On x86, as I remember, the upper 32 bits go to EDX and the lower 32 bits go to EAX, so we could take AH and DH directly as results if the calculation above is written in x86 asm on a 32-bit processor. The shift operations can be eliminated by some preprocessing or by changing the data structure of the bitmap.

It should work better on a 64-bit processor, which can process 4 colors (a 32-bit pixel) in one calculation.

64-bit version:

    00SR00SG00SB : source pixel, each color in the lower 8 bits of a 16-bit field
    00DR00DG00DB : destination pixel, each color in the lower 8 bits of a 16-bit field
    RR00GG00BB00 : result pixel, each color in the upper 8 bits of a 16-bit field

    RR00GG00BB00 = 00SR00SG00SB * (256 - alpha) + 00DR00DG00DB * alpha
    Result_R = (RR00GG00BB00 shr 40) and FF
    Result_G = (RR00GG00BB00 shr 24) and FF
    Result_B = (RR00GG00BB00 shr 8) and FF

2 multiplications and 3 shift operations only.. nice?

PS.
I haven't proven that the calculation works on every implementation, but it should work better when you don't have MMX or SSE. If someone gets it working, tell me how slow it is :P [Edited by - LYH1 on September 21, 2006 5:22:12 PM]
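
For anyone who wants to try the idea, here is a minimal C sketch of the packed blends, checked against the one-channel-at-a-time formula. All function names (`blend_channel`, `blend_rg32`, `blend_b32`, `spread`, `blend_rgb64`, `self_check`) are made up for illustration. It assumes alpha is 0..255 and weights the destination, pixels are laid out as 0x00RRGGBB, and extraction uses right shifts. The blue step is combined as `sb - t1 + t2`, which is an assumption about what the post intends and can come out 1 higher than the reference.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: one channel at a time; alpha (0..255) weights the destination. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint32_t alpha)
{
    return (uint8_t)((src * (256u - alpha) + dst * alpha) >> 8);
}

/* 32-bit variant, red and green: both channels share one word (00RR00GG). */
static void blend_rg32(uint8_t sr, uint8_t sg, uint8_t dr, uint8_t dg,
                       uint32_t alpha, uint8_t *r, uint8_t *g)
{
    uint32_t s = ((uint32_t)sr << 16) | sg;
    uint32_t d = ((uint32_t)dr << 16) | dg;
    uint32_t sum = s * (256u - alpha) + d * alpha;  /* layout: RR..GG.. */
    *r = (uint8_t)(sum >> 24);
    *g = (uint8_t)((sum >> 8) & 0xFFu);
}

/* 32-bit variant, blue: source and destination blue share one word
   (00SB00DB) and one multiplication. Assumption: combined as sb - t1 + t2. */
static uint8_t blend_b32(uint8_t sb, uint8_t db, uint32_t alpha)
{
    uint32_t t  = (((uint32_t)sb << 16) | db) * alpha;
    uint32_t t1 = t >> 24;           /* sb * alpha / 256 */
    uint32_t t2 = (t >> 8) & 0xFFu;  /* db * alpha / 256 */
    return (uint8_t)(sb - t1 + t2);
}

/* Spread 0x00RRGGBB into 16-bit lanes: 0x000000RR00GG00BB. */
static uint64_t spread(uint32_t rgb)
{
    return ((uint64_t)(rgb & 0xFF0000u) << 16) |
           ((uint64_t)(rgb & 0x00FF00u) << 8) |
            (uint64_t)(rgb & 0x0000FFu);
}

/* 64-bit variant: all three channels weighted by one pair of multiplications. */
static uint32_t blend_rgb64(uint32_t src, uint32_t dst, uint32_t alpha)
{
    uint64_t sum = spread(src) * (256u - alpha) + spread(dst) * alpha;
    uint32_t r = (uint32_t)(sum >> 40) & 0xFFu;
    uint32_t g = (uint32_t)(sum >> 24) & 0xFFu;
    uint32_t b = (uint32_t)(sum >> 8) & 0xFFu;
    return (r << 16) | (g << 8) | b;
}

/* Compare the packed variants with the reference for every alpha and a
   coarse sweep of channel values (0, 5, 10, ..., 255). */
static void self_check(void)
{
    for (uint32_t a = 0; a < 256; a++) {
        for (uint32_t s = 0; s < 256; s += 5) {
            for (uint32_t d = 0; d < 256; d += 5) {
                uint8_t ref_sd = blend_channel((uint8_t)s, (uint8_t)d, a);
                uint8_t ref_ds = blend_channel((uint8_t)d, (uint8_t)s, a);
                uint8_t r, g;

                /* Red blends s over d, green blends d over s. */
                blend_rg32((uint8_t)s, (uint8_t)d, (uint8_t)d, (uint8_t)s,
                           a, &r, &g);
                assert(r == ref_sd && g == ref_ds);

                /* The blue trick may round up by one, never more. */
                uint8_t b = blend_b32((uint8_t)s, (uint8_t)d, a);
                assert(b == ref_sd || b == ref_sd + 1);

                uint32_t src = (s << 16) | (d << 8) | s;
                uint32_t dst = (d << 16) | (s << 8) | d;
                uint32_t want = ((uint32_t)ref_sd << 16) |
                                ((uint32_t)ref_ds << 8) | ref_sd;
                assert(blend_rgb64(src, dst, a) == want);
            }
        }
    }
}
```

Calling `self_check()` from a `main` runs the comparison. The packed red/green and 64-bit results match the reference exactly because no 16-bit lane ever exceeds 255 * 256, so nothing carries into the neighbouring lane. Whether this is actually faster than three separate multiplications is something only a benchmark on the target CPU can answer.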

Since you mention that it "should work better on 64bit", I'll assume you're talking about a modern desktop pc. In which case, why are you performing alpha blending on the cpu?

Yes, 64-bit means a modern 64-bit processor.
Why perform alpha blending on the CPU?
Because I can't find an API that does hardware alpha blending pixel by pixel from Delphi. I am not familiar with C++ and have difficulty reading the DX headers.

OpenGL will do alpha blending and is fairly easy to use from Delphi.