Greetings, fellow programmers!
I’m looking for some help with MMX alpha blending (color blending) using inline assembly in a C/C++ environment. I’m pretty much new to assembly language, so not surprisingly I hit a snag pretty close to the finish line.
In C++ syntax, I’m using the following algorithm for blending colors:
dst_component += (src_alpha * (src_component - dst_component)) >> 8
… for each component in ‘dst’.
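For reference, a plain C++ sketch of that formula might look like the following. The function names are made up for illustration, and passing alpha as a separate 0..255 value is an assumption. The difference can be negative, so the arithmetic is done in a signed int.

```cpp
#include <cstdint>

// Blend one 8-bit component: dst + ((alpha * (src - dst)) >> 8).
// Hypothetical helper; alpha is assumed to be in the range 0..255.
inline std::uint8_t blend_component(std::uint8_t dst, std::uint8_t src, std::uint8_t alpha)
{
    // src - dst may be negative, so work in a signed int.
    int diff = static_cast<int>(src) - static_cast<int>(dst);
    // Right-shifting a negative int is an arithmetic shift on mainstream
    // compilers (and guaranteed to be one since C++20).
    int term = (static_cast<int>(alpha) * diff) >> 8;
    return static_cast<std::uint8_t>(static_cast<int>(dst) + term);
}

// Apply the same alpha to all four bytes of a packed 32-bit value.
inline std::uint32_t blend_pixel_scalar(std::uint32_t dst, std::uint32_t src, std::uint8_t alpha)
{
    std::uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        std::uint8_t d = static_cast<std::uint8_t>((dst >> shift) & 0xFF);
        std::uint8_t s = static_cast<std::uint8_t>((src >> shift) & 0xFF);
        out |= static_cast<std::uint32_t>(blend_component(d, s, alpha)) << shift;
    }
    return out;
}
```

A scalar version like this is mainly useful as a known-good reference to compare the MMX output against.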
Just for the record: each color is stored in a 32-bit unsigned integer (DWORD), one byte per component. Using the MMX unpack instructions, I unpack every component into a WORD inside a QWORD.
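To make the layout concrete, here is a portable C++ sketch of what unpacking against zero (punpcklbw) and packing with unsigned saturation (packuswb) do to the data. It only emulates the data movement with ordinary integers. The function names are hypothetical and it is not MMX code itself.

```cpp
#include <cstdint>

// Spread the four bytes of a 32-bit value into the low bytes of four
// 16-bit words, as punpcklbw does when the second operand is zero.
inline std::uint64_t unpack_bytes_to_words(std::uint32_t pixel)
{
    std::uint64_t out = 0;
    for (int i = 0; i < 4; ++i) {
        std::uint64_t byte = (pixel >> (8 * i)) & 0xFF;
        out |= byte << (16 * i);
    }
    return out;
}

// Squeeze four signed 16-bit words back into four bytes, clamping each
// word to 0..255, as packuswb does for the low half of its result.
inline std::uint32_t pack_words_to_bytes(std::uint64_t words)
{
    std::uint32_t out = 0;
    for (int i = 0; i < 4; ++i) {
        std::int16_t w = static_cast<std::int16_t>((words >> (16 * i)) & 0xFFFF);
        std::uint32_t b = (w < 0) ? 0u : (w > 255 ? 255u : static_cast<std::uint32_t>(w));
        out |= b << (8 * i);
    }
    return out;
}
```

Note that the pack step treats each word as signed, so a word like 0xFFFF is clamped to 0 rather than 255.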
The following code is what I’ve come up with:
unsigned int ZERO = 0;
__asm {
    movd      mm0, dst    ; mm0 = dst
    punpcklbw mm0, ZERO   ; unpack mm0 bytes to words
    movd      mm1, src    ; mm1 = src
    punpcklbw mm1, ZERO   ; unpack mm1 bytes to words
    psubusw   mm1, mm0    ; mm1 -= mm0
    ; multiply by alpha - PROBLEM HERE
    psrlw     mm1, 8      ; mm1 >>= 8
    paddusw   mm1, mm0    ; mm1 += mm0
    packuswb  mm1, ZERO   ; pack mm1 words to bytes
    movd      dst, mm1    ; dst = mm1
}
My main problem for now is how to replicate the ‘src’ alpha four times inside a 64-bit unsigned integer (QWORD), once in the low byte of each 16-bit unsigned integer (WORD), like so:
---
Byte 1: src alpha } Word 1
Byte 2: 0
Byte 3: src alpha } Word 2
Byte 4: 0
Byte 5: src alpha } Word 3
Byte 6: 0
Byte 7: src alpha } Word 4
Byte 8: 0
---
Quite frankly, I have no idea how to do this with existing MMX unpack instructions. I guess I could do it in C++, but I can’t find any elegant/efficient way to do it there either.
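On the C++ side, one compact way to get that layout is a single multiplication by a constant with a 1 in each word. The sketch below assumes the alpha lives in the top byte of the 32-bit value (0xAARRGGBB); that is an assumption, so adjust the shift if your format differs. The helper names are hypothetical.

```cpp
#include <cstdint>

// ASSUMPTION: alpha is stored in the most significant byte (0xAARRGGBB).
inline std::uint8_t alpha_of(std::uint32_t pixel)
{
    return static_cast<std::uint8_t>(pixel >> 24);
}

// Copy an 8-bit alpha into the low byte of each of the four 16-bit words.
// Multiplying by 0x0001000100010001 adds alpha shifted by 0, 16, 32 and 48
// bits; the copies never overlap because alpha fits in 8 bits.
inline std::uint64_t replicate_alpha(std::uint8_t alpha)
{
    return static_cast<std::uint64_t>(alpha) * 0x0001000100010001ULL;
}
```

The resulting 64-bit value could then be loaded with movq. Staying inside MMX, the same layout should also be reachable by moving the alpha into the low word of a register with movd and then unpacking the register with itself, first with punpcklwd and then with punpckldq.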
Also, I’m not sure about the MMX multiply instructions. Do you have to make two multiplications per 64-bit value in order to get a correct result, i.e. one multiplication per DWORD?
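As far as the multiply goes, pmullw works on all four 16-bit words in one instruction and keeps the low 16 bits of each product (pmulhw keeps the high 16 bits), so a single multiplication per QWORD is enough. Below is a sketch of the whole blend using the MMX intrinsics from <mmintrin.h>. It assumes an x86 compiler that ships that header, such as GCC or Clang, and takes alpha as a separate 0..255 argument. The function name is hypothetical. It deliberately uses the wrapping psubw and paddb rather than the saturating psubusw and paddusw, because unsigned saturation would clamp every negative difference to zero.

```cpp
#include <cstdint>
#include <mmintrin.h>   // MMX intrinsics, x86 only

// Blend src into dst with dst + ((alpha * (src - dst)) >> 8) per byte.
// ASSUMPTION: alpha is supplied by the caller in the range 0..255.
inline std::uint32_t blend_pixel_mmx(std::uint32_t dst, std::uint32_t src, std::uint8_t alpha)
{
    const __m64 zero = _mm_setzero_si64();

    // movd + punpcklbw: bytes become the low bytes of four words.
    __m64 d = _mm_unpacklo_pi8(_mm_cvtsi32_si64(static_cast<int>(dst)), zero);
    __m64 s = _mm_unpacklo_pi8(_mm_cvtsi32_si64(static_cast<int>(src)), zero);

    // Alpha replicated into all four words.
    __m64 a = _mm_set1_pi16(static_cast<short>(alpha));

    __m64 diff = _mm_sub_pi16(s, d);       // psubw: wraps, so negative differences survive
    __m64 prod = _mm_mullo_pi16(diff, a);  // pmullw: four multiplies at once, low 16 bits kept
    __m64 term = _mm_srli_pi16(prod, 8);   // psrlw: low byte is (alpha * diff >> 8) modulo 256
    __m64 sum  = _mm_add_pi8(d, term);     // paddb: byte-wise wrap yields the exact 0..255 result
    __m64 packed = _mm_packs_pu16(sum, zero);  // packuswb: words back to bytes

    std::uint32_t out = static_cast<std::uint32_t>(_mm_cvtsi64_si32(packed));
    _mm_empty();                           // emms: release the MMX state for x87 code
    return out;
}
```

The product of alpha and a negative difference does not fit in a signed 16-bit word, but only its low 16 bits matter here: after the shift, the low byte equals the true shifted value modulo 256, and the byte-wise add wraps it back into range because the final result always lies between dst and src.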
I don’t even know if I’m going about this entire problem correctly, so feel free to give me suggestions! I would much appreciate your help!