This is indeed a bottleneck for me. I suspect it comes down to the number of memory reads and writes happening at a low level. The problem is that it uses a lot of CPU time for larger images (1000x1000).
As an alternative, how can I use the GPU to obtain only the red channel? I'm currently rendering the BGRA bitmap in Direct3D 9, reading it back to system memory with GetRenderTargetData() and LockRect(), and then copying the red channel using the above method.
Is there a Direct3D way to copy only the red channel to system memory?
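
For context, here is a minimal sketch of the kind of CPU-side copy described above. It is not necessarily the exact method referred to. It assumes a 32-bit BGRA surface (bytes stored in memory as B, G, R, A, as with D3DFMT_A8R8G8B8), and that `bits` and `pitch` are the pBits and Pitch values from the D3DLOCKED_RECT filled in by LockRect(). The helper name `extractRedChannel` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies the red byte of every BGRA pixel into a
// tightly packed width*height buffer.
// `bits` and `pitch` stand in for D3DLOCKED_RECT::pBits and ::Pitch.
std::vector<std::uint8_t> extractRedChannel(const void* bits, int pitch,
                                            int width, int height)
{
    std::vector<std::uint8_t> red(static_cast<std::size_t>(width) * height);
    const std::uint8_t* row = static_cast<const std::uint8_t*>(bits);
    std::uint8_t* dst = red.data();

    for (int y = 0; y < height; ++y) {
        // Assumed byte order within a pixel is B, G, R, A, so red is at offset 2.
        const std::uint8_t* src = row + 2;
        for (int x = 0; x < width; ++x) {
            *dst++ = *src;
            src += 4;  // advance one 4-byte pixel
        }
        // The pitch can be larger than width * 4, so step by pitch, not by width.
        row += pitch;
    }
    return red;
}
```

Even in this form, every pixel of the full BGRA surface is still read on the CPU after the readback, which is the cost I'd like to avoid.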