Here are a few more ideas:
Is the shader in the D3D11 version more complicated than just doing a texture read?
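For comparison, a D3D11 pixel shader that does nothing beyond a single texture read would look roughly like this (the names `tex`, `samp`, and the `TEXCOORD0` semantic are placeholders; your actual bindings may differ):

```hlsl
Texture2D    tex  : register(t0);
SamplerState samp : register(s0);

// Minimal pass-through pixel shader: one sample, no math.
float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    return tex.Sample(samp, uv);
}
```

If your shader is doing noticeably more than this per pixel, that extra work is a candidate for the slowdown.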
Try turning alpha testing off in D3D9 (or on in D3D11 by using clip() in the shader — D3D11 has no fixed-function alpha test). Alpha testing can have a significant performance impact because it rejects mostly-transparent pixels before they reach the output-merger stage, cutting down on the need to blend pixels with the frame buffer. The performance difference will depend on how transparent the texture is and on the card you're testing on.
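A sketch of what emulating the D3D9 alpha test with clip() looks like in a D3D11 pixel shader (the 0.5 cutoff mirrors a typical D3DRS_ALPHAREF setting; adjust to whatever reference value the D3D9 path uses):

```hlsl
Texture2D    tex  : register(t0);
SamplerState samp : register(s0);

float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float4 color = tex.Sample(samp, uv);
    // clip() discards the pixel when its argument is negative,
    // i.e. when alpha is below the 0.5 reference -- like the old alpha test.
    clip(color.a - 0.5);
    return color;
}
```

Note that using clip()/discard can disable some early-Z optimizations on certain hardware, so measure both with and without it.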
It's also possible that the D3D9 version is rejecting pixels due to the depth testing or stencil testing. Make sure those are disabled in both cases.
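To make sure the two paths really are comparable, you can explicitly disable depth and stencil testing in both APIs. A rough sketch, assuming `device`/`context` are your already-created D3D9 and D3D11 objects:

```cpp
// D3D9: fixed-function render states.
d3d9Device->SetRenderState(D3DRS_ZENABLE, D3DZB_FALSE);
d3d9Device->SetRenderState(D3DRS_STENCILENABLE, FALSE);

// D3D11: bind a depth-stencil state with both tests off.
D3D11_DEPTH_STENCIL_DESC dsDesc = {};
dsDesc.DepthEnable   = FALSE;
dsDesc.StencilEnable = FALSE;
ID3D11DepthStencilState* dsState = nullptr;
d3d11Device->CreateDepthStencilState(&dsDesc, &dsState);
d3d11Context->OMSetDepthStencilState(dsState, 0);
```

With these states matched, any remaining FPS gap shouldn't be coming from depth/stencil rejection.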
The D3D9 code does not use a shader at all, just fixed function.
I removed alpha testing and blending, and I'm still getting the same results (well, not exactly the same: instead of 22 FPS in the D3D9 version I now get 19 FPS, since there are no longer any pixels to reject), and the D3D11 version is still horrifically slow. I'm starting to think this is something related to Direct3D 11 in general and/or my ATI drivers (I'm using the latest version, 11.12).
Hi,
I assume that you enabled the Direct3D debug runtime/debug layer and studied the output (if any) from the program.
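For the D3D11 side, that means creating the device with the debug flag so warnings show up in the debugger output window. A minimal sketch (error handling omitted):

```cpp
// Request the debug layer at device creation; validation messages
// will then appear in the Visual Studio output window.
ID3D11Device*        device  = nullptr;
ID3D11DeviceContext* context = nullptr;
D3D_FEATURE_LEVEL    featureLevel;

D3D11CreateDevice(
    nullptr,                    // default adapter
    D3D_DRIVER_TYPE_HARDWARE,
    nullptr,                    // no software rasterizer
    D3D11_CREATE_DEVICE_DEBUG,  // enable the debug layer
    nullptr, 0,                 // default feature levels
    D3D11_SDK_VERSION,
    &device, &featureLevel, &context);
```

For D3D9, the equivalent is switching to the debug runtime in the DirectX Control Panel.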
Best regards
Yes, I have, and no, there's nothing of consequence being reported.