Okay, so is this largely unchanged in DX11?
By calculating transforms in software, you have to push a full transform for each bitmap. It seems to me you could instead pass just the translations and calculate the transforms on the GPU.
And wouldn't compute shaders be useful? Or passing in points and reconstructing the quad for each bitmap in a geometry shader?
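For what it's worth, here is a rough CPU-side sketch in C++ of the point-expansion idea: each bitmap is sent as one small record (center, half-size, rotation) and the four corners are rebuilt from it, which is roughly the work a geometry shader would do per point. The names (`SpritePoint`, `expandToQuad`) are made up for illustration, and this is plain C++ standing in for shader code, not actual D3D11 or HLSL API.

```cpp
#include <array>
#include <cmath>

struct Vec2 { float x, y; };

// Hypothetical per-bitmap record: 5 floats instead of a 16-float matrix.
struct SpritePoint {
    Vec2 center;     // where the bitmap sits
    Vec2 halfSize;   // half width / half height
    float rotation;  // radians
};

// Rebuild the four quad corners from one point record.
// Corners come out in triangle-strip order:
// (-x,-y), (+x,-y), (-x,+y), (+x,+y) relative to the center.
std::array<Vec2, 4> expandToQuad(const SpritePoint& s) {
    const float c = std::cos(s.rotation);
    const float n = std::sin(s.rotation);
    const Vec2 local[4] = {
        {-s.halfSize.x, -s.halfSize.y},
        { s.halfSize.x, -s.halfSize.y},
        {-s.halfSize.x,  s.halfSize.y},
        { s.halfSize.x,  s.halfSize.y},
    };
    std::array<Vec2, 4> out{};
    for (int i = 0; i < 4; ++i) {
        // Rotate the local corner, then translate it to the sprite center.
        out[i].x = s.center.x + local[i].x * c - local[i].y * n;
        out[i].y = s.center.y + local[i].x * n + local[i].y * c;
    }
    return out;
}
```

With a record like `{{100, 50}, {16, 8}, 0}` the first corner lands at (84, 42) and the last at (116, 58). Whether this actually beats pushing matrices depends on how much the expansion stage costs on your hardware, so treat it as something to profile rather than a guaranteed win.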
Come on guys, I know some of you out there can think up tweaks!
The only way you could batch that way is to pass an array of matrices to your vertex shader, along with an ID for each image that tells you which index of the array to use. You either do it on the CPU side or waste shader registers. Pick your poison.
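To make that concrete, here is a rough sketch of the matrix-array approach, written as plain C++ that mimics what the vertex shader would do: every vertex carries an index, and the index selects one matrix out of an array that would live in shader constants. The names (`Mat4`, `BatchVertex`, `transformBatch`, `translation`) are hypothetical, and nothing here is actual D3D11 or HLSL API.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vec4 { float x, y, z, w; };

// Row-major 4x4 matrix, column-vector convention (translation in the last column).
struct Mat4 { std::array<float, 16> m; };

// Hypothetical batched vertex: position plus the index of the matrix to use.
struct BatchVertex {
    Vec4 position;
    std::uint32_t matrixIndex;
};

// Matrix * column vector.
Vec4 mul(const Mat4& a, const Vec4& v) {
    return {
        a.m[0]  * v.x + a.m[1]  * v.y + a.m[2]  * v.z + a.m[3]  * v.w,
        a.m[4]  * v.x + a.m[5]  * v.y + a.m[6]  * v.z + a.m[7]  * v.w,
        a.m[8]  * v.x + a.m[9]  * v.y + a.m[10] * v.z + a.m[11] * v.w,
        a.m[12] * v.x + a.m[13] * v.y + a.m[14] * v.z + a.m[15] * v.w,
    };
}

// Helper for the example: identity matrix with a 2D translation.
Mat4 translation(float tx, float ty) {
    Mat4 t{};  // zero-initialized
    t.m[0] = t.m[5] = t.m[10] = t.m[15] = 1.0f;
    t.m[3] = tx;
    t.m[7] = ty;
    return t;
}

// What the vertex shader would do for one batch: look up each vertex's
// matrix by index and transform the position with it.
std::vector<Vec4> transformBatch(const std::vector<Mat4>& palette,
                                 const std::vector<BatchVertex>& verts) {
    std::vector<Vec4> out;
    out.reserve(verts.size());
    for (const BatchVertex& v : verts) {
        out.push_back(mul(palette.at(v.matrixIndex), v.position));
    }
    return out;
}
```

The register cost is visible in the sizes: each 4x4 float matrix takes four float4 constant registers, so the batch size is capped by how many of those you are willing to spend on the array.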