I was hoping someone could clear something up for me...

In 3D space, I believe the camera view transform matrix is built as scale * rotation * translation.

However, in 2D space (I've been working on a small project) I found that it has to be translation * scale, otherwise the camera sometimes behaves oddly.

After some searching online, I found that the order I gave above for 3D cameras is the accepted one, and so is the one I gave for 2D. Can someone offer a quick explanation of why the two differ?

I'm just confused as to why you scale first and then move the model in 3D, but move first and then scale the sprite in 2D, since all 2D is missing is the Z axis.
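
To make the difference concrete, here is a minimal sketch of what I'm seeing, written in plain Python rather than MonoGame so that it runs on its own. It assumes XNA's row-vector convention (the point is multiplied on the left, so the leftmost matrix is applied first). The helper names, the camera position, and the zoom value are all made up for illustration.

```python
# 3x3 homogeneous matrices for 2D, row-vector convention (point * matrix),
# which is assumed here to mirror how XNA/MonoGame lays out its matrices.

def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    """Translation matrix; the offset sits in the last row."""
    return [[1, 0, 0],
            [0, 1, 0],
            [tx, ty, 1]]

def scale(s):
    """Uniform scale matrix."""
    return [[s, 0, 0],
            [0, s, 0],
            [0, 0, 1]]

def transform(point, m):
    """Apply matrix m to a 2D point treated as the row vector (x, y, 1)."""
    row = [point[0], point[1], 1]
    out = [sum(row[k] * m[k][j] for k in range(3)) for j in range(3)]
    return (out[0], out[1])

# Hypothetical camera: located at (100, 50) in the world, zoomed in 2x.
camera_pos = (100, 50)
zoom = 2
T = translation(-camera_pos[0], -camera_pos[1])  # move the world under the camera
S = scale(zoom)

# A world point 10 units right of and 10 units below the camera.
point = (110, 60)

translate_then_scale = transform(point, mat_mul(T, S))  # T applied first
scale_then_translate = transform(point, mat_mul(S, T))  # S applied first

print(translate_then_scale)  # (20, 20)
print(scale_then_translate)  # (120, 70)
```

With translation * scale, the point ends up 20 units from the view origin, which is its 10-unit distance from the camera doubled by the zoom. With scale * translation, the point's world coordinates are doubled before the unscaled camera offset is subtracted, so the result shifts depending on where the camera is.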

Additional Info: I'm coding in MonoGame (XNA)