CTM * ScaleMatrix will give one result, ScaleMatrix * CTM will give another. One applies the scale "inside" the coordinate system defined by CTM, the other scales along the world axes. I'm not sure which is which offhand (it depends on whether your library treats points as row vectors or column vectors), but I'm pretty sure "inside" is what you want. Then a Y scale matrix will scale the local Y axis, so the shape stretches along its own Y axis regardless of how the CTM is rotated.
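If you want to check which order is which for your setup, here's a rough sketch in plain Python. It assumes the row-vector convention (a point is transformed as p * M), which may not match your library, and the helper names (matmul, rotation, scale, dot) are made up for the example. Under that assumption, ScaleMatrix * CTM is the "inside" version: the two axis rows stay orthogonal, while CTM * ScaleMatrix skews them.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation(theta):
    """2D rotation for row vectors (p' = p * M); each row is a rotated axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def scale(sx, sy):
    """2D scale matrix."""
    return [[sx, 0.0], [0.0, sy]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

ctm = rotation(math.radians(30))   # stand-in for an already rotated CTM
s = scale(1.0, 2.0)                # Y scale of 2

local = matmul(s, ctm)   # scale hits the point first: stretches the local Y axis
world = matmul(ctm, s)   # scale hits the point last: stretches along world Y

print(dot(local[0], local[1]))   # about 0, axes still orthogonal
print(dot(world[0], world[1]))   # clearly nonzero, axes are skewed
```

With column vectors (p' = M * p) the two orders swap roles, so it's worth running a check like this against whatever matrix type you actually use.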

Something that took me a long time to notice is that a row-major rotate/scale matrix is really just an array of axis vectors. In 3D, with a 3x3 matrix, the first 3 elements are the X axis (1,0,0 in an identity matrix), the middle 3 are the Y axis (0,1,0 in identity), and the last 3 are the Z axis (0,0,1). After rotating, those axis vectors will be pointing in different directions, but they'll still be orthogonal to each other. And the length of each axis vector is the scale for that axis. Instead of applying a Y scale matrix, you could cheat and simply multiply the middle 3 elements by the Y scale you want.
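Here's a small sketch of that cheat in plain Python, again assuming row vectors (p * M) so that the rows are the axes. The function names are just for illustration. It multiplies the middle row by the Y scale and compares the result with doing a real matrix multiply by a Y scale matrix.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation_z(theta):
    """3x3 rotation about Z for row vectors; rows are the X, Y, Z axes."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def scale_y_axis(m, sy):
    """Return a copy of m with its Y axis (the middle row) multiplied by sy."""
    return [[sy * v for v in row] if i == 1 else row[:]
            for i, row in enumerate(m)]

def length(v):
    return math.sqrt(sum(x * x for x in v))

m = rotation_z(math.radians(40))

cheat = scale_y_axis(m, 3.0)                       # touch only the middle row
full = matmul([[1, 0, 0], [0, 3, 0], [0, 0, 1]], m)  # proper local Y scale

print(cheat)
print(full)
print(length(cheat[0]), length(cheat[1]), length(cheat[2]))
```

The two matrices should agree to within rounding, and the row lengths come back as the per-axis scales (1, 3, 1 here).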

In 2D, with a 2x2 matrix, the first 2 elements are the X axis (1,0 in an identity matrix) and the last 2 are the Y axis (0,1), and the same rules apply. It's actually simpler, but a bit harder to see the beauty of it in 2D, at least for me.
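For the 2D case, a minimal sketch with the matrix stored as a flat 4-element list might look like this. The layout [xx, xy, yx, yy] and the function names are assumptions for the example, not anything from a particular library. It reads the scale back out as the length of each axis and stretches the local Y axis by touching only the last two elements.

```python
import math

def axis_scales(m):
    """m is a flat row-major 2x2 matrix [xx, xy, yx, yy].
    Returns the length of the X axis and of the Y axis."""
    sx = math.hypot(m[0], m[1])
    sy = math.hypot(m[2], m[3])
    return sx, sy

def scale_local_y(m, k):
    """Return a copy of m with the Y axis (last two elements) multiplied by k."""
    return [m[0], m[1], k * m[2], k * m[3]]

theta = math.radians(25)
c, s = math.cos(theta), math.sin(theta)
m = [c, s, -s, c]            # pure rotation, both axes have length 1

stretched = scale_local_y(m, 1.5)

print(axis_scales(m))
print(axis_scales(stretched))
```

The X axis is left alone, so the stretched matrix still has orthogonal axes; only the length of the Y axis changes.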