I have a 'gizmo' in my level editor which I use to rotate, translate and scale objects. The gizmo is just a mesh with three arrows, one pointing along each axis; you can hover over and click an arrow to translate, rotate and so on - standard stuff. The gizmo is drawn in line with the object it is targeting, but at a set distance from the camera, so however near or far the object is, the gizmo stays the same size on screen. This all works fine.
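For clarity, the constant-size placement is roughly this (a sketch with illustrative names, not my actual code): the gizmo sits on the ray from the camera toward the target, at a fixed distance.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Place the gizmo on the ray from the camera toward the target object,
// at a fixed distance, so its apparent on-screen size stays constant.
Vec3 gizmoPosition(const Vec3& camPos, const Vec3& targetPos, float fixedDist)
{
    Vec3 d{ targetPos.x - camPos.x,
            targetPos.y - camPos.y,
            targetPos.z - camPos.z };
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    if (len < 1e-6f)
        return camPos;              // degenerate: camera on top of the object
    float s = fixedDist / len;      // normalise and scale in one step
    return { camPos.x + d.x * s, camPos.y + d.y * s, camPos.z + d.z * s };
}
```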
What I'm trying to do now is offer local, world and screen coordinate systems for rotating, translating, etc. Local was simple: I basically borrow the world matrix from the targeted object (the thing we're rotating), with the scale removed. World is even simpler, as there's no rotation at all - it's just an identity matrix translated to the position of the targeted object (again placed at that set distance from the camera).
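In code, the two frames look roughly like this (a sketch; I'm assuming a column-major 4x4 convention, and the names are illustrative):

```cpp
#include <cmath>

// 4x4 column-major matrix, m[col][row] - an assumed convention for this sketch.
struct Mat4 { float m[4][4]; };

// Local frame: copy the target object's world matrix and strip the scale
// by normalising each basis axis, keeping only rotation and translation.
Mat4 localGizmoFrame(const Mat4& objectWorld)
{
    Mat4 out = objectWorld;
    for (int c = 0; c < 3; ++c) {
        float len = std::sqrt(out.m[c][0] * out.m[c][0] +
                              out.m[c][1] * out.m[c][1] +
                              out.m[c][2] * out.m[c][2]);
        if (len > 1e-6f)
            for (int r = 0; r < 3; ++r)
                out.m[c][r] /= len;
    }
    return out;
}

// World frame: identity rotation, translated to the object's position.
Mat4 worldGizmoFrame(float x, float y, float z)
{
    Mat4 out = {};
    out.m[0][0] = out.m[1][1] = out.m[2][2] = out.m[3][3] = 1.0f;
    out.m[3][0] = x;
    out.m[3][1] = y;
    out.m[3][2] = z;
    return out;
}
```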
Screen space is causing me lots of headaches. My train of thought was to use the inverse view-projection matrix of the camera, which effectively cancels out what gets applied at the rendering stage, leaving the gizmo aligned with the screen. This does work to a certain extent: the gizmo stays in the right position on the object, and in an orientation aligned with the screen, no matter where my camera is. The problem is that the view-projection matrix intrinsically contains scale (for perspective), which causes my gizmo to get smaller the further away from the object I am - even though the gizmo is still positioned that set distance from the camera.
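To show what I mean by "aligned with the screen", here is a sketch of the orientation part alone (illustrative names, and I'm assuming the view matrix's upper 3x3 is a pure rotation, in which case its inverse is just its transpose and is scale-free - unlike the full inverse view-projection I'm currently using):

```cpp
// 3x3 column-major rotation - an assumed convention for this sketch.
struct Mat3 { float m[3][3]; };

// A screen-aligned basis: invert the camera's view rotation so the gizmo's
// axes line up with the screen's right/up/forward directions. For a pure
// rotation, the inverse is the transpose, which carries no scale at all.
Mat3 screenAlignedBasis(const Mat3& viewRotation)
{
    Mat3 out;
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 3; ++r)
            out.m[c][r] = viewRotation.m[r][c]; // transpose = inverse rotation
    return out;
}
```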
I've tried lots of different methods of removing the scale, but they all mess up the rendering of the gizmo. Is there a better approach, or a step I'm missing?