The normal matrix is built by taking the upper 3×3 part of the world (or world-view) matrix, transposing it, and then inverting it. The order of the two operations does not matter; the result is the same either way.
normalMat = inverse(transpose(upper33(worldMat)))
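As a rough sketch of that formula in plain Python (no math library assumed): the helper names `cross`, `dot`, and `normal_matrix`, and the sample `world` matrix, are made up for illustration. It relies on the fact that the inverse transpose of a 3×3 matrix equals its cofactor matrix divided by its determinant, and that the cofactor rows are cross products of the other two rows.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normal_matrix(world):
    """Inverse transpose of the upper-left 3x3 of a 4x4 matrix (list of rows)."""
    # Only the upper-left 3x3 is read, so any translation is ignored.
    r0, r1, r2 = (list(row[:3]) for row in world[:3])
    # Rows of the cofactor matrix are cross products of the other two rows.
    cof = [cross(r1, r2), cross(r2, r0), cross(r0, r1)]
    det = dot(r0, cof[0])
    if det == 0:
        raise ValueError("matrix is singular, no normal matrix exists")
    return [[c / det for c in row] for row in cof]

# Hypothetical world matrix: scale X by 2, plus some translation.
world = [[2, 0, 0, 5],
         [0, 1, 0, -3],
         [0, 0, 1, 7],
         [0, 0, 0, 1]]
normal_mat = normal_matrix(world)
print(normal_mat)
```

Note that the translation is gone from the result but the scale is not: it has been inverted (2 became 0.5), not removed.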
It drops translation, but it does not remove scale. Its purpose is to correct the direction normals face under non-uniform scaling. For example, if the model is scaled by 2 along its X axis, multiplying a normal by the plain world matrix stretches its X component as well, so the normal leans further away from 0 along the X axis, when it should actually tilt more and more towards 0 along the X axis as the model’s X scaling increases.
Using the inverse transpose applies the reciprocal of the scale to the normal instead, which is what makes this happen.
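A small numeric check of the X-scale-by-2 case, as a sketch only: the surface, the `mul`/`dot` helpers, and the variable names are invented for the example, and the inverse transpose of the pure scale matrix is written out by hand (reciprocal of each scale factor). A normal is correct if it stays perpendicular to the transformed surface tangent.

```python
def mul(m, v):
    """3x3 matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

scale = [[2.0, 0.0, 0.0],          # model scaled by 2 along X
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
inv_transpose = [[0.5, 0.0, 0.0],  # inverse transpose of the matrix above
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]

tangent = [1.0, -1.0, 0.0]   # direction lying in a 45-degree surface
normal = [1.0, 1.0, 0.0]     # perpendicular to it (left unnormalized here)

new_tangent = mul(scale, tangent)         # the surface after scaling
naive = mul(scale, normal)                # normal pushed further along X
corrected = mul(inv_transpose, normal)    # normal pulled back towards 0 on X

print(dot(naive, new_tangent))       # non-zero: no longer perpendicular
print(dot(corrected, new_tangent))   # zero: still perpendicular
```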
In short, the inverse-transpose matrix is meant to correct artifacts under non-uniform scaling conditions. It does not remove scaling.
To remove scaling you have to either renormalize the vector after multiplying it by the inverse-transpose matrix (most common, and it always works) or normalize the 3 rows of the inverse-transpose matrix up front (faster, but only exact when the scale is uniform).
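Both options, sketched in plain Python with made-up helper names and hand-written matrices. The first part renormalizes the vector after the multiply. The second part normalizes the matrix rows once, and is shown with a uniform scale of 3 because, as far as I can tell, that is the case where the shortcut gives exactly unit-length results.

```python
import math

def mul(m, v):
    """3x3 matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def length(v):
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    l = length(v)
    return [x / l for x in v]

normal = normalize([1.0, 1.0, 0.0])

# Option 1: renormalize the vector after transforming it.
inv_transpose = [[0.5, 0.0, 0.0],   # inverse transpose of "scale X by 2"
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]
transformed = mul(inv_transpose, normal)   # right direction, wrong length
unit_normal = normalize(transformed)       # right direction, length 1

# Option 2: normalize the matrix rows once (uniform scale of 3 assumed).
uniform_inv_transpose = [[1 / 3, 0.0, 0.0],
                         [0.0, 1 / 3, 0.0],
                         [0.0, 0.0, 1 / 3]]
prenormalized = [normalize(row) for row in uniform_inv_transpose]
fast_normal = mul(prenormalized, normal)   # no per-vector normalize needed

print(length(transformed), length(unit_normal), length(fast_normal))
```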
L. Spiro