Hello

I have read a bunch of articles and tutorials about normal mapping. Many of them focus on how to create normal maps, but few clearly describe how they are used in the graphics pipeline.

I understand that the normals of a high-res model are saved as pixels in a texture, and that this texture is applied to a low-res model to get per-pixel normals.

I understand that these normals are in high-res model space, and that for them to work, the low-res model must be aligned with the high-res model (same position, rotation and scaling).

I understand that if the low-res model is rotated, the normals in the normal map will remain in high-res model space and there will be a mismatch.

What I don't understand is why we call it tangent space instead of just "original high-res model space", and why it is continuously changing for each fragment. One article says that each normal is in the space of its individual triangle. Does that mean that in the high-res model each face has its own space, and the neighbouring face has a different space? Why not just one space for all faces?

Why can't I just take the light vector and transform it to model space using the inverse of the model transformation matrix?
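To make concrete what I mean, here is a minimal sketch in Python (plain lists, no libraries). The helper names `rotation_z`, `transpose`, `mat_vec` and `dot` are made up for illustration, and I am assuming the model matrix is a pure rotation, so that its inverse is simply its transpose.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation about the Z axis, standing in for the model matrix."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def transpose(m):
    """Transpose of a 3x3 matrix (equals the inverse for a pure rotation)."""
    return [[m[r][c] for r in range(3)] for c in range(3)]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

model = rotation_z(math.radians(90.0))   # low-res model rotated by 90 degrees
inv_model = transpose(model)             # assumption: rotation only, no scale

light_world = [1.0, 0.0, 0.0]            # direction towards the light, world space
normal_model = [0.0, -1.0, 0.0]          # pretend this was read from the normal map

# Option A (my proposal): bring the light into model space, leave the normal alone.
light_model = mat_vec(inv_model, light_world)
lambert_a = max(0.0, dot(normal_model, light_model))

# Option B: bring the stored normal into world space instead.
normal_world = mat_vec(model, normal_model)
lambert_b = max(0.0, dot(normal_world, light_world))
```

As far as I can tell, `lambert_a` and `lambert_b` come out the same for a rigid model like this, which is why I am asking.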

What I can think of is that when I animate a character, each face will be transformed differently. This means that the normals of each face will be transformed differently by the bone matrices. Why can't I just take the light vector, transform it to "face space" using the inverse bone matrices, and use the normal map from there on? Why do we involve a tangent and a bitangent derived from the UVs?
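For reference, this is my understanding of what the tutorials compute per triangle, written as a small Python sketch. The function names `triangle_tangent_basis` and `tangent_to_model` and the two sample triangles are my own, not taken from any particular article, and I am assuming the UVs are not degenerate.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def triangle_tangent_basis(p0, p1, p2, uv0, uv1, uv2):
    """Tangent (direction of increasing U), bitangent (increasing V) and
    face normal of one triangle, all expressed in model space."""
    e1, e2 = sub(p1, p0), sub(p2, p0)
    du1, dv1 = sub(uv1, uv0)
    du2, dv2 = sub(uv2, uv0)
    det = du1 * dv2 - du2 * dv1
    if det == 0.0:
        raise ValueError("degenerate UVs")
    f = 1.0 / det
    tangent = [f * (dv2 * a - dv1 * b) for a, b in zip(e1, e2)]
    bitangent = [f * (du1 * b - du2 * a) for a, b in zip(e1, e2)]
    return normalize(tangent), normalize(bitangent), normalize(cross(e1, e2))

def tangent_to_model(t, b, n, v):
    """Treat t, b, n as the columns of a 3x3 matrix and apply it to v."""
    return [t[i] * v[0] + b[i] * v[1] + n[i] * v[2] for i in range(3)]

uvs = ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0])
flat = ([0, 0, 0], [1, 0, 0], [0, 1, 0])     # triangle lying in the XY plane
tilted = ([0, 0, 0], [1, 0, 0], [0, 0, 1])   # same UVs, lying in the XZ plane

sample = [0.0, 0.0, 1.0]   # the same "straight out of the surface" texel for both
normal_on_flat = tangent_to_model(*triangle_tangent_basis(*flat, *uvs), sample)
normal_on_tilted = tangent_to_model(*triangle_tangent_basis(*tilted, *uvs), sample)
```

So the same texel ends up as a different model-space normal depending on which triangle it lands on. That seems to be the "each triangle has its own space" part, but I still don't see why it has to be done this way.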

It is all just a big mess to me. If anyone can explain it, please do.

Cheers!