Technically I can misuse any data structure. A vertex is a point with edges to other points. I was just reacting to the comprehension part — premature optimization. I remember engines that could only do flat-shaded polygons. I don't know why current APIs make the hack of vertex normals easier to use than the mathematically sound way of using the polygon's normal, and I don't think it has to be that way.

The GPU's concept of a vertex is "a tuple of attributes" -- such as position, normal, and texture coordinate -- so the mathematical definition doesn't really apply.

A GPU vertex doesn't even need to include position! When drawing curved shapes, "GPU vertices" are actually "control points" and not vertices at all.

There's also no native way to supply per-primitive data to the GPU -- such as supplying positions per-vertex and normals per-face. Implementing per-face attributes is actually harder (and requires more computation time) than supplying attributes per vertex, because the native "input assembler" only has the concept of per-vertex and per-instance attributes.

You could treat each vertex as a whole triangle and expand it in a geometry shader. It's just a bit cumbersome, and performance would be awful.