yeah, when blasting geometry at the GPU you generally format it as fat vertices, each with its attributes (position, normal, texture coordinate, color, etc.) packed together one after the other. so you have one buffer of pos0, normal0, texcoord0, color0, pos1, normal1, texcoord1, color1, pos2, normal2... you get the idea. then there's another buffer holding the indices for your triangles, like 0, 1, 2, 0, 2, 3, ... each index grabs the contiguous blob of data for the vertex at that index.
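to make the layout concrete, here's a rough sketch in Python using the standard `struct` module. the specific attribute sizes (3 floats for position, 3 for normal, 2 for texcoord, 4 bytes for color), the 16-bit indices, and the helper names `pack_vertices`, `pack_indices` and `fetch_vertex` are all assumptions for illustration, not something any particular API requires.

```python
import struct

# Assumed fat-vertex layout: position (3 floats), normal (3 floats),
# texcoord (2 floats), color (4 unsigned bytes).
# "<" means little-endian with no padding, so the stride is 36 bytes.
VERTEX = struct.Struct("<3f3f2f4B")

def pack_vertices(vertices):
    """Pack (pos, normal, uv, color) tuples into one interleaved byte buffer."""
    buf = bytearray()
    for pos, normal, uv, color in vertices:
        buf += VERTEX.pack(*pos, *normal, *uv, *color)
    return bytes(buf)

def pack_indices(indices):
    """Pack triangle indices as 16-bit unsigned ints (an assumed index size)."""
    return struct.pack(f"<{len(indices)}H", *indices)

def fetch_vertex(buffer, index):
    """Mimic what an index does: read the contiguous blob at index * stride."""
    return VERTEX.unpack_from(buffer, index * VERTEX.size)

# A quad made of two triangles that share vertices 0 and 2.
white = (255, 255, 255, 255)
up = (0.0, 0.0, 1.0)
quad = [
    ((0.0, 0.0, 0.0), up, (0.0, 0.0), white),
    ((1.0, 0.0, 0.0), up, (1.0, 0.0), white),
    ((1.0, 1.0, 0.0), up, (1.0, 1.0), white),
    ((0.0, 1.0, 0.0), up, (0.0, 1.0), white),
]
vertex_buffer = pack_vertices(quad)
index_buffer = pack_indices([0, 1, 2, 0, 2, 3])
```

the point is that one index is enough to find everything about a vertex: `fetch_vertex(vertex_buffer, 2)` reads 36 contiguous bytes starting at offset 72 and gets position, normal, texcoord and color in one go.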
you'll find a lot of software like Max and Maya, and formats like OBJ, don't specify meshes this way. they specify them as you describe, where each face or triangle has separate indices for each vertex attribute. so positions 0,1,2 and normals 6,1,3 and texcoords 4,3,1.
when you load data like this you do indeed have to duplicate vertices whenever the same vertex is referenced with a different combination of attributes. this can mean many duplicate copies of attribute values (like the texture coordinate 0,0 used by many vertices), but it keeps the GPU from running all around memory gathering attributes from many different locations, and it keeps the APIs a bit simpler as well. in the general case of a big complex mesh created by an artist, there's unlikely to be much savings from indexing attributes separately.
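one common way to do that conversion at load time is to treat each distinct combination of attribute indices as its own vertex and reuse it when the same combination shows up again. here's a minimal sketch of that idea in Python. the function name `unify` and the input shapes are made up for illustration, and it assumes 0-based indices given in (position, texcoord, normal) order (real OBJ files are 1-based, so you'd subtract 1 while parsing).

```python
def unify(positions, texcoords, normals, faces):
    """Convert separately indexed attributes into fat vertices plus one index buffer.

    faces: list of triangles, each a list of three
           (pos_index, texcoord_index, normal_index) tuples, 0-based.
    """
    vertices = []  # unified fat vertices: (position, normal, texcoord)
    indices = []   # single index buffer referencing `vertices`
    seen = {}      # (pos_i, uv_i, normal_i) -> index into `vertices`
    for face in faces:
        for key in face:
            if key not in seen:
                p, t, n = key
                seen[key] = len(vertices)
                vertices.append((positions[p], normals[n], texcoords[t]))
            indices.append(seen[key])
    return vertices, indices


positions = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
texcoords = [(0, 0), (1, 0), (1, 1), (0, 1)]
normals = [(0, 0, 1), (0, 1, 0)]

# Both triangles use normal 0, so the two shared corners are reused.
smooth = [[(0, 0, 0), (1, 1, 0), (2, 2, 0)],
          [(0, 0, 0), (2, 2, 0), (3, 3, 0)]]
smooth_vertices, smooth_indices = unify(positions, texcoords, normals, smooth)

# The second triangle uses normal 1, so the shared corners must be duplicated.
hard = [[(0, 0, 0), (1, 1, 0), (2, 2, 0)],
        [(0, 0, 1), (2, 2, 1), (3, 3, 1)]]
hard_vertices, hard_indices = unify(positions, texcoords, normals, hard)
```

in the smooth case you end up with 4 vertices and indices 0, 1, 2, 0, 2, 3. in the hard-edge case the two shared positions get a second copy with the other normal, so you end up with 6 vertices and indices 0, 1, 2, 3, 4, 5.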