The idea is that you don't describe your mesh by listing its vertices triangle by triangle.
Instead, you store each unique vertex once and use indices into that list to describe the triangles.
At the end, to actually render a rectangle, you have to give OpenGL two triangles, which are six vertices. So you can either give OpenGL the six vertices up front, or you can give it the four vertices that actually define the rectangle, and tell GL "I gave you four vertices, I want you to make two triangles out of them like this index buffer tells you".
When using indices, GL simply looks the vertices up in your vertex buffer. Is the first value in the index buffer 0? OK, let's grab the first vertex in the vertex buffer. The second value is 1? Grab the second vertex. 2? Grab the third. Now we have grabbed three vertices, which form the first triangle.
Then it continues with indices 0, 2 and 3. Note that we reused two vertices, 0 and 2. Even though we sent them only once, we actually used them twice.
While we only need four vertices to represent this rectangle, the graphics card wants triangles, each one having three vertices.
So if we were to split the rectangle into triangles, we would need to send six vertices instead of four.
Another way to do this is to send only those four vertices, and together with them tell the graphics card how to form triangles from them.
This is where the index buffer (called an element buffer in OpenGL) comes in.
The element buffer has numbers that index your vertices.
E.g. 0 would be the first vertex, 1 the second, and so on.
So with an element buffer, to form the triangles needed for the rectangle, we need to send these indices: [0, 1, 2, 0, 2, 3]. If you replace the numbers with the actual vertices they index, you will see you get the original six vertices to form the triangles.
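As a rough illustration of that substitution (plain Python on the CPU, not OpenGL code; the corner coordinates and the name `expand` are made up for this example), expanding the index list gives back the six vertices of the two triangles:

```python
# Four unique corners of a rectangle (hypothetical coordinates).
vertices = [
    (0.0, 0.0),  # 0: bottom-left
    (1.0, 0.0),  # 1: bottom-right
    (1.0, 1.0),  # 2: top-right
    (0.0, 1.0),  # 3: top-left
]

# Two triangles, described only by indices into `vertices`.
indices = [0, 1, 2, 0, 2, 3]


def expand(vertices, indices):
    """Replace every index with the vertex it points at."""
    return [vertices[i] for i in indices]


flat = expand(vertices, indices)
triangles = [flat[i:i + 3] for i in range(0, len(flat), 3)]

for triangle in triangles:
    print(triangle)
```

Six vertices come out, but only four distinct ones: corners 0 and 2 each appear in both triangles.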
Indexing reduces memory and bandwidth (except for very uncommon worst case scenarios), which in turn helps rendering speed.
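To put rough numbers on that, here is a back-of-the-envelope sketch. The vertex layout (position, normal and texture coordinate as 32-bit floats) and the 16-bit indices are assumptions chosen for illustration, and `mesh_bytes` is a made-up helper, not an OpenGL function:

```python
FLOAT_BYTES = 4
INDEX_BYTES = 2  # assuming 16-bit indices (GL_UNSIGNED_SHORT)

# Assumed layout: 3 floats position + 3 floats normal + 2 floats UV.
VERTEX_BYTES = (3 + 3 + 2) * FLOAT_BYTES


def mesh_bytes(unique_vertices, triangle_count, indexed):
    """Bytes needed for the vertex data (plus indices, if indexed)."""
    corners = triangle_count * 3
    if indexed:
        return unique_vertices * VERTEX_BYTES + corners * INDEX_BYTES
    return corners * VERTEX_BYTES


# The rectangle: 4 unique vertices, 2 triangles.
print(mesh_bytes(4, 2, indexed=False))  # 192
print(mesh_bytes(4, 2, indexed=True))   # 140
```

The worst case is a mesh where no vertex is shared between triangles: then the indices are pure overhead. With these assumptions, `mesh_bytes(6, 2, indexed=True)` gives 204 bytes, more than the 192 needed without indices.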
Most file formats (to which you export from Blender, 3ds Max, etc.) support indexing, but there are two variants of indexing for file formats.
In modeling tools (and in fact, in your OpenGL code too!), a "vertex" isn't a position, it's a combination of a position, a normal vector, a color, a texture coordinate, and so on.
Every one of these things is called a vertex attribute, and is only one part of the whole vertex.
OpenGL (and Direct3D) only allow 1D indexing, or in other words - you have one index that points to all the vertex attributes.
For example, if you have an array of vertex positions and another array of vertex normals, then index 0 would be the first position and the first normal.
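A tiny sketch of what "one index for all attributes" means, in plain Python with made-up data (the `fetch_vertex` helper is hypothetical and only mimics the lookup the GPU performs):

```python
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]


def fetch_vertex(index):
    """One index selects the same slot in every attribute array."""
    return positions[index], normals[index]


position, normal = fetch_vertex(0)
```

There is no way to ask for "position 2 with normal 0" here; both arrays are always read at the same index.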
This might seem obvious, but some file formats don't actually store their data this way.
In most cases, a model doesn't actually need the same number of positions, normals, and so on.
If there are many vertices that have the same normal, the file might store only one normal, and let them all share it.
You then have different indices for each vertex attribute, which you can't directly use for rendering.
In this case, you will have to "flatten" the arrays and fill them up with all the shared data.
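One possible way to do that flattening, sketched in plain Python. The input mimics OBJ-style faces where each corner carries a separate position index and normal index (0-based here for simplicity; real OBJ files use 1-based indices). The function name `flatten` and the sample data are made up for this example. Each distinct (position, normal) pair becomes one output vertex, so corners that share both still share an index:

```python
def flatten(positions, normals, corners):
    """Convert per-attribute indices into single-index arrays.

    corners: list of (position_index, normal_index) pairs,
             three per triangle.
    """
    out_positions, out_normals, out_indices = [], [], []
    seen = {}  # (position_index, normal_index) -> new single index

    for key in corners:
        if key not in seen:
            position_index, normal_index = key
            seen[key] = len(out_positions)
            out_positions.append(positions[position_index])
            out_normals.append(normals[normal_index])
        out_indices.append(seen[key])

    return out_positions, out_normals, out_indices


# A flat rectangle: four positions, but only one shared normal.
positions = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
normals = [(0, 0, 1)]
corners = [(0, 0), (1, 0), (2, 0), (0, 0), (2, 0), (3, 0)]

flat_positions, flat_normals, flat_indices = flatten(positions, normals, corners)
print(flat_indices)       # [0, 1, 2, 0, 2, 3]
print(len(flat_normals))  # 4: the shared normal is now stored once per vertex
```

The result has one position and one normal per output vertex plus a single index list, which is the shape OpenGL can render directly. The price is the duplicated normal data the file format was avoiding.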
This can be seen in the *.OBJ format, which is completely obsolete and bad (it's the most terrible format in existence, but for some reason it's used everywhere).
There are versions of the instanced drawing functions that take a range of vertices.
The general idea (I believe) is to cache static objects together, ones that you know will never move anyway. But this also makes culling them harder. I assume Google can give more information; I haven't had to handle big scenes so far.
Sending positions, transformations, etc. to batches (and instanced draws) is really an application-specific thing, but you'll usually use something to identify each mesh (done for you in instanced rendering), and based on that select the correct data from a uniform buffer / texture buffer / whatever.
VAOs are mainly for convenience; I am not sure if they actually improve performance, and if so it's probably by a little. They store the part of your context's state that is related to vertex input (so the element buffer binding, the vertex attribute setup and the buffers it reads from, etc.), so that when you want to draw something you only need to bind the VAO and it restores that state for you.