The main goal of Geometry Clipmaps is enabling you to draw terrains with higher resolution than you can fit in GPU memory (and still have space for the rest of the game assets).
Example: The Witcher 3 height map is 23552x23552 = ~1GB (0.37m resolution, I think there's a typo in the presentation). Clearly too much memory if you want other assets in your game.
You *probably* don't need full resolution height data to draw distant mountains, so you can use a clipmap.
The top layer in blue is the full res height map. The blue layers below it are the mips (each half res of the previous one).
But you only load the green areas (centered around the camera) into GPU memory.
Continuing with The Witcher 3 as the example:
It uses 5 clipmap layers, each with resolution 1024x1024 (e.g. stored as a texture array).
1st layer - Full res - 1024 * 0.37m ~ 379m across, centered on the camera
2nd layer - Half res - 1024 * 0.74m ~ 758m across, centered on the camera. (0.74m because it's half res, so each pixel corresponds to double the distance)
5th layer - 1/16 res - 1024 * 5.92m ~ 6km across, centered on the camera
The full map is 23552 * 0.37m ~ 8.7km, so they're able to draw most of the map using only 1024*1024*5*2 bytes ~ 10 MB of height data.
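If you want to check those numbers yourself, here is a small sketch of the arithmetic. The constants come from the figures above; the function names are just mine for illustration.

```python
FULL_RES_M = 0.37      # metres per texel in the full-res layer (figure quoted above)
CLIP_SIZE = 1024       # texels per side of each clipmap layer
NUM_LAYERS = 5
BYTES_PER_TEXEL = 2    # 2 bytes per height sample, as in the estimate above

def layer_texel_size(layer):
    """Metres covered by one texel; doubles with every coarser layer."""
    return FULL_RES_M * (2 ** layer)

def layer_coverage_m(layer):
    """Width in metres of the square region one layer covers."""
    return CLIP_SIZE * layer_texel_size(layer)

def clipmap_bytes():
    """Total height data resident in GPU memory for all layers."""
    return CLIP_SIZE * CLIP_SIZE * NUM_LAYERS * BYTES_PER_TEXEL

for layer in range(NUM_LAYERS):
    print(layer, layer_texel_size(layer), layer_coverage_m(layer))
print(clipmap_bytes() / (1024 * 1024), "MiB")
```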
Since you have all the data you need in the clipmap, you don't need as many vertices as the heightmap has pixels to render the terrain. Just create one 16x16 patch of vertices (15x15 quads), which doesn't need any UVs, and reuse it to draw the whole terrain. In the vertex shader, calculate the world position of the current vertex and use that to sample the clipmap at the correct position/layer.
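To make that a bit more concrete, here is a rough CPU-side sketch in Python of the patch and of what the vertex shader would compute per vertex. It is not taken from any engine. I'm assuming each patch is placed at an integer texel offset of its layer (`patch_origin_texels`, a name I made up) and that the layer texture is addressed with wrap-around, so a texel index modulo the texture size gives the sample position.

```python
PATCH_VERTS = 16       # vertices per side, i.e. 15x15 quads
FULL_RES_M = 0.37      # metres per texel in layer 0
CLIP_SIZE = 1024       # texels per side of each clipmap layer

def make_patch_vertices():
    """Integer grid coordinates only; no UVs or heights are stored."""
    return [(i, j) for j in range(PATCH_VERTS) for i in range(PATCH_VERTS)]

def make_patch_indices():
    """Two triangles per quad, shared by every patch instance."""
    indices = []
    for j in range(PATCH_VERTS - 1):
        for i in range(PATCH_VERTS - 1):
            a = j * PATCH_VERTS + i   # top-left vertex of the quad
            b = a + 1                 # top-right
            c = a + PATCH_VERTS       # bottom-left
            d = c + 1                 # bottom-right
            indices += [a, c, b, b, c, d]
    return indices

def vertex_stage(grid_xy, patch_origin_texels, layer):
    """Emulates the per-vertex work: world position plus where to sample.

    patch_origin_texels is the patch corner in texel units of `layer`
    (an assumption of this sketch, so vertices land exactly on texels).
    """
    spacing = FULL_RES_M * (2 ** layer)
    tx = patch_origin_texels[0] + grid_xy[0]
    tz = patch_origin_texels[1] + grid_xy[1]
    world_xz = (tx * spacing, tz * spacing)
    # Wrap-around lookup into the layer's 1024x1024 texture.
    sample = (tx % CLIP_SIZE, tz % CLIP_SIZE, layer)
    return world_xz, sample
```

The height read at `sample` then becomes the vertex's Y coordinate; the same vertex and index buffers are reused for every patch, only the origin and layer change per draw.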
Since you only have full res height data close to the camera, patches that sample a coarser layer of the clipmap should double the distance between their vertices for each layer down, so each vertex still matches one pixel of terrain data.
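A minimal sketch of how a patch could pick its layer and vertex spacing, assuming each layer is a 1024-texel square centered on the camera (so it reaches half of its width in each direction). The function names are hypothetical, and a real implementation would also keep a margin so the whole patch fits inside the layer.

```python
FULL_RES_M = 0.37      # metres per texel in layer 0
CLIP_SIZE = 1024       # texels per side of each clipmap layer
NUM_LAYERS = 5

def vertex_spacing(layer):
    """Distance between neighbouring vertices; one vertex per texel."""
    return FULL_RES_M * (2 ** layer)

def layer_for_patch(patch_center, camera):
    """Finest layer whose camera-centered square contains the patch center.

    Returns None when the point lies beyond the coarsest layer.
    """
    # Square regions, so compare the larger of the two axis distances.
    dist = max(abs(patch_center[0] - camera[0]),
               abs(patch_center[1] - camera[1]))
    for layer in range(NUM_LAYERS):
        half_extent = 0.5 * CLIP_SIZE * vertex_spacing(layer)
        if dist < half_extent:
            return layer
    return None
```

For example, with the camera at the origin a patch 100m away uses layer 0 (0.37m spacing), while one 200m away is already outside the ~189m half extent of layer 0 and uses layer 1 (0.74m spacing).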
You'll run into problems at the borders where patches start using a different layer, because the height data won't match perfectly.
This GPU Gems article explains the types of patches to use and how to hide the seams between patches with different levels of detail.
When the camera moves (enough) you need to update some layers of the texture clipmap with data from the full map texture you have on disk. Toroidal access (addressing the texture with wrap-around) lets you update only the parts of the texture that changed, instead of having to move the parts that are still relevant over the old parts and then fill the "empty" space with new height data.
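Here is a small Python sketch of the toroidal idea for a single layer, under my own assumptions: `source` is a hypothetical callable standing in for the reads from the full map on disk, and coordinates are integer texel indices of that layer. Texels that stay inside the window are never touched; only the newly exposed ones are written, at their wrapped position.

```python
class ToroidalLayer:
    """One clipmap layer stored with wrap-around (toroidal) addressing."""

    def __init__(self, size, source, origin=(0, 0)):
        self.size = size
        self.source = source      # hypothetical: (x, z) -> height, stands in for disk
        self.origin = origin      # min corner of the window, in texels
        self.texels = [[0] * size for _ in range(size)]
        self.writes = 0           # counts texel uploads, for illustration
        ox, oz = origin
        for z in range(oz, oz + size):
            for x in range(ox, ox + size):
                self._write(x, z)

    def _write(self, x, z):
        # A texel always lives at its coordinate modulo the texture size.
        self.texels[z % self.size][x % self.size] = self.source(x, z)
        self.writes += 1

    def read(self, x, z):
        return self.texels[z % self.size][x % self.size]

    def move_to(self, new_origin):
        """Slide the window; rewrite only texels that were not already loaded."""
        old_x, old_z = self.origin
        new_x, new_z = new_origin
        n = self.size
        # Brute-force scan for clarity; a real version would compute the
        # new row/column strips directly instead of testing every texel.
        for z in range(new_z, new_z + n):
            for x in range(new_x, new_x + n):
                inside_old = (old_x <= x < old_x + n) and (old_z <= z < old_z + n)
                if not inside_old:
                    self._write(x, z)
        self.origin = new_origin
```

Moving an 8x8 layer by (2, 1) texels this way rewrites 22 of the 64 texels; the other 42 stay exactly where they were in the texture.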
Some useful links:
http://www.vertexasylum.com/downloads/cdlod/cdlod_latest.pdf (good solution for seams between layers; I use a modified version of this in my demos and it works very well. There's a link to the full source code at the end of the paper.)