Let's think about this.
Let's say you've got millions of voxels in your terrain environment. Rendering all of them at a high level of detail is a performance problem, so you want to render voxels at a lower level of detail when they're far away from the camera. That's the gist of what we're trying to do. But we don't want to spend a huge amount of CPU time figuring out which level of detail to use for each voxel either! The most brute-force version of the algorithm calculates the distance from the camera position to each voxel position, and every time that distance exceeds a set threshold, you drop the LOD by one. The problem is that you'd be calculating the distance between the camera and every one of those millions of voxels (each with an expensive square root), and that cost would eat up much of the performance you gained by switching LODs. So, we want to keep the same idea of picking LODs by distance, but use far fewer distance computations.
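To make the brute-force baseline concrete, here is a minimal sketch of it. The BruteForceLod class and the BandLimits values are made up for illustration, and I'm assuming System.Numerics for Vector3. It compares squared distances, which sidesteps the square root, but you still pay for one check per voxel:

```csharp
using System;
using System.Numerics;

public static class BruteForceLod
{
    // Hypothetical LOD band limits in world units. LOD 0 is the most detailed.
    static readonly float[] BandLimits = { 32f, 64f, 128f };

    // Returns 0 inside the first limit and one more for every limit the voxel is beyond.
    public static int Select(Vector3 camera, Vector3 voxel)
    {
        // Comparing squared distances gives the same ordering without a square root.
        float distSq = Vector3.DistanceSquared(camera, voxel);
        int lod = 0;
        foreach (float limit in BandLimits)
        {
            if (distSq > limit * limit)
                lod++;
        }
        return lod;
    }
}
```

Calling Select once per voxel every frame is exactly the cost we want to avoid.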
Here's where octrees come in handy, and that's why you're reading about them. You can divide your terrain into chunks of blocks, say, 16x16x16 (use powers of 2 if you can), and insert each chunk into your octree. Then, instead of calculating the camera distance to every voxel, you calculate the camera distance to the octree's bounding regions and set the LOD of all objects contained in a region to a preset value. This cuts down the number of distance checks and scales well as you add more game world objects (I assume you're going to have more than just terrain).
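Here is a rough sketch of how that could look, not a complete octree. Chunk, OctreeNode, OctreeLod and the BandLimits values are all hypothetical names for illustration, and Vector3 is assumed to come from System.Numerics. The idea is that when a node's bounding box falls entirely inside one distance band, everything under it gets that LOD from a single bounds test and we never descend into it:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public sealed class Chunk
{
    public int Lod;
}

public sealed class OctreeNode
{
    public Vector3 Min, Max;                                    // axis-aligned bounds of the region
    public List<Chunk> Chunks = new List<Chunk>();              // chunks stored directly in this node
    public List<OctreeNode> Children = new List<OctreeNode>();  // empty for a leaf
}

public static class OctreeLod
{
    // Hypothetical LOD band limits in world units. LOD 0 is the most detailed.
    static readonly float[] BandLimits = { 32f, 64f, 128f };

    static int Band(float distSq)
    {
        int lod = 0;
        foreach (float limit in BandLimits)
            if (distSq > limit * limit) lod++;
        return lod;
    }

    // Squared distance from p to the closest point of the node's box (0 if p is inside).
    static float NearestSq(OctreeNode n, Vector3 p)
    {
        return Vector3.DistanceSquared(p, Vector3.Clamp(p, n.Min, n.Max));
    }

    // Squared distance from p to the farthest corner of the node's box.
    static float FarthestSq(OctreeNode n, Vector3 p)
    {
        Vector3 d = Vector3.Max(Vector3.Abs(p - n.Min), Vector3.Abs(p - n.Max));
        return d.LengthSquared();
    }

    static void Stamp(OctreeNode n, int lod)
    {
        foreach (Chunk c in n.Chunks) c.Lod = lod;
        foreach (OctreeNode child in n.Children) Stamp(child, lod);
    }

    // Assigns a LOD to every chunk under n and returns how many bounds tests were needed.
    public static int Assign(OctreeNode n, Vector3 camera)
    {
        int near = Band(NearestSq(n, camera));
        int far = Band(FarthestSq(n, camera));
        if (near == far || n.Children.Count == 0)
        {
            // Whole region is in one band (or it's a leaf): one test covers everything below.
            // A leaf that straddles two bands takes the more detailed one.
            Stamp(n, near);
            return 1;
        }
        foreach (Chunk c in n.Chunks) c.Lod = near;
        int tests = 1;
        foreach (OctreeNode child in n.Children) tests += Assign(child, camera);
        return tests;
    }
}
```

Only the nodes that straddle a band boundary get subdivided, so the number of tests depends on how many regions sit near a boundary rather than on the total number of chunks.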
I don't know if it's relevant to you or not, but there was a white paper a while back on rendering terrain using geomipmaps. The author had an interesting technique for deciding when to switch to a different LOD which was not based on camera distance, but rather on how much the terrain would pop if you transitioned it to a lower LOD (2.3.1). He basically measures the vertical "pop" between one LOD and another in screen space according to the camera viewing angle, and then switches the terrain LOD if the pop is below some acceptable threshold (e.g., 2 pixels). I implemented a variation of this myself and I like the results.
Here is my code for that:
/// <summary>
/// This calculates the maximum amount of vertical error when switching from one level of detail to another.
/// </summary>
private void CalcError()
{
    for (int LoD = 0; LoD < 4; LoD++)
    {
        //vertex spacing at the next coarser LOD: 2, 4, 8, 16
        int stepSize = 1 << (LoD + 1);
        float d_max = 0;

        //traverse horizontally
        for (int z = 0; z < m_settings.TileCount; z++)
        {
            for (int x = 0; x < m_settings.TileCount; x += stepSize)
            {
                //there are (TileCount + 1) vertices per row
                int row = z * (m_settings.TileCount + 1);
                Vector3 p0 = m_verts[x + row].Position;
                Vector3 p1 = m_verts[(x + (stepSize / 2)) + row].Position; //the vertex the coarser LOD drops
                Vector3 p2 = m_verts[x + stepSize + row].Position;

                Vector3 P = (p0 + p2) / 2.0f;   //the phantom position of p1 is just the average of p0 and p2
                float d = (p1 - P).Length();    //find the error difference between p1 and the phantom point
                if (d > d_max)
                    d_max = d;
            }
        }

        m_max_dy[LoD + 1] = d_max;  //this is the most error we'd get if we switched from LODX -> LODX+1
    }
}
On screen and in the game, what ends up happening is that we use the lowest LOD we can get away with without getting noticeable popping artifacts. So, if you point your camera straight down and view the terrain from above, the very bottom terrain chunk will be at a very low LOD, but you can't really tell, since vertical differences barely show up on screen from that viewing angle.
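To illustrate how a table like m_max_dy could drive the LOD choice, here is a minimal sketch. It is not the formula from the paper and not my production code. PopLod, its parameters and the projection itself are assumptions for illustration: a simple perspective projection with a vertical field of view in radians, and Vector3 from System.Numerics. It projects the stored vertical error into pixels for the current view direction and distance, then takes the coarsest LOD whose pop stays under the tolerance:

```csharp
using System;
using System.Numerics;

public static class PopLod
{
    // Approximate on-screen size, in pixels, of a vertical world-space error.
    public static float ProjectedPixels(float worldError, Vector3 viewDir, float distance,
                                        float fovY, float screenHeight)
    {
        // Only the part of a vertical offset that is perpendicular to the view direction shows up.
        float up = Vector3.Dot(Vector3.Normalize(viewDir), Vector3.UnitY);
        float visible = worldError * (float)Math.Sqrt(Math.Max(0.0, 1.0 - up * up));

        // Pixels per world unit at this distance. Distance is clamped to avoid dividing by zero.
        distance = Math.Max(distance, 1e-3f);
        float pixelsPerUnit = screenHeight / (2f * distance * (float)Math.Tan(fovY / 2));
        return visible * pixelsPerUnit;
    }

    // maxError[i] is the worst vertical error when dropping to LOD i (maxError[0] is unused).
    // Returns the coarsest LOD reachable without any step exceeding tolerancePixels.
    public static int Select(float[] maxError, Vector3 viewDir, float distance,
                             float fovY, float screenHeight, float tolerancePixels)
    {
        int lod = 0;
        for (int i = 1; i < maxError.Length; i++)
        {
            float pop = ProjectedPixels(maxError[i], viewDir, distance, fovY, screenHeight);
            if (pop > tolerancePixels)
                break;
            lod = i;
        }
        return lod;
    }
}
```

With the camera pointing straight down, the visible part of a vertical error is zero, so Select returns the coarsest LOD no matter how bumpy the terrain is, which matches the behavior described above.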