Motivation
Until now, the planetary generation algorithm was running on the CPU synchronously. This means that each time the camera zoomed in on the surface of the planet, each terrain node was getting split into 4 children, and a heightmap was generated synchronously for each child.
Synchronous generation means that rendering is paused until the data is generated for each child node. We're talking of 10-20 milliseconds here, so it's not that slow; but since 4 childs are generated at a time, those numbers are always multiplied by 4. So the cost is around 40-80 ms per node that is getting split. Unfortunately, splits happen in cascade, so it's not rare to have no split at all during one second, and suddenly 2 or 3 nodes get split, resulting in a pause of hundreds of milliseconds in the rendering.
I've addressed this issue by adding asynchronous CPU terrain generation: a thread is used to generate data at its own rythm, and the rendering isn't affected too harshly anymore. This required to introduce new concepts and new interfaces ( like a data generation interface ) to the engine, which took many weeks.
After that, I prepared a new data generation interface that uses the GPU instead of the CPU. To make it short, I encountered a lot of practical issues with it, like PBOs ( pixel buffer objects ) not behaving as expected on some video cards, or the lack of synchronization extension on ATI cards ( I ended up using occlusion queries with an empty query to know when a texture has been rendered ), but now it's more or less working.
Benefits
There are a lot of advantages to generating data on the GPU instead of the CPU: the main one is that, thanks to the higher performance, I will now be able to generate normal maps for the terrain, which was too slow before. This will increase lighting and texturing accuracy, and make planets ( especially when seen from orbit ) much nicer. Until now, planets seen from space weren't looking too good due to per-vertex texturing; noise and clouds helped to hide the problem a bit, but if you look carefully at the old screenshots, you'll see what I mean.
The second advantage is that I can increase the complexity of the generation algorithm itself, and introduce new noise basis types, in particular the cell noise ( see Voronoi diagrams on wikipedia ).
Another advantage is debug time. Previously, playing with planetary algorithms and parameters was taking a lot of time: changing some parameters, recompiling, launching the client, spawning a planet, moving the camera around the planet to see how it looks, rinse and repeat. Now I can just change the shader code, it gets automatically reloaded by the engine and the planet updates on-the-fly: no need to quit the client and recompile. It's a lot easier to play with new planets, experiment, change parameters, etc..
I'm not generating normal maps yet ( I will probably work on that next week ), and there's no texturing; in the coming pictures, all the planet pictures you will see only show the heightmap ( grayscale ) shaded with atmospheric scattering, and set to blue below the water threshold. As incredible as it sounds, normal mapping or diffuse/specular textures are not in yet.
Cell noise
.. aka Voronoi diagrams. The standard implementation on the cpu uses a precomputed table containing N points, and when sampling a 3D coordinate, checking the 1 or 2 closest distances to each of the N points. The brute-force implementation is quite slow, but it's possible to optimize it by adding a lookup grid. Now, doing all of that on the GPU isn't easy, but fortunately there's a simpler alternative: procedurally generating the sample points on-the-fly.
The only thing needed is a 2D texture that contains random values from 0 to 1 in the red/green/blue/alpha channels; nothing else. We can then use a randomization function that takes 3D integer coordinates and returns a 4D random vector:
vec4 gpuGetCell3D(const in int x, const in int y, const in int z){ float u = (x + y * 31) / 256.0; float v = (z - x * 3) / 256.0; return(texture2D(cellRandTex, vec2(u, v)));}
The cellNoise function then samples the 27 adjacent cells around the sample point, generate a cell position in 3D given the cell coordinates, and get the distance to the sample point. Note that distances are squared until the last moment to save calculations:
vec2 gpuCellNoise3D(const in vec3 xyz){ int xi = int(floor(xyz.x)); int yi = int(floor(xyz.y)); int zi = int(floor(xyz.z)); float xf = xyz.x - float(xi); float yf = xyz.y - float(yi); float zf = xyz.z - float(zi); float dist1 = 9999999.0; float dist2 = 9999999.0; vec3 cell; for (int z = -1; z <= 1; z++) { for (int y = -1; y <= 1; y++) { for (int x = -1; x <= 1; x++) { cell = gpuGetCell3D(xi + x, yi + y, zi + z).xyz; cell.x += (float(x) - xf); cell.y += (float(y) - yf); cell.z += (float(z) - zf); float dist = dot(cell, cell); if (dist < dist1) { dist2 = dist1; dist1 = dist; } else if (dist < dist2) { dist2 = dist; } } } } return vec2(sqrt(dist1), sqrt(dist2));}
The two closest distances are returned, so you can use F1 and F2 functions ( ex.: F2 = value.y - value.x ). It's in 3D, which is perfect for planets, so seams won't be visible between planetary faces:
New planetary features
Using the cell noise and the GPU terrain generation, I'm now able to create new interesting planetary shapes and features. Have a look yourself:
Rivers
"Fake" rivers I'm afraid, as it's only using the ocean-level threshold and they don't flow from high altitudes to low altitudes, but it's better than nothing. When seen from orbit, there is some aliasing, so not all pixels of a river can be seen.
It's simply some cell noise with the input displaced by a fractal ( 4 octaves ):
Craters
I've started to experiment on craters. It's a variation of cell noise, with 2 differences: extinction ( a density value is passed to the function, which is used to kill a certain number of cells ), and instead of returning the distance, return a function of the distance. This function of distance is modeled to generate a circular, crater-like look.
Here's a quick experiment with 90% extinction. The inputs are also displaced with a small fractal:
And here's the result with a stronger displacement:
The next step is to add more octaves of crater noise:
It doesn't look too good yet, mostly because the craters at different octaves are just additively added and not combined properly. More experiments on that later.
Planetary experiments
When adding more octaves and combining different functions together, then adding back atmosphere scattering and the ocean threshold, the results start to look interesting. Keep in mind that all the following pictures are just the grayscale heightmap, and nothing else: no normal mapping or no texturing yet !