GPU Terrain generation, cell noise, rivers, craters

Published October 25, 2008
GPU Planetary Generation

Motivation

Until now, the planetary generation algorithm was running on the CPU synchronously. This means that each time the camera zoomed in on the surface of the planet, each terrain node was getting split into 4 children, and a heightmap was generated synchronously for each child.

Synchronous generation means that rendering is paused until the data is generated for each child node. We're talking about 10-20 milliseconds here, so it's not that slow; but since 4 children are generated at a time, those numbers are always multiplied by 4, so the cost is around 40-80 ms per node that gets split. Unfortunately, splits happen in cascades, so it's not rare to have no split at all for a whole second and then suddenly 2 or 3 nodes get split, resulting in a pause of hundreds of milliseconds in the rendering.

I've addressed this issue by adding asynchronous CPU terrain generation: a thread generates data at its own rhythm, and the rendering isn't affected too harshly anymore. This required introducing new concepts and new interfaces ( like a data generation interface ) into the engine, which took many weeks.

After that, I prepared a new data generation interface that uses the GPU instead of the CPU. In short, I encountered a lot of practical issues with it, like PBOs ( pixel buffer objects ) not behaving as expected on some video cards, or the lack of a synchronization extension on ATI cards ( I ended up using occlusion queries with an empty query to know when a texture has been rendered ), but now it's more or less working.

Benefits

There are a lot of advantages to generating data on the GPU instead of the CPU. The main one is that, thanks to the higher performance, I will now be able to generate normal maps for the terrain, which was too slow before. This will increase lighting and texturing accuracy, and make planets ( especially when seen from orbit ) much nicer. Until now, planets seen from space didn't look too good due to per-vertex texturing; noise and clouds helped to hide the problem a bit, but if you look carefully at the old screenshots, you'll see what I mean.

The second advantage is that I can increase the complexity of the generation algorithm itself, and introduce new noise basis types, in particular cell noise ( see Voronoi diagrams on Wikipedia ).

Another advantage is iteration time. Previously, playing with planetary algorithms and parameters took a lot of time: change some parameters, recompile, launch the client, spawn a planet, move the camera around the planet to see how it looks, rinse and repeat. Now I can just change the shader code, it gets automatically reloaded by the engine and the planet updates on-the-fly: no need to quit the client and recompile. It's a lot easier to play with new planets, experiment, change parameters, etc.

I'm not generating normal maps yet ( I will probably work on that next week ), and there's no texturing; all the planet pictures below only show the heightmap ( grayscale ) shaded with atmospheric scattering, and set to blue below the water threshold. As incredible as it sounds, neither normal mapping nor diffuse/specular textures are in yet.

Cell noise

... aka Voronoi diagrams. The standard CPU implementation uses a precomputed table containing N points; when sampling a 3D coordinate, it finds the 1 or 2 closest of those N points and returns the distances. The brute-force implementation is quite slow, but it's possible to optimize it by adding a lookup grid. Doing all of that on the GPU isn't easy, but fortunately there's a simpler alternative: generating the sample points procedurally on-the-fly.

The only thing needed is a 2D texture that contains random values from 0 to 1 in the red/green/blue/alpha channels; nothing else. We can then use a randomization function that takes 3D integer coordinates and returns a 4D random vector:

vec4 gpuGetCell3D(const in int x, const in int y, const in int z)
{
	float u = (x + y * 31) / 256.0;
	float v = (z - x * 3) / 256.0;
	return(texture2D(cellRandTex, vec2(u, v)));
}


The cellNoise function then samples the 27 adjacent cells around the sample point, generates a 3D cell position from each cell's coordinates, and computes the distance to the sample point. Note that distances are kept squared until the last moment to save calculations:

vec2 gpuCellNoise3D(const in vec3 xyz)
{
	int xi = int(floor(xyz.x));
	int yi = int(floor(xyz.y));
	int zi = int(floor(xyz.z));
	float xf = xyz.x - float(xi);
	float yf = xyz.y - float(yi);
	float zf = xyz.z - float(zi);
	float dist1 = 9999999.0;
	float dist2 = 9999999.0;
	vec3 cell;
	for (int z = -1; z <= 1; z++)
	{
		for (int y = -1; y <= 1; y++)
		{
			for (int x = -1; x <= 1; x++)
			{
				cell = gpuGetCell3D(xi + x, yi + y, zi + z).xyz;
				cell.x += (float(x) - xf);
				cell.y += (float(y) - yf);
				cell.z += (float(z) - zf);
				float dist = dot(cell, cell);
				if (dist < dist1)
				{
					dist2 = dist1;
					dist1 = dist;
				}
				else if (dist < dist2)
				{
					dist2 = dist;
				}
			}
		}
	}
	return vec2(sqrt(dist1), sqrt(dist2));
}


The two closest distances are returned, so you can build F1- and F2-based functions ( e.g. F2 - F1 = value.y - value.x ). It's in 3D, which is perfect for planets, so seams won't be visible between planetary faces:

gpugen4_med.jpg
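As a small illustration ( not from the original post; position and frequency are just placeholder names ), the two returned distances can be combined into the usual cell-noise variants:

// Hypothetical usage of gpuCellNoise3D: F1, F2 and the classic F2 - F1 combination.
vec2 f = gpuCellNoise3D(position * frequency);
float f1 = f.x;          // distance to the closest cell point
float f2 = f.y;          // distance to the second closest cell point
float borders = f2 - f1; // close to 0 along cell boundaries, larger inside the cells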

New planetary features

Using the cell noise and the GPU terrain generation, I'm now able to create new interesting planetary shapes and features. Have a look yourself:

Rivers

"Fake" rivers I'm afraid, as it's only using the ocean-level threshold and they don't flow from high altitudes to low altitudes, but it's better than nothing. When seen from orbit, there is some aliasing, so not all pixels of a river can be seen.

It's simply some cell noise with the input displaced by a 4-octave fractal.
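In shader terms, the idea is roughly the following ( a sketch, not the actual planet shader; gpuFbm3D is a hypothetical fractal helper and riverWidth an assumed parameter ):

// Sketch: cell noise whose input is warped by a 4-octave fractal, then thresholded into thin channels.
float gpuRiverMask3D(const in vec3 xyz, const in float riverWidth)
{
	// Hypothetical fBm helper; any 4-octave fractal noise will do for the displacement.
	vec3 warp = vec3(gpuFbm3D(xyz, 4),
	                 gpuFbm3D(xyz + vec3(17.3, 9.1, 4.7), 4),
	                 gpuFbm3D(xyz + vec3(3.9, 27.5, 11.2), 4));
	vec2 f = gpuCellNoise3D(xyz + warp);
	float border = f.y - f.x;                          // goes to 0 on cell boundaries
	return 1.0 - smoothstep(0.0, riverWidth, border);  // ~1 along the boundaries: the river network
}

A mask like this can then be used to push the height below the ocean-level threshold, which is how the channels end up rendered as water: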

gpugen1_med.jpg

gpugen2_med.jpg

gpugen3_med.jpg

Craters

I've started to experiment with craters. It's a variation of cell noise, with 2 differences: extinction ( a density value is passed to the function and is used to kill a certain number of cells ), and instead of returning the raw distance, it returns a function of the distance, modeled to generate a circular, crater-like look.
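A rough sketch of that idea ( my own guess at the profile function, not the original shader; here 'density' is the fraction of cells that survive the extinction ):

// Hypothetical crater variant of gpuCellNoise3D: some cells are killed ( extinction ),
// and the distance to each surviving cell point is remapped into a bowl + rim profile.
float gpuCraterNoise3D(const in vec3 xyz, const in float density)
{
	int xi = int(floor(xyz.x));
	int yi = int(floor(xyz.y));
	int zi = int(floor(xyz.z));
	vec3 fpos = xyz - vec3(float(xi), float(yi), float(zi));
	float height = 0.0;
	for (int z = -1; z <= 1; z++)
	for (int y = -1; y <= 1; y++)
	for (int x = -1; x <= 1; x++)
	{
		vec4 rnd = gpuGetCell3D(xi + x, yi + y, zi + z);
		if (rnd.w > density)
			continue;               // extinction: this cell doesn't spawn a crater
		vec3 cell = rnd.xyz + vec3(float(x), float(y), float(z)) - fpos;
		float d = length(cell);
		// Assumed profile: a depression in the middle, a raised rim around it, flat outside.
		float bowl = smoothstep(0.0, 0.4, d) - 1.0;
		float rim  = 0.5 * smoothstep(0.2, 0.4, d) * (1.0 - smoothstep(0.4, 0.6, d));
		height += bowl + rim;
	}
	return height;
}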

Here's a quick experiment with 90% extinction. The inputs are also displaced with a small fractal:

gpugen6_med.jpg

And here's the result with a stronger displacement:

gpugen5_med.jpg

The next step is to add more octaves of crater noise:

gpugen7_med.jpg

It doesn't look too good yet, mostly because the craters at different octaves are just added together and not combined properly. More experiments on that later.
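For reference, that naive additive combination amounts to something like this ( a sketch reusing the hypothetical gpuCraterNoise3D from above; baseFrequency, numOctaves, density and position are assumed parameters ):

// Naive additive octave sum: each octave of crater noise is simply added in,
// which is the "not combined properly" behaviour described above.
float craters = 0.0;
float freq = baseFrequency;
float amp  = 1.0;
for (int i = 0; i < numOctaves; i++)
{
	craters += amp * gpuCraterNoise3D(position * freq, density);
	freq *= 2.0;
	amp  *= 0.5;
}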

Planetary experiments

When adding more octaves and combining different functions together, then adding back atmospheric scattering and the ocean threshold, the results start to look interesting. Keep in mind that all the following pictures are just the grayscale heightmap, and nothing else: no normal mapping or texturing yet!

gpugen8_med.jpg

gpugen9_med.jpg

gpugen10_med.jpg

gpugen11_med.jpg

gpugen12_med.jpg

gpugen13_med.jpg

gpugen14_med.jpg

gpugen15_med.jpg

Comments

Twisol
Stunning, as always. :D
October 25, 2008 05:49 PM
rip-off
I would be happy if I could produce a single image that cool, let alone a truckload of them...
October 25, 2008 08:02 PM
johnhattan
I rather like the one with big round patches floating like clouds. It has sort of an unusual otherworldly appearance.
October 25, 2008 08:09 PM
thomastc
Pretty cool stuff! I'm surprised that it's still fast enough, with 27 texture samplings per heightmap pixel... on the other hand, what do I know about GPU programming :)

What does your crater function look like, currently? I'm asking because the sea seems to flow around the craters, which means the centre of the crater is higher than the land around it. I guess part of the function (i.e. in the centre of the crater) should be negative, then a positive border, then zero outside the crater area, or something like that. But maybe that's what you meant with not combining them properly yet...

Edit: I was also wondering how you do the synchronization, with D3D not being thread-safe. Or is the GPU terrain generation synchronous again?
October 26, 2008 12:10 AM
AndyPandyV2
Nice stuff Ysaneya.

Have you run into any precision problems when you get near the surface of the planet? If I'm not mistaken, once you get down near 20 octaves (assuming amplitude drops by 1/2 each time) you're nearing values that a float can't really represent.
October 26, 2008 12:38 AM
Ysaneya
thomastc: Yep, 27 texture samples but that's per octave, so in total there can be a lot more than that. I haven't checked the complexity of the shaders in assembly yet, but I'm pretty sure they'll be thousands of instructions.

You are right on the crater function, that's why it tends to create plateaus, I will fix it later.

The GPU terrain generation is not synchronous in relation to the CPU, but it is on the GPU (meaning it's always the main thread that sends GPU commands) so there's no problem. I'm using OpenGL btw.

AndyPandyV2: Yes, those precision problems can be seen as strange random noise when you zoom very close to the planet. On an Earth-like planet (6350 km radius), it seems to be around half a meter. I'm not sure yet if it'll be a problem, or if I'll go back to geometry on CPU async + normal maps on GPU (there are no precision problems on the CPU version).
October 26, 2008 06:10 AM
thomastc
I see, thanks for the response! (I thought you wrote about getting fed up with OpenGL and switching to D3D for this project a couple years ago, but I might be mistaken.)

As to the float precision... a float has a precision of slightly more than 7 decimal digits. On a radius of 6350 km (approximately) this is indeed around half a metre. However, maybe you could interpret the heightmap as an offset from a certain "base radius" of 6350 km. Considering the height of the tallest mountain on Earth is about 8 km, this gives you a precision of around 1 mm... that should be plenty accurate enough. Even if you want planets with mountains of hundreds of kilometres high.

However, this would only allow you to get a greater precision in the heightmap itself. The GPU probably does everything still in 32 bits, so simply adding 6350 km to the heightmap in a vertex shader might not work. Maybe you can get around this somehow? For example, pretending that the planet is flat as long as the camera is close to it? (When the camera is not close, a half-metre inaccuracy will not be a problem anyway.)
October 26, 2008 01:26 PM
Ysaneya
You are correct about the precision of altitude, and it's already calculated relative to sea level. The problem here isn't the precision of the altitude, but the precision of the surface coordinates. The noise often goes into the 15-20 octave range, so even starting with low frequencies, at the last octaves you hit floating-point precision limits.
October 26, 2008 02:08 PM
bladerunner627
Amazing stuff, it's not often you see 3d renderings as sexy as these. The amount of detail you put into this is mind blowing.
October 29, 2008 01:22 AM
petrocket
Amazing. My biggest question is - if you're doing the terrain generation on the GPU, how will you handle physics and collision detection per frame? Do you have a rough approximation of the mesh on the CPU side that you use for collision detection?
October 29, 2008 12:39 PM
ildave1
In your code, how are you using that 'in' keyword?

vec4 gpuGetCell3D(const in int x, ...)
October 30, 2008 08:38 PM
Ysaneya
Quote:Original post by petrocket
Amazing. My biggest question is - if you're doing the terrain generation on the GPU, how will you handle physics and collision detection per frame? Do you have a rough approximation of the mesh on the CPU side that you use for collision detection?


That's why I mentioned occlusion queries used as fences: the geometry is generated on the GPU and asynchronously downloaded to the CPU (when it wants to work, thanks to NVidia and ATI :p) once the GPU has processed the buffer. Then on the CPU the data is used to recreate the mesh and perform other calculations.
October 31, 2008 04:56 AM
toneburst
Hiya!

First of all, let me say your work looks amazing! Very impressive stuff!!

I wonder if you could spare a minute or so to help out a GLSL shader newbie.

I'm attempting to get your GPU Voronoi shader working. I have a 256x256px RGBA texture with random 0.0 to 1.0 values in each of the four channels.

I've then set up a very simple Fragment Shader main loop:

void main()
{
	vec2 voronoi = gpuCellNoise3D(gl_TexCoord[0].xyz);
	
	//Multiply color by texture
	gl_FragColor = vec4(vec3(voronoi[0]),1.0);
}


that calls your gpuCellNoise3D function.

Unfortunately, I'm just getting a very regular-looking pattern, with one large square cell, not the one I see in your screenshot. I'm sure there's something obvious I'm missing. I'm not trying anything as ambitious as creating procedurally-textured planets, but I'd love to be able to generate a Voronoi surface pattern on a flat billboard.

Any pointers very gratefully accepted.

Cheers,

a|x
http://machinesdontcare.wordpress.com
December 09, 2008 11:49 AM
toneburst
Quote:Original post by ildave1
In your code, how are you using that 'in' keyword?

vec4 gpuGetCell3D(const in int x, ...)


Hiya,

I'm not the author of the post, but I have done little bits and pieces of GLSL. The 'in' keyword is used to indicate variables that go into a function, but will not be returned by it. You can also use 'out' and 'inout'.

The code should work fine without the 'in's, in this case.

Hope this helps,

a|x
http://machinesdontcare.wordpress.com

December 10, 2008 04:24 AM
toneburst
Hi,

I've got it working now. I think I must have made some kind of typo somewhere in my code, or something.

I wonder if you could clarify one little point for me though:

Why do you multiply by 31 and 3 in the gpuGetCell3D function? It seems to work without this multiplication.

Sorry if this is a stupid question.

Cheers,

a|x
December 11, 2008 05:08 AM
Ysaneya
Quote:Original post by toneburst
Why do you multiply by 31 and 3 in the gpuGetCell3D function? It seems to work without this multiplication.


Those constants are used to avoid repeating patterns in the noise, i.e. to randomize the inputs a bit. You can use different constants (or even none) but then you have to experiment at higher frequencies to make sure you don't see any visible pattern.
December 11, 2008 03:59 PM