Increase calculation + render speed for terrain generation

Started by
5 comments, last by Steve_Segreto 11 years, 8 months ago
Hey,

im trying to make an terrain engine wich will generate a nice terrain randomly everytime the game starts. Im using a mix of perlin noise, midpoint displacement and voronoi. Im also using the erosion algrotihm from this paper: http://oddlabs.com/d..._generation.pdf

It works fine for pretty small terrains (128x128) but the loading time increases drastically when making larger terrains. I think that i need 4096x4096 at max, but this takes over 10 minutes with like 600MB Ram used to calculate this. If it is finally rendered it runs with like 120 fps due to a quadtree i am using.

I've looked into my code to find out what calculation is taking so long and it turns out that this is the order:

1. erosion (this takes i would guess 80% of the time, because it works with 3 for loops (from 0 to size)
2. vornoi (10%)
3. quadtree calculation (10%)

these values are only estimated. Now im curious if there are any better (faster) ways to do this.
If you need any code to examine the problem , please let me know!

EDIT:

for 1024x1024,
voronoi takes ~40seconds
quadtree calculation takes ~40seconds.

Thanks!
Advertisement
When you say that erosion "works with 3 for loops", do you mean nested loops? If so, then it's not surprising that it takes a while! It has to do 4096[sup]3[/sup] passes in the inner loop, which is a gigantic number! Otherwise, if they are not nested loops, I would suggest to use a profiler to find what's really using the CPU.
Yes they are nested loops. It wasn't surprising for me aswell, but maybe someone knows a better (faster) way to simulate the erosion effect, or even do this on the GPU.
If you can't throw more threads at it and can't optimize it, you may be out of luck and GPU might be your best option. Also, do you need 600MB of memory, really? Make sure you free stuff as soon as you no longer need it instead of letting it pile up and freeing it all in one go.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

So i was able to optimize some things. The long loading time was mostly because the boost multi array lib seems to be really slow in debug mode. Got my memory usage down to like 200-300MB and looking to get it down further more. Now im questioning my original implementation of the terrain. I am generating my terrain randomly, while saving the result in a heightMap like so:


for(int j=0; j< m_terrainHeight; j++)
{
for(int i=0; i< m_terrainWidth; i++)
{
index = (m_terrainHeight * j) + i;
m_heightMap[index].x = (float)i;
m_heightMap[index].y = (float)height.getPixel(i, j)*75.f;
m_heightMap[index].z = (float)j;
}
}


The height object should get out of scope after the generation and with it the memory used for this should get freed as well. After that im building up my vertex array like so:


m_vertexCount = (m_terrainWidth - 1) * (m_terrainHeight - 1) * 6;
// Create the vertex array.
m_vertices = new VertexType[m_vertexCount];
if(!m_vertices)
{
return false;
}

index = 0;
// Load the vertex array with the terrain data.
for(j=0; j<(m_terrainHeight-1); j++)
{
for(i=0; i<(m_terrainWidth-1); i++)
{
index1 = (m_terrainHeight * j) + i; // Bottom left.
index2 = (m_terrainHeight * j) + (i+1); // Bottom right.
index3 = (m_terrainHeight * (j+1)) + i; // Upper left.
index4 = (m_terrainHeight * (j+1)) + (i+1); // Upper right.
// Upper left.
tv = m_heightMap[index3].tv;
// Modify the texture coordinates to cover the top edge.
if(tv == 1.0f) { tv = 0.0f; }
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index3].x, m_heightMap[index3].y, m_heightMap[index3].z);
m_vertices[index].texture = D3DXVECTOR2(m_heightMap[index3].tu, tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index3].nx, m_heightMap[index3].ny, m_heightMap[index3].nz);
index++;
// Upper right.
tu = m_heightMap[index4].tu;
tv = m_heightMap[index4].tv;
// Modify the texture coordinates to cover the top and right edge.
if(tu == 0.0f) { tu = 1.0f; }
if(tv == 1.0f) { tv = 0.0f; }
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index4].x, m_heightMap[index4].y, m_heightMap[index4].z);
m_vertices[index].texture = D3DXVECTOR2(tu, tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index4].nx, m_heightMap[index4].ny, m_heightMap[index4].nz);
index++;
// Bottom left.
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index1].x, m_heightMap[index1].y, m_heightMap[index1].z);
m_vertices[index].texture = D3DXVECTOR2(m_heightMap[index1].tu, m_heightMap[index1].tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index1].nx, m_heightMap[index1].ny, m_heightMap[index1].nz);
index++;
// Bottom left.
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index1].x, m_heightMap[index1].y, m_heightMap[index1].z);
m_vertices[index].texture = D3DXVECTOR2(m_heightMap[index1].tu, m_heightMap[index1].tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index1].nx, m_heightMap[index1].ny, m_heightMap[index1].nz);
index++;
// Upper right.
tu = m_heightMap[index4].tu;
tv = m_heightMap[index4].tv;
// Modify the texture coordinates to cover the top and right edge.
if(tu == 0.0f) { tu = 1.0f; }
if(tv == 1.0f) { tv = 0.0f; }
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index4].x, m_heightMap[index4].y, m_heightMap[index4].z);
m_vertices[index].texture = D3DXVECTOR2(tu, tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index4].nx, m_heightMap[index4].ny, m_heightMap[index4].nz);
index++;
// Bottom right.
tu = m_heightMap[index2].tu;
// Modify the texture coordinates to cover the right edge.
if(tu == 0.0f) { tu = 1.0f; }
m_vertices[index].position = D3DXVECTOR3(m_heightMap[index2].x, m_heightMap[index2].y, m_heightMap[index2].z);
m_vertices[index].texture = D3DXVECTOR2(tu, m_heightMap[index2].tv);
m_vertices[index].normal = D3DXVECTOR3(m_heightMap[index2].nx, m_heightMap[index2].ny, m_heightMap[index2].nz);
index++;
}
}


This is the first thing where i am not entirely sure if this is the best way to do this. Using 6 vertices per quad seems a bit like overkill to me, but as you can see later the structure here is important for building up my quadtree and to calculate the texturecoordinates correctly. This code was partly taken from "http://www.rastertek.com/tutterr.html" especially the calculation of the texturecoordinates. I understand how the calculation is done, but what i am not quite sure of, is why we need to check them here again (is quite another topic, but maybe someone could explain this to me). So after that QuadTree is getting build. First of all it copys the content of the vertexarray for its usage (im deleting the original array after the quadtree is build up).

So basically every node in the tree gets its own buffer filled like this:


node->triangleCount = numTriangles;
// Calculate the number of vertices.
vertexCount = numTriangles * 3;
// Create the vertex array.
vertices = new VertexType[vertexCount];
// Create the index array.
indices = new unsigned long[vertexCount];
// Create the vertex array.
node->vertexArray = new VectorType[vertexCount];
// Initialize the index for this new vertex and index array.
index = 0;
// Go through all the triangles in the vertex list.
for(i=0; i<m_triangleCount; i++)
{
// If the triangle is inside this node then add it to the vertex array.
result = IsTriangleContained(i, positionX, positionZ, width);
if(result == true)
{
// Calculate the index into the terrain vertex list.
vertexIndex = i * 3;

// Get the three vertices of this triangle from the vertex list.
vertices[index].position = m_vertexList[vertexIndex].position;
vertices[index].texture = m_vertexList[vertexIndex].texture;
vertices[index].normal = m_vertexList[vertexIndex].normal;
indices[index] = index;
index++;
vertexIndex++;



vertices[index].position = m_vertexList[vertexIndex].position;
vertices[index].texture = m_vertexList[vertexIndex].texture;
vertices[index].normal = m_vertexList[vertexIndex].normal;
indices[index] = index;
index++;
vertexIndex++;

vertices[index].position = m_vertexList[vertexIndex].position;
vertices[index].texture = m_vertexList[vertexIndex].texture;
vertices[index].normal = m_vertexList[vertexIndex].normal;
indices[index] = index;
index++;
}
}


So IsTriangleContained(i, positionX, positionZ, width); just checks if the current triangle "i" is contained in current node of the tree so that it has to go into the buffer.
If it is contained the triangle is build easily because of the structure used in the vertexarray before.

Im pretty sure there is much more space to optimize this, if you see any chance to do just that, please let me know!. Thanks a lot.
Bump and an additional question:

What is the best way to handle the quadtree rendering: Using one large VertexBuffer for the whole terrain and creating an indexbuffer for each node?

What is the best way to handle the quadtree rendering: Using one large VertexBuffer for the whole terrain and creating an indexbuffer for each node


Here's something I did with DX9 and it provided very good performance for me.

I sub-divided the terrain into patches (say 32x32 quads each) and then use two vertex streams.
The first vertex stream has the (x,y,z) and (u,v) values for a single 32x32 quad patch. This is fed by a single static vertex buffer that never changes its values.
The second vertex stream gets fed by a dynamic vertex buffer and holds a single float for height and (x,y,z) for normal vector, the number of elements matches the first vertex stream (i.e. one patch).

At render time you simply instance your single patch geometry all over the screen to create the larger terrain, switch to second vertex stream, lock dynamic vb, update height array and normal array and render the two streams using a terrain patch vertex/pixel shader.

The beauty of this approach is that the index buffer and vertex buffer for the first stream is small and never changes, you don't need an index buffer for the 2nd stream and you can just lock-n-update the 2nd stream whenever you render a different terrain patch.

This topic is closed to new replies.

Advertisement