Halsafar

OpenGL Terrain + Vertex Buffers + Index buffer + LOD + Draw Calls


A long question with a likely long complicated answer. For context I am using OpenGL 3.3+.

I am already able to take a height field and generate a terrain. I've subdivided it into a quad tree structure for frustum culling. The method is simple and likely a performance killer. Each leaf in the quad tree has its own vertex buffer and index buffer (so one VAO per leaf in OpenGL). When I traverse the tree I end up with many draw calls (bad). There is no LOD yet.

What is the best approach here to gain some performance but also get LOD in there?

One approach would be to traverse the quad tree building a dynamic vertex buffer each frame. This will get me one draw call but seems awfully slow. It might solve problems with a giant list of vertices where I can queue up vertices up to some max, draw, then queue the rest and draw.

Another approach. I could use a global vertex buffer for the entire terrain (set once, reside on GPU). This single global vertex buffer seems problematic for large terrain. The height fields are very large, 2048x2048 for example. Each quad tree leaf can contain a set of index buffers, one for each LOD.
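
To put a rough number on the "problematic for large terrain" worry, here is a small sketch of the arithmetic. The 32-byte `TerrainVertex` layout and the 32-bit indices are assumptions made for illustration; the post does not say what the vertices contain.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical vertex layout (an assumption, not taken from the post).
struct TerrainVertex {
    float position[3];
    float normal[3];
    float uv[2];
};
static_assert(sizeof(TerrainVertex) == 32, "layout assumed to be 8 packed floats");

// Bytes needed for one vertex per height-field sample.
std::uint64_t vertexBufferBytes(std::uint64_t width, std::uint64_t height) {
    assert(width > 0 && height > 0);
    return width * height * sizeof(TerrainVertex);
}

// Bytes needed for a full-resolution triangle list with 32-bit indices
// (two triangles, i.e. six indices, per quad).
std::uint64_t indexBufferBytes(std::uint64_t width, std::uint64_t height) {
    assert(width > 1 && height > 1);
    return (width - 1) * (height - 1) * 6 * sizeof(std::uint32_t);
}
```

Under these assumptions a 2048x2048 height field needs 128 MiB of vertices and roughly 96 MiB of full-resolution indices, which is why the replies lean towards shared index buffers and small reusable grids.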

I believe each index buffer for each LOD is identical for each node. So I need to just store a small list of index buffers to select from when rendering a node.

This logic becomes circular as I try to choose between multiple draw calls and batching into fewer draw calls. Should this even be a concern?

I googled around a bunch and read some articles. There seem to be a ton of approaches, and I became rather confused about when to combine which methods. I'm hoping someone here can help me narrow things down and push me towards one method to try out.

Any articles you have found useful covering this stuff would help.


Thanks!

I'm currently looking at Geometry Clipmaps (Hoppe). It has a really nice, fast and intuitive LOD and frustum culling system. It uses vertex texture lookups and only a few draw calls per frame.

http://research.microsoft.com/en-us/um/people/hoppe/proj/gpugcm/

Just finished reading the paper. It does sound very impressive.

I was originally reading the following article and planning to implement it, but the one you present seems much better.
http://www.gamasutra.com/view/feature/1754/binary_triangle_trees_for_terrain_.php


On draw call count.

Having a lot of draw calls is not so bad in GL.... at least it was not when I stopped using it. Regardless of what performance might be... is performance enough?
If memory serves GL is (or was) very fast at issuing draw calls, the difference against D3D9 was incredible for short batches.
I'd just keep going for the time being, benchmark and see.

I currently use Direct3D 9, so batches can often be a problem for me (I have some Atoms around, and those are incredibly limiting). What I did is put every LOD level in the same buffer. That way, if two nearby tiles have the same LOD, I can merge two batches into one.
The method does not always work, and I'd say the performance benefit has been lower than expected. Drivers have also improved, and changing buffers is not much of a problem nowadays, even on old drivers (but I still save on buffer changes for the good of the Atoms).

On big terrains and reductions.

I'm not really sold on the dynamic buffer approach. Keep everything static first if possible.
I'm afraid it is necessary to write a second terrain system for large terrains. If the asset fits in memory, use a static, chunked mode. If it doesn't, use streaming geoclipmaps.
It is worth noting that partitioning an IB does not require actually splitting it into multiple IBs; just working on the draw call data will deliver the same result (although I might be misunderstanding what you wrote).
For example consider four chunks
AB
CD
If their indices are sequential and they end up in the same LOD bucket, then they can be drawn in a single call, where DrawCall(ABCD) can be trivially inferred.
It is also possible to do it the other way around and figure out DrawCall(A) from DrawCall(ABCD), but I'd prefer the former.
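
A minimal sketch of that merging step, assuming all chunks share one index buffer and each visible chunk is described by an offset and a count. `DrawRange` and `mergeAdjacent` are hypothetical names used for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One visible chunk's slice of the shared index buffer (hypothetical type).
struct DrawRange {
    std::uint32_t firstIndex; // offset into the shared index buffer, in indices
    std::uint32_t indexCount; // number of indices to draw
};

// Merges ranges that sit back to back in the index buffer, so that each
// resulting range can be submitted as a single draw call.
std::vector<DrawRange> mergeAdjacent(std::vector<DrawRange> ranges) {
    std::sort(ranges.begin(), ranges.end(),
              [](const DrawRange& a, const DrawRange& b) {
                  return a.firstIndex < b.firstIndex;
              });
    std::vector<DrawRange> merged;
    for (const DrawRange& r : ranges) {
        assert(r.indexCount > 0); // empty ranges are not expected here
        if (!merged.empty() &&
            merged.back().firstIndex + merged.back().indexCount == r.firstIndex) {
            merged.back().indexCount += r.indexCount; // extend the previous call
        } else {
            merged.push_back(r); // gap in the buffer: start a new call
        }
    }
    return merged;
}
```

If A, B, C and D are all visible and stored sequentially, this collapses them into one range; if B is culled, it yields one range for A and one for CD. Each merged range would then typically map to one `glDrawElements` call whose offset is `firstIndex` times the index size.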

Here's what I do.
1. My terrain is subdivided in patches.
2. I cache up to X patches in video memory (LRU cache).
3. My cache consists of a single VBO (with space for X patches) and a single IBO.
4. The trick is that the IBO contains every possible LOD combination (see below).
5. When rendering the terrain I bind the VBO/IBO only once and draw the terrain with the appropriate offsets (a single draw call per patch).
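
Steps 2 and 3 could be sketched roughly as follows. This is only one way to manage the slots of such a VBO, not necessarily how the poster does it; `PatchSlotCache` and `acquire` are hypothetical names, and the actual upload (for example with `glBufferSubData`) is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// Maps patch ids to fixed-size slots of one large vertex buffer and evicts
// the least recently used patch once every slot is taken.
class PatchSlotCache {
public:
    explicit PatchSlotCache(std::size_t slotCount) : capacity_(slotCount) {
        assert(slotCount > 0);
    }

    // Returns the slot for patchId. needsUpload is set to true when the
    // caller has to (re)write the vertex data of that slot.
    std::size_t acquire(std::uint32_t patchId, bool& needsUpload) {
        auto it = lookup_.find(patchId);
        if (it != lookup_.end()) {
            // Cache hit: mark the entry as most recently used.
            order_.splice(order_.begin(), order_, it->second);
            needsUpload = false;
            return it->second->slot;
        }
        needsUpload = true;
        std::size_t slot;
        if (order_.size() < capacity_) {
            slot = order_.size();          // next slot that was never used
        } else {
            slot = order_.back().slot;     // recycle the least recently used slot
            lookup_.erase(order_.back().patchId);
            order_.pop_back();
        }
        order_.push_front(Entry{patchId, slot});
        lookup_[patchId] = order_.begin();
        return slot;
    }

private:
    struct Entry {
        std::uint32_t patchId;
        std::size_t slot;
    };
    std::size_t capacity_;
    std::list<Entry> order_; // front = most recently used
    std::unordered_map<std::uint32_t, std::list<Entry>::iterator> lookup_;
};
```

The byte offset of a patch inside the VBO is then simply its slot number multiplied by the size of one patch's vertex data.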

I use only a simple LOD scheme that halves the resolution of a patch per level (like mipmapping). The trick is that you need only a single VBO holding all vertices, while the indices skip more and more vertices for lower LOD levels.

Example:
Lod-0
0 1 2 3 ..
4 5 6 7 ..
8 9 a b ..
c d e f ..
Lod-1
0 2 ..
8 a ..

Lod-2
0 ...
...
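
The index-skipping scheme above can be sketched like this, assuming a square patch stored row by row with 2^k + 1 vertices per side. `buildLodIndices` is a hypothetical helper; it ignores the seams between patches of different LOD, which the next paragraph deals with.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds a triangle-list index buffer for one LOD level of a square patch.
// verticesPerSide is assumed to be 2^k + 1; lod 0 is full detail and every
// further level doubles the distance between the vertices that are used.
std::vector<std::uint32_t> buildLodIndices(std::uint32_t verticesPerSide,
                                           std::uint32_t lod) {
    const std::uint32_t step = 1u << lod;            // vertex stride of this level
    const std::uint32_t quads = verticesPerSide - 1; // full-detail quads per side
    assert(quads % step == 0);                       // stride must divide the patch

    std::vector<std::uint32_t> indices;
    for (std::uint32_t y = 0; y < quads; y += step) {
        for (std::uint32_t x = 0; x < quads; x += step) {
            const std::uint32_t i00 = y * verticesPerSide + x;      // top left
            const std::uint32_t i10 = i00 + step;                   // top right
            const std::uint32_t i01 = i00 + step * verticesPerSide; // bottom left
            const std::uint32_t i11 = i01 + step;                   // bottom right
            // Two triangles per (possibly enlarged) quad.
            indices.insert(indices.end(), {i00, i01, i10, i10, i01, i11});
        }
    }
    return indices;
}
```

For a 5x5 patch this gives 96 indices at LOD 0, 24 at LOD 1 and 6 at LOD 2, all referring to the same vertex buffer. The winding order chosen here is arbitrary and would need to match the face-culling setup.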


Then I pre-calculate the possible LOD combinations. For example, when the center patch has a LOD of 0 and the surrounding patches (n, e, s, w) have a LOD of 1, you need to adjust your indices to get a proper seam between the patches. Let's say you have up to 4 LOD levels (going too low will kill the shape of the terrain); that's 2 bits per patch. With 5 patches (center, n, e, s, w) you need 10 bits, which gives 1024 different patch setups.
The index of the correct patch setup is easily calculated from the LOD levels of the current and surrounding patches:

LC = LOD level of current patch
LN = LOD level of north patch
LE = LOD level of east patch
LS = LOD level of south patch
LW = LOD level of west patch
int IBO_lookup_index = (LN << 8) | (LE << 6) | (LS << 4) | (LW << 2) | LC;
int IBO_offset = IBO_LOOK_UP_TABEL[IBO_lookup_index];
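
As a small self-contained version of the packing above, the following sketch uses the same bit layout; `iboLookupIndex` is a hypothetical function name, and how the 1024-entry offset table is filled is not covered here.

```cpp
#include <cassert>
#include <cstdint>

// Packs five 2-bit LOD levels (0..3) into a 10-bit table index, using the
// bit layout from the post: north, east, south, west, then the patch itself.
std::uint32_t iboLookupIndex(std::uint32_t ln, std::uint32_t le,
                             std::uint32_t ls, std::uint32_t lw,
                             std::uint32_t lc) {
    assert(ln < 4 && le < 4 && ls < 4 && lw < 4 && lc < 4);
    return (ln << 8) | (le << 6) | (ls << 4) | (lw << 2) | lc;
}
```

A center patch at LOD 0 surrounded by four LOD 1 neighbours maps to index 340, and the largest possible index is 1023, so a table with 1024 entries covers every combination.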

My implementation of terrain rendering is to not use any fancy tricks -- the code is very simple.

I have ONE vertex buffer of 66x66 (the extra two are for a single overlapped region on each side of the grid)
and ONE index buffer.

Then, you create a single-component float vertex buffer that is also 66x66. When the viewer moves far enough, I update the height buffers (which takes about 0.007 seconds for all of them).

I lay out the drawing of the grid in rings of differing scale: the first ring has scale 1, the next 2, then 4, then 8, and so on. This way, the maximum number of draw calls is very small. In my terrain it takes 9 to 22 draw calls per frame to draw the entire terrain.
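
To get a feel for how quickly doubling rings cover a view distance, here is a small sketch. It assumes ring 0 is centred on the viewer and that each further ring doubles the covered half-extent; `ringCount` and its parameters are hypothetical, and the linked article may arrange its rings differently.

```cpp
#include <cassert>

// Returns how many rings are needed until the outermost ring reaches
// viewDistance, when ring 0 covers baseHalfExtent world units around the
// viewer and every further ring doubles the scale (1, 2, 4, 8, ...).
int ringCount(float baseHalfExtent, float viewDistance) {
    assert(baseHalfExtent > 0.0f);
    int rings = 1;
    float halfExtent = baseHalfExtent;
    while (halfExtent < viewDistance) {
        halfExtent *= 2.0f; // the next ring uses twice the grid spacing
        ++rings;
    }
    return rings;
}
```

With a 64-quad grid of unit spacing (half-extent 32), eight rings are enough to reach a distance of 4096 units, which is in line with the small draw call counts mentioned above.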

Check out the article at

http://nolimitsdesigns.com/game-design/terrain-how-to/

I've refactored our terrain system a few times here, and the latest version takes its design from the texture swizzling that goes on behind the scenes to make textures cache-coherent on GPUs:

Each letter represents a cell out of the terrain, in my case a 6x6 block of quads

Each block is frustum culled and rendered. However, before it is rendered, its indices are compared with those of the previous node in the block to see if they connect, so that the draw calls can be merged:


A - B
  /
C - D


Also, this data structure is recursive in nature, which maps to a quad tree directly.

A typical terrain only ends up being between 5 and 15 draw calls (with no state changes so it will remain fast), and culls a large portion of the off-screen triangles, and only needs fine-grained testing for cells that touch the edges of the screen.
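
The A, B, C, D ordering in the diagram reads like a Z-order (Morton) layout, which is also what GPU texture swizzling commonly resembles. Under that assumption, which the post does not state outright, a cell's position in the ordering can be computed by interleaving the bits of its coordinates. `spreadBits` and `mortonIndex` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Spreads the low 16 bits of v so that a zero bit sits between each of them.
std::uint32_t spreadBits(std::uint32_t v) {
    assert(v <= 0xFFFFu);
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Z-order (Morton) index of the cell in column x, row y:
// x occupies the even bits, y the odd bits.
std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}
```

Cells A, B, C and D at (0,0), (1,0), (0,1) and (1,1) get indices 0 to 3, and each aligned 2x2 group of such blocks repeats the pattern one level up, which is the recursive quad tree mapping mentioned above. If the cells' index data is stored in this order, visible cells with consecutive Morton indices are contiguous in the buffer and can share a draw call.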

The same system is used to render dynamic lights (in the forward rendered version of the engine) without rendering the whole terrain, in a fairly efficient manner.

My technique uses a static mesh and vertex texture fetch (VTF) for height displacement.
Because there is no mesh generation etc. there is no "speed limit" on the camera.
The camera can jump from anywhere to anywhere instantly.
And I can increase the LOD level during camera zoom.
Mesh generators have to either zoom in to a low poly view or wait to generate a high poly view.
