A good heightmap LOD technique (no VTF please)

Started by
27 comments, last by skytiger 12 years, 2 months ago

[quote name='dave j' timestamp='1328905598' post='4911778']
The original geometry clipmaps paper didn't use VTF.
It's more work on the CPU though - so it still might not be usable.

There's a footnote on that page with a link to here and an explanation that they were able to move pretty much all the work to the GPU.
[/quote]


I know - but that requires VTF, which the OP doesn't want to use. Hence the link to the original paper.

[quote name='Cornstalks' timestamp='1328907181' post='4911784']
[quote name='dave j' timestamp='1328905598' post='4911778']
The original geometry clipmaps paper didn't use VTF.
It's more work on the CPU though - so it still might not be usable.

There's a footnote on that page with a link to here and an explanation that they were able to move pretty much all the work to the GPU.
[/quote]


I know - but that requires VTF, which the OP doesn't want to use. Hence the link to the original paper.
[/quote]
You're totally right. This is where doing a quick text search for "VTF" in a paper, and assuming that since it doesn't contain "VTF" it must not use VTF, turns out to be a stupid assumption. Thanks.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
There is also CDLOD.
I'm not quite sold on that. By the way, the author presents it as "Continuous Distance-Dependent Level of Detail for Rendering Heightmaps", but at its core it's just an interpolated discrete LOD. Both geomipmapping and geoclipmapping feature some kind of interpolation: that does not make it a continuous method.
Perhaps I am misunderstanding it in a big way, but...
Let me quote:
[quote]
{1} Using CDLOD, each vertex is morphed individually based on its own LOD metric, unlike the method in [Ulrich 02], where the morph is performed per-node (per-chunk).
...
{2} First, the approximate distance between the observer and the vertex is calculated to determine the amount of morph required
...
{3} Finally, the z-component is obtained by sampling the heightmap with texture coordinates calculated from x and y components using a bilinear filter (filtering is only needed for vertices in the morph region). When all vertices of a node are morphed to this low-detail state, the mesh then effectively contains four times fewer triangles and exactly matches the one from the lower LOD layer; hence it can be seamlessly replaced by it.
...
{4} Settings used to generate the quadtree and run the algorithm need to be carefully chosen to match terrain dataset characteristics while providing the best performance. In the accompanying examples, each dataset defines its own settings.
...
{5} Since the LOD works in three dimensions, this problem will be enhanced when using extremely rough terrain with large height differences: thus, different settings might be required for each dataset.
In the provided data examples, LOD settings are tuned so that the ranges are always acceptable...
...
{6} Two example projects are provided: BasicCDLOD and StreamingCDLOD. Both projects are written in C++, using DirectX9 and HLSL, and should work on most GPUs that support vertex shader texture sampling, i.e., ones supporting Shader Model 3.0
...
{7} The performance bottleneck on the GPU is either the vertex texture fetch cost (used for displacement of terrain vertices), or the triangle rasterization and pixel shader cost. This mostly depends on the settings, GPU hardware, and display resolution.
[/quote]

  1. Is he actually suggesting geoclipmapping is bad? Or limited? This is a bit odd to me. Nyquist theorem, anyone?
  2. I seriously hope this is not going to happen on the CPU! The only possible way to do it is VTF.
  3. Please note how he's basically admitting that the algorithm interpolates across discrete states. Having worked on a method based on similar vertex displacement, I am surprised it does not produce consistent wobble.
  4. I don't know what he means by "carefully". I am against anything that requires careful tuning. Note: in my opinion, neither geomipmapping nor geoclipmapping requires careful tuning (although clipmapping requires some tuning, and streaming needs tuning as well).
  5. Note that similar issues happen when using geomipmapping (probably because it's doing something very similar?).
  6. Geomipmapping runs on everything. If SM3 is required anyway, I suppose I might just bite the bullet and use clipmapping.
  7. Sure, it would have been nice to post the shaders. A simple fill won't give anyone trouble.

Anyway, I downloaded the 210 MiB required to look at the system. It required me to build the data set on my system. The non-interactive program uses a single core and took about 2 minutes to run. It then took a few minutes (I'd say close to 15) at runtime to start. Again, single-core. The program used about 210 MiB of RAM to produce around 1 GiB of data. When running, it takes about 180 MiB.
The method has been "tuned" so most triangles are about... I'd say 4 pixels wide.
Basically, what it shows is that the algorithm converges to a correct result... at an extreme cost.
I'd take it with a grain of salt.

I look forward to tuning this iteratively!

Previously "Krohm"

[quote]at its core it's just an interpolated discrete LOD. Both geomipmapping and geoclipmapping feature some kind of interpolation: that does not make it a continuous method.[/quote]


There's a big difference, though: the interpolation is seamless and continuous (thus the name). One segment is morphed completely into another before it's swapped, with no seams between LOD levels. Geoclipmaps, geomipmaps, and ChunkedLOD have T-junction seams between LOD levels and require "stitching strips" on the one hand; on the other hand, their morphing only works (reasonably) well for heights, and you're still swapping between two different meshes, so there's inevitable popping if you're vertex fetching and trying to interpolate other data types (such as normals).
(Another difference is that CDLOD interpolates completely predictably, based on the 3D distance between the vertex and the camera, unlike geoclipmaps/ChunkedLOD.)


[quote]I seriously hope this is not going to happen on CPU! The only possible way to do so is VTF.[/quote]

No, of course vertex texture fetching is used; read the paper.


[quote]Please note how he's basically admitting the algorithm to be interpolated across discrete states. Having worked on a method that was based on displacing vertices similarly, I am surprised it does not produce consistent wobble.[/quote]

It doesn't wobble because it's not morphing it in a simplistic way: all morph levels are pre-generated to avoid that. (Seriously, try reading the paper? :) )

I don't understand what would satisfy your definition of a continuous algorithm. ROAM, maybe? Whatever you do, it still has to be discretized into render calls, unless you can store the entire heightmap, normal map, and other data in one big buffer and render from it using tessellation or something similar.


[quote]
Anyway, I downloaded the 210 MiB required to look at the system. It required me to build the data set on my system. The non-interactive program uses a single core and took about 2 minutes to run. It then took a few minutes (I'd say close to 15) at runtime to start. Again, single-core. The program used about 210 MiB of RAM to produce around 1 GiB of data. When running, it takes about 180 MiB.

The method has been "tuned" so most triangles are about... I'd say 4 pixels wide.
[/quote]

Well, if the triangles are 4 pixels wide on your device, then why not try dropping the quality level? Your hardware is probably not the "target audience" (for example, on my monitor they look around 10-ish pixels wide, probably with the same dataset): there's a button on the HUD saying "Render grid size:"; drop that one notch and reduce the viewing distance. If you want to save on memory, drop the HeightmapLODOffset and NormalmapLODOffset by one; that should reduce memory use to approximately 30-40% of the current amount. Also, the demo you mention uses a texture splatting technique that must take at least 10 MiB of RAM without any terrain data loaded; remove that and you've saved some more.

[quote name='Lee Stripp' timestamp='1328927490' post='4911858']
Great paper. Has anyone thought of using disc shapes instead of grids and just scaling with view height?
[/quote]

Hi Lee :) , the point of using a grid is its simplicity... How would you implement something similar to the L-shaped strip that is used to reduce the flickering of vertices?

I was thinking simple: build the disc shape in advance; the natural expanding of the shape itself creates LOD (but do this with some care), then split it into sections for frustum culling. I was just making an observation; I haven't done any tests on that at all.

Cheers

Lee

Code is Life
C/C++, ObjC Programmer, 3D Artist, Media Production

Website : www.leestripp.com
Skype : lee.stripp@bigpond.com
Gmail : leestripp@gmail.com

The CDLOD paper was interesting, BUT
he makes no mention of what data is transferred to the GPU, how, and when.
I can't believe he is transferring a screenful every frame?
My current sketch for VTF-less terrain works by loading all the potentially visible vertices onto the GPU,
using a vertex buffer managed as a heap of blocks, with deallocated blocks waiting 3 frames before being made available for reallocation (to save buffer space on the GPU).
The index buffer is tiny and just contains a generic pattern for each LOD level (draw with a vertex offset to address the correct block of vertices).
This means I only send terrain data to the GPU when it becomes potentially visible
(using a simple modulo grid, which is also used for frustum culling: just project the frustum into "grid space" and you have a list of visible blocks; essentially a spatial index query by rasterization!)
(and changing the LOD level of a block just comes down to changing the index buffer start offset).
The downside is a large number of draw calls, which I don't think is a real problem as there are no resource changes between them.
Given the choice of 50 draw calls versus sending 100,000 vertices... (too lazy to actually profile that right now).
Here is an idea that is simple to code and highly performant:

1) Create a static vertex buffer with the same layout as the height texture (scan lines of texels become contiguous vertices).
2) Create a static index buffer containing a disk split into 8 segments (so each disk can be rendered by setting the index buffer offset and count); this disk has a concentric LOD pattern.
3) Now you can move the disk around the heightmap by calculating vertex offset = pos.x + pos.z * stride and using 1 draw call for each section of the disk that is visible.

For a square terrain of 1024 texels and a vertex size of 16 bytes you will need 16 MiB; for 2048, 64 MiB.

You can use ushort4 for position and ushort4 for the normal, leaving 3 ushorts spare for other things... texture coordinates are derived from position.

This is very similar to my VTF terrain, but I am trading VTF for more video memory.

[quote]
Here is an idea that is simple to code and highly performant:

1) Create a static vertex buffer with the same layout as the height texture (scan lines of texels become contiguous vertices).
2) Create a static index buffer containing a disk split into 8 segments (so each disk can be rendered by setting the index buffer offset and count); this disk has a concentric LOD pattern.
3) Now you can move the disk around the heightmap by calculating vertex offset = pos.x + pos.z * stride and using 1 draw call for each section of the disk that is visible.

For a square terrain of 1024 texels and a vertex size of 16 bytes you will need 16 MiB; for 2048, 64 MiB.

You can use ushort4 for position and ushort4 for the normal, leaving 3 ushorts spare for other things... texture coordinates are derived from position.

This is very similar to my VTF terrain, but I am trading VTF for more video memory.
[/quote]


That is exactly what [s]geomipmapping[/s] geoclipmapping is. You don't mention how you update the vertex heights; in the original algorithm, heights were calculated on the CPU, and that was later moved to the GPU by using VTF.

[quote]
That is exactly what geomipmapping is. You don't mention how you update the vertex heights; in the original algorithm, heights were calculated on the CPU, and that was later moved to the GPU by using VTF.
[/quote]


You need to re-read my post. It isn't geomipmapping.

The reason I didn't mention HOW I update the vertex heights is that I DON'T update them, ever. This is a 100% static technique.

(edit) A pseudocode sample will help illuminate things:

DrawIndexedPrimitives(
    BaseVertexIndex = (use this to position the mesh in world space: Origin.X + Origin.Z * Stride),
    StartIndex = (use this to pick a segment of the disk; the index buffer includes 360 + 180 degrees of disk),
    PrimitiveCount = (use this to pick how many segments to render)
)


The index buffer is a "pattern" that is positioned over the uniform grid of vertices. I have moved the height data from a texture into a vertex buffer, and instead of sampling it I use offsets to map indices to the correct vertices.

There is a good explanation here:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb147325%28v=vs.85%29.aspx
Ah, I misunderstood what you said. So instead of storing a grid of heights in a texture, you store it in a vertex buffer.

In that case, what are the 8 segments that you split a disk into?

