Voxel engine, storage of blocks

12 comments, last by CC Ricers 11 years ago

I have just started working on a voxel engine for the fun of it. I've managed to create the voxels and put them, textured from a texture atlas, into a VBO. I have also done a few optimizations, such as not drawing blocks you can't see.

I'm storing the blocks like this:


Block m_blocks[16][128][16]; 
 

for each chunk.

However, the problem arises when I want to have an infinite terrain. I found a few resources online saying I need to use a hash map and/or load and save chunks to an external file for this. I would just like to clarify: is this the correct approach? Any advice or tips on how to go about it would also be much appreciated.

Thanks


if you want to do something like Minecraft, then yes, saving the chunks to disk is probably a good idea.

also, given the potentially steep memory and storage costs that may pop up with voxel terrain, investigating the use of data-compression may also be a good idea.

FWIW: my engine uses compressed voxels in both their on-disk and in-memory formats, decompressing and recompressing chunks as needed (inactive chunks typically revert to a compressed form in memory, and are decompressed again when the chunk is accessed).

at least for storage, Minecraft stores its chunks in a deflate-compressed form, typically also packaged up inside of "region" files (each of which hold a 32x32 grid of chunks).

I think you should go with an octree format for voxel data storage. The concept with an octree is this: say you have a 4x4x4 cube of dirt blocks (64 blocks). Rather than save each and every one, wouldn't you rather save a single large block of the same size? It sounds sort of complicated, but once you understand how it works as a whole, it's actually pretty straightforward.
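To make the saving concrete, here is a minimal sketch that counts how many leaves an octree would need for a dense cube of blocks. It assumes a cubic volume whose side is a power of two; the `Volume` struct and the function names are made up for illustration and are not from any engine mentioned here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Dense cubic volume of side `size` (assumed to be a power of two),
// indexed as x + size * (y + size * z).
struct Volume {
    int size;
    std::vector<std::uint8_t> cells;
    std::uint8_t at(int x, int y, int z) const { return cells[x + size * (y + size * z)]; }
};

// True if every cell in the cube starting at (x0, y0, z0) with the given side has the same id.
bool isUniform(const Volume& v, int x0, int y0, int z0, int side) {
    const std::uint8_t first = v.at(x0, y0, z0);
    for (int z = z0; z < z0 + side; ++z)
        for (int y = y0; y < y0 + side; ++y)
            for (int x = x0; x < x0 + side; ++x)
                if (v.at(x, y, z) != first) return false;
    return true;
}

// Leaves needed for this cube: 1 if it is uniform, otherwise the sum over its 8 octants.
std::size_t countLeaves(const Volume& v, int x0, int y0, int z0, int side) {
    if (side == 1 || isUniform(v, x0, y0, z0, side)) return 1;
    const int h = side / 2;
    std::size_t total = 0;
    for (int i = 0; i < 8; ++i)
        total += countLeaves(v, x0 + (i & 1) * h, y0 + ((i >> 1) & 1) * h,
                             z0 + ((i >> 2) & 1) * h, h);
    return total;
}
```

A 4x4x4 cube of a single block type collapses to one leaf. Change one corner block and the count becomes 15: seven uniform 2x2x2 octants plus eight single blocks.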

I found this blog entry on implementing a Minecraft-like game using octrees. It also happens to have some source code attached (at the bottom of the entry)... :) You might also want to read the rest of his entries, as he later revises his octree format a bit.

Some favourite quotes:

Never trust a computer you can't throw out a window.
- Steve Wozniak

The best way to prepare [to be a programmer] is to write programs, and to study great programs that other people have written.
- Bill Gates

There's always one more bug.
- Lubarsky's Law of Cybernetic Entomology

Think? Why think! We have computers to do that for us.
- Jean Rostand

Treat your password like your toothbrush. Don't let anybody else use it, and get a new one every six months.
- Clifford Stoll

To err is human - and to blame it on a computer is even more so.
- Robert Orben

Computing is not about computers any more. It is about living.
- Nicholas Negroponte

I have never implemented a voxel-based engine, but instead of dense storage like this, why not try sparse storage like a kd-tree or BVH, and store blocks as cubes of 8x8x8 in the leaves? (Or some other constant, adjusted empirically by trying and measuring.)

The hash map idea is the same, but rather than a spatial tree structure (like a kd-tree or BVH) it uses location hashing.
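As a rough illustration of location hashing, here is a minimal sketch that keys chunks by their integer chunk coordinates in a `std::unordered_map`. The `Block`, `Chunk`, `ChunkPos` and `World` types are hypothetical stand-ins (only the 16x128x16 array comes from the original post), and a real engine would load or generate a chunk where this sketch just allocates an empty one.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>

struct Block { std::uint8_t type = 0; };  // hypothetical block type

struct Chunk {
    static constexpr int SX = 16, SY = 128, SZ = 16;
    Block blocks[SX][SY][SZ];
};

// Integer chunk coordinates on the horizontal plane (each chunk spans the full height).
struct ChunkPos {
    std::int32_t x, z;
    bool operator==(const ChunkPos& o) const { return x == o.x && z == o.z; }
};

struct ChunkPosHash {
    std::size_t operator()(const ChunkPos& p) const {
        // Pack both 32-bit coordinates into one 64-bit key, then hash that.
        const std::uint64_t key =
            (static_cast<std::uint64_t>(static_cast<std::uint32_t>(p.x)) << 32) |
            static_cast<std::uint32_t>(p.z);
        return std::hash<std::uint64_t>{}(key);
    }
};

// Floor division, so negative world coordinates land in the correct chunk.
inline std::int32_t floorDiv(std::int32_t a, std::int32_t b) {
    const std::int32_t q = a / b;
    return (a % b != 0 && ((a < 0) != (b < 0))) ? q - 1 : q;
}

class World {
public:
    // Returns the chunk containing world column (wx, wz), creating it if absent.
    Chunk& chunkAt(std::int32_t wx, std::int32_t wz) {
        const ChunkPos pos{floorDiv(wx, Chunk::SX), floorDiv(wz, Chunk::SZ)};
        auto& slot = m_chunks[pos];
        if (!slot) slot = std::make_unique<Chunk>();  // assumption: load/generate here
        return *slot;
    }
    // Drops a chunk from memory; a real engine would save it to disk first.
    void unload(ChunkPos pos) { m_chunks.erase(pos); }
    std::size_t loadedCount() const { return m_chunks.size(); }

private:
    std::unordered_map<ChunkPos, std::unique_ptr<Chunk>, ChunkPosHash> m_chunks;
};
```

Only chunks near the player need to be in the map at any time, so memory use follows the loaded area rather than the size of the world.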

There should not be much difference; both have imperfect densities. The hash may be faster at accessing a single leaf (amortized O(1) versus O(log N) for the tree), while neighborhood walking might be a tiny bit faster with the tree thanks to locality of storage (a sibling is reached through 2 indirections). With the hash map, every step to a neighbor needs a fresh hash lookup, so walking costs the same as random access. Walking is sometimes important (ray casting, collisions, volume fog...), so it depends on which operation is more frequent.

There are lots of ray tracing acceleration methods built on hybrids of several acceleration structures. One of the most basic examples is Bresenham's algorithm, which walks very quickly from cell to cell in a grid using only ALU operations. The same idea exists in 3D with SSE and 3D grids (called grid marching), and some ray tracers are based on it. I recommend reading a few papers on the subject; even if they have nothing to do with voxels (they are purely about structures for ray tracing), they will still help you make decisions.

Such sparse structures will fit in memory for kilometers of terrain, because the storage requirement basically becomes proportional to a surface (not a volume): surface * log(surface) for the tree solution, and surface * 1/load factor for hash maps. For infinite worlds, it becomes possible to cluster the terrain into much larger sectors that get streamed in and out on demand.

Also, there are ways to design clever tree structures that can be read partially from a stream, giving a coarser LOD before the whole tree has been read (like progressive JPEG). It is basically mip maps, the same thing as the voxels of Cyril Crassin in the cone tracing paper, but designed for progressive serialization. This can be used for long-range display and fast stream-in of distant clusters.

I would explore the octree approach, though that also comes with its own challenges of figuring out how to represent the octree in a structure so as to make it efficient to store and read as well, because the number of octree leaves for each chunk can differ greatly from one another.

Edit: Never mind, I realized you can store the octree data by assigning each leaf on each level a different number ID. This makes it much easier to understand and apply. All the solid areas can then be represented by the biggest leaf possible that encloses them, storing only these numbers. For example, numbering starts at the top-most level (a single leaf) with 0, the next level down is 1-8, the next level down 9 through 72, etc. To go down one level and subdivide a leaf, multiply the current leaf's ID by 8, then add 1 through 8 to get the inner leaves.
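The numbering above works out to simple integer arithmetic. A minimal sketch follows; the function names are made up for illustration, and `octant` (0 to 7) is an assumed way of picking one of the eight children.

```cpp
#include <cstdint>

// IDs as described above: the root is 0, its children are 1..8, theirs 9..72, and so on.
inline std::uint64_t childId(std::uint64_t id, unsigned octant) {
    return id * 8 + 1 + octant;  // octant is 0..7
}

// Inverse of childId; only meaningful for id > 0 (the root has no parent).
inline std::uint64_t parentId(std::uint64_t id) {
    return (id - 1) / 8;
}

// Depth of a node: 0 for the root, 1 for IDs 1..8, 2 for IDs 9..72, ...
inline unsigned levelOf(std::uint64_t id) {
    unsigned level = 0;
    while (id != 0) {
        id = (id - 1) / 8;
        ++level;
    }
    return level;
}

// First ID on a given level: 0, 1, 9, 73, ... which equals (8^level - 1) / 7.
inline std::uint64_t firstIdOnLevel(unsigned level) {
    std::uint64_t first = 0;
    for (unsigned i = 0; i < level; ++i) first = first * 8 + 1;
    return first;
}
```

Since a node's ID encodes its full path from the root, a chunk can be stored as a flat list of (ID, block type) pairs without any pointers.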

New game in progress: Project SeedWorld

My development blog: Electronic Meteor

Thanks guys, I had a look at both of the suggested ideas, octrees and RLE. Which do you think is the better choice?

RLE is better if there are a lot of the same types of blocks in a certain area. It keeps track of them in the following manner:

Say you have 5 A blocks, 3 B blocks, and 1 C block, as follows:


AAAAABBBC

RLE would save that as:


5A3B1C

Notice that the C takes up a single byte ('C') in the first line, while it takes up two bytes ('1C') in the second. This really becomes a problem when you encounter a lot of different object types.
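A minimal sketch of this count-then-symbol scheme, working on text for readability. It assumes block symbols are never digits, so the count can be written in decimal; the function names are made up.

```cpp
#include <cstddef>
#include <string>

// Encodes each run as <count><symbol>, e.g. "AAAAABBBC" becomes "5A3B1C".
// Assumption: symbols are never the digits '0'..'9'.
std::string rleEncode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t j = i;
        while (j < in.size() && in[j] == in[i]) ++j;  // find the end of the run
        out += std::to_string(j - i);
        out += in[i];
        i = j;
    }
    return out;
}

// Reverses rleEncode: reads a decimal count, then repeats the symbol that follows it.
std::string rleDecode(const std::string& in) {
    std::string out;
    std::size_t count = 0;
    for (char c : in) {
        if (c >= '0' && c <= '9') {
            count = count * 10 + static_cast<std::size_t>(c - '0');
        } else {
            out.append(count, c);
            count = 0;
        }
    }
    return out;
}
```

Encoding "ABC" gives "1A1B1C", twice the size of the input, which is the worst case described above.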

In my opinion, octrees are much more space and memory efficient, but they will be harder to parse (from the programmer's standpoint). The reason is that RLE data must be decompressed in memory before use, whereas an octree doesn't have to be.

whether or not a single item takes 1 or 2 bytes depends on the type of RLE.

in many forms of RLE, a single item will usually only take a single byte.

for example, a PCX-like RLE:

0x00-0xBF: encoded directly

0xC0-0xFF: RLE run (1-64 bytes), followed by byte.

another strategy:

most bytes, passed through (except the escape byte).

0xFE=Escape byte

0xFE <count> <value> = RLE run

or, possibly, a compromise:

0x00-0xBF: passed through

0xC0-0xCF <value>: RLE run 1-16 (0xC0 <value> = Escaped Byte)

0xD0 <count> <value>: RLE Run (16-271)

0xD1 <count16> <value>: RLE Run (272-65536 / 65551)

0xD2-0xDF: reserved

0xE0-0xFF: passed through
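a minimal sketch of the first, PCX-like scheme above. it assumes the run length is stored as (marker - 0xC0) + 1, and that literal bytes in the 0xC0-0xFF range are written as runs of one; both are assumptions, as are the function names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// 0x00-0xBF: literal byte.
// 0xC0-0xFF: run marker, length = (marker - 0xC0) + 1 (1..64), followed by the value byte.
// Assumption: a literal byte >= 0xC0 is stored as a run of length 1.
Bytes pcxLikeEncode(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 64) ++run;
        if (run > 1 || in[i] >= 0xC0) {
            out.push_back(static_cast<std::uint8_t>(0xC0 + (run - 1)));
            out.push_back(in[i]);
        } else {
            out.push_back(in[i]);  // single low byte, passed through as-is
        }
        i += run;
    }
    return out;
}

Bytes pcxLikeDecode(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] >= 0xC0 && i + 1 < in.size()) {
            const std::size_t run = static_cast<std::size_t>(in[i] - 0xC0) + 1;
            out.insert(out.end(), run, in[i + 1]);
            ++i;  // skip the value byte
        } else {
            // Literal byte. A trailing marker with no value byte also ends up here;
            // real code should treat that as corrupt input.
            out.push_back(in[i]);
        }
    }
    return out;
}
```

with this layout a single block id below 0xC0 costs one byte, and a run of up to 64 identical bytes costs two.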

in my engine, most of the voxel terrain is kept compressed in memory as well, but chunks may be decompressed on access, and revert to a compressed form later.

RLE is used here, mostly because it compresses/decompresses quickly.

block-level RLE can be faster than byte-level RLE, and avoids needing to flatten out or reconstitute the blocks (into a collection of "byte planes" or similar), but has the disadvantage that it doesn't compress as well (only blocks which are exactly the same may be compressed).

a more "fancy" idea I had also considered was doing something more PNG-like, basically using a Paeth predictor and Deflate, but this wouldn't be as usable for in-memory compression, due to the higher encoding/decoding costs.

another trick is essentially skipping chunks which only contain a single type of voxel (such as air or stone), treating them like a single large block, but this works more in my engine because chunks are 16x16x16 (and stack vertically as well).
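a minimal sketch of that single-type trick, assuming 16x16x16 chunks and one byte per block. the `CompactChunk` class and its members are hypothetical, not taken from any engine in this thread.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>

// A chunk that stays "uniform" (one block id, no voxel array) until a differing
// block is written, at which point it expands to a full array.
class CompactChunk {
public:
    static constexpr int N = 16;

    explicit CompactChunk(std::uint8_t fill = 0) : m_fill(fill) {}

    bool isUniform() const { return !m_dense; }

    std::uint8_t get(int x, int y, int z) const {
        return m_dense ? (*m_dense)[index(x, y, z)] : m_fill;
    }

    void set(int x, int y, int z, std::uint8_t id) {
        if (!m_dense) {
            if (id == m_fill) return;  // still uniform, nothing to store
            m_dense = std::make_unique<Dense>();
            m_dense->fill(m_fill);     // expand to a full voxel array
        }
        (*m_dense)[index(x, y, z)] = id;
    }

    // Collapses back to the uniform form if every voxel holds the same id.
    void compact() {
        if (!m_dense) return;
        const std::uint8_t first = (*m_dense)[0];
        for (std::uint8_t v : *m_dense)
            if (v != first) return;
        m_fill = first;
        m_dense.reset();
    }

private:
    using Dense = std::array<std::uint8_t, N * N * N>;

    static std::size_t index(int x, int y, int z) {
        return (static_cast<std::size_t>(y) * N + z) * N + x;
    }

    std::uint8_t m_fill;
    std::unique_ptr<Dense> m_dense;
};
```

an all-air or all-stone chunk then costs a few bytes instead of 4096, and rendering or saving can check `isUniform()` and skip the per-voxel work.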

haven't personally messed with using octrees this way.

You're right, I forgot to mention that.

I personally use octrees all the time. I've only used RLE once in a really early project I did a few years back. I chose it over octrees back then since it was much easier to implement and I wasn't too comfortable with C++ at the time to work with octrees.

Why do you use Octrees now?

This topic is closed to new replies.
