How good are the best voxel compression algorithms?

This is basically an informal poll, but it is based on a comment I made in the unlimited detail thread. The motivation for asking should be self-evident, since the ultimate bottleneck in any voxel-based system is memory. Better compression means higher detail, larger environments, and ultimately better performance (due to fewer cache misses and less data to move around). For distributed voxel engines, memory footprint is also the key factor determining how much bandwidth gets used.

This suggests that the biggest and most important problem any new voxel engine needs to solve is *not* how to max out rendering performance, but how to efficiently compress all that volumetric data. In fact, somewhat counterintuitively, I would suggest that highly compressed voxel data may even be *faster* to render than uncompressed data. This was certainly true back in the old days of sprite-based rendering, where run-length encoded sprites could be blitted to the screen much more quickly than a brute-force pixel-by-pixel copy. The reason is that modern CPUs and GPUs are primarily bandwidth-limited, so the trade-off between memory and computation is becoming increasingly lopsided in favor of computation.
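For what it's worth, the sprite trick looks roughly like this. A minimal C++ sketch (all names hypothetical) of why RLE blitting wins: transparent gaps cost nothing, and each opaque run is a single contiguous copy instead of a per-pixel transparency test:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// One horizontal run of opaque pixels inside a sprite row.
struct Run {
    uint16_t skip;          // transparent pixels to skip before this run
    uint16_t length;        // number of opaque pixels in the run
    const uint32_t* pixels; // pointer into the sprite's pixel pool
};

// Blit one row of RLE-encoded runs: gaps are skipped for free,
// and each opaque run is one memcpy of contiguous memory.
void blitRow(uint32_t* dst, const std::vector<Run>& runs) {
    for (const Run& r : runs) {
        dst += r.skip;
        std::memcpy(dst, r.pixels, r.length * sizeof(uint32_t));
        dst += r.length;
    }
}
```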

One common technique for encoding voxel data is run-length encoding or octrees. The underlying assumption in either case is that the voxel data is mostly piecewise constant, so you only need to keep track of the places where it transitions between regions. The reason this works is that the boundary of an object is much smaller than the object itself (typically on the order of O(n^{(d-1)/d}) for n voxels in d dimensions). However, if you take this idea to its ultimate conclusion, the best you can reasonably hope for is to compress the voxel data down to just the information encoded on the boundary; in other words, to reduce it to a B-rep. This is slightly annoying, since it seems to just take you back to polygon/B-rep objects again :(
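To make the piecewise-constant assumption concrete, here is a minimal sketch (names are made up) of run-length encoding a single voxel column; only the transitions get stored, so a column that is mostly air over mostly rock costs a handful of runs:

```cpp
#include <cstdint>
#include <vector>

// One run of identical voxels along a column.
struct VoxelRun {
    uint16_t length;   // how many consecutive voxels share this material
    uint8_t  material; // 0 = empty
};

// Encode a column of voxels as runs: only material transitions
// are stored, so piecewise-constant data compresses heavily.
std::vector<VoxelRun> encodeColumn(const std::vector<uint8_t>& column) {
    std::vector<VoxelRun> runs;
    for (uint8_t v : column) {
        if (!runs.empty() && runs.back().material == v &&
            runs.back().length < 0xFFFF)
            ++runs.back().length;
        else
            runs.push_back({1, v});
    }
    return runs;
}
```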

So, I am asking this question to figure out whether it is actually possible to do much better than this. There are a bunch of 'crank' videos on YouTube with people claiming to do so (e.g. Unlimited Detail and Atomengine), but there are many reasons to be skeptical of their claims (such as a lack of independent verification, and their questionable motivation to attract investors). Are there any credible research results which honestly investigate this problem? The work by Sven Forstmann (spacerat on these forums) comes to mind, but I don't think he has written anything lately. There are also sparse voxel octrees, but I am not convinced that a simple octree compression scheme will get you anything better than what can already be done with B-reps (see the comment above).

What are your thoughts on this issue?
I've done a bit of work here, and basically I left my voxel project because, just like you say, I couldn't figure out how to compress it enough.
I got it down to about a byte and a half of positional data per voxel (with a surface representation and run-length encoding of axis-aligned blocks) on a basic fractal mountainside, but that's still not enough compression.
My world would take up hundreds of gigs of disk space, so it's not viable.
I think you need to know how PNG compression works, because you compress voxels in a similar way; apparently Jon Olick says he can get it down to a bit and a half of positional data!!! (That's more like it.) But the internals of a PNG file are a bit beyond me, so I decided to switch over to another common geometry compression approach, displacement mapping. Then PNG compression is easy, because all you have to do is save the image and presto, it's compressed nicely for you... I'm just having trouble generating a decent displacement map at a decent speed, but that's another story.
I am fully into this "unique geometry and world" idea though; my plan is to bake it all with offline global illumination, but I'm not even close to getting that far yet. :)

PNG is essentially per-row filtering followed by DEFLATE (LZ77 dictionary matching plus Huffman coding), which on filtered data behaves much like run-length encoding. The filtering step runs each scanline through one of a handful of predefined predictors to make it more compressible. For example, one option stores the difference of each pixel from its left neighbor before compressing; this lets you encode horizontal gradients very efficiently. Other filters are more sophisticated, like the average filter or the Paeth filter [1], but the overall idea is the same. As a result, PNG tends to work well on flat images, linear gradients, dithering, and other simple repeating patterns. It tends to work very badly on images with a lot of higher-order regions, like cubic or higher-degree surface patches. PNG-type compression *might* be a good idea for a voxel engine, but I am still not convinced it can beat the B-rep barrier: at the end of the day, you still have to store the start/stop of each run, which means you are basically storing the boundary of the object anyway. Lossy formats like JPEG could conceivably do better by trading away exactness, but I am not sure what the overall cost would be in the end, or whether the result would even look 'good enough' for a real game engine.

[1] A. W. Paeth, "Image File Compression Made Easy," Graphics Gems II.
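For the curious, the Paeth predictor itself is tiny. This follows the PNG specification; one could imagine a 3D analogue for voxels that also predicts from a 'front' neighbour, though that part is pure speculation:

```cpp
#include <cstdint>
#include <cstdlib>

// The Paeth predictor from the PNG spec: predict a pixel from its
// left (a), up (b), and up-left (c) neighbours, picking whichever
// neighbour is closest to the linear estimate a + b - c. The filter
// then stores the (hopefully small) residual: actual - predicted.
uint8_t paethPredictor(uint8_t a, uint8_t b, uint8_t c) {
    int p  = int(a) + int(b) - int(c);
    int pa = std::abs(p - int(a));
    int pb = std::abs(p - int(b));
    int pc = std::abs(p - int(c));
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc) return b;
    return c;
}
```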
Thanks for the reference, I'll check it out. I freely admit I'm not very knowledgeable about compression. There is a way to do it (unless Jon Olick is full of shit), but I can't tell you how; like I said, I left the project, I failed... Someone at devmaster said Jon used something similar to PNG compression, but I'm only repeating what I was told without really understanding it... sorry...
There has got to be literature on this. If only SIGGRAPH proceedings weren't locked up behind a paywall I'd suggest you search there (got a university library nearby you can use? They have subscriptions.) A quick google turns up e.g. this, which, from skimming the abstract, looks like a pretty straightforward application of vector quantization to me; maybe it'd be useful? I don't know what the canonical "good" papers in this field are.

I can throw some ideas out here.

1.) Consider other basis expansions (e.g., wavelets*, B-splines). Remember that even an octree can be viewed as an expansion in terms of a {0,1}-valued basis. Maybe you should just throw a really huge dictionary B of basis vectors at the problem and minimize the 1-norm of the expansion coefficients (i.e., solve min ||c||_1 subject to Bc ≈ v, the standard sparse-coding formulation) in order to get a sparse representation?

* You might want to look at JPEG2000, in particular the volumetric JP3D format. A quick google turns up this library; whether it's in a usable state I don't know.

2.) I do like the vector quantization idea. Bigger blocks = more compression. (A rough sketch of what this could look like follows this list.)

3.) Another thing to consider is designing formats around the same abstractions a designer would use to build the level in the first place. This touches on "procedural content generation," but I mean something less extreme. When you designed that landscape, did you think about every individual heightmap value, or did you just make a couple of brush strokes in Photoshop and apply a Perlin noise filter? If the latter, maybe you should just store the brush strokes and the instruction to apply Perlin noise?
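To make idea 2 concrete, here is a rough sketch of vector quantization over voxel bricks. Everything here (4x4x4 brick size, 8-bit material ids, 16-bit codebook indices) is an assumption for illustration; the codebook itself would be built offline, e.g. with k-means:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Vector quantization of voxel bricks: each 4x4x4 brick of material
// ids (64 bytes) is replaced by the index of the closest entry in a
// shared codebook, i.e. 2 bytes per brick before any entropy coding.
constexpr size_t kBrickSize = 4 * 4 * 4;

using Brick = std::vector<uint8_t>; // kBrickSize entries

uint16_t quantizeBrick(const Brick& brick, const std::vector<Brick>& codebook) {
    size_t best = 0;
    long bestDist = -1;
    for (size_t i = 0; i < codebook.size(); ++i) {
        long dist = 0;
        for (size_t j = 0; j < kBrickSize; ++j) {
            long d = long(brick[j]) - long(codebook[i][j]);
            dist += d * d; // squared-error distance to this codeword
        }
        if (bestDist < 0 || dist < bestDist) { bestDist = dist; best = i; }
    }
    return uint16_t(best);
}
```

Each 64-byte brick collapses to a 2-byte index; bigger bricks push the ratio higher at the cost of a larger codebook and more quantization error.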
I thought more on the issue: you could build a whole streaming voxel engine with not-quite-enough compression (say, about 8 times too big), but then convert it to polys + displacement maps at the end for release. That would solve the problem, anyway, even though it might not be voxels anymore unless you raycast the displacement maps.
Ah, I think I've figured out a decent compression method for voxels.

Store the position of the first voxel in full, then use 6 bits per voxel (one per face direction) to record whether it has a neighbour on that side. You can then flood-fill over the surface, reaching every voxel and storing each one in just its 6-bit mask, maybe even less if you think about it.
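A hedged sketch of how that traversal could work (all names hypothetical, and it assumes coordinates fit in 21 bits each): a breadth-first fill from the seed voxel emits one 6-bit mask per voxel, and a decoder replaying the same traversal order can reconstruct every position from the seed alone:

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <unordered_set>
#include <vector>

struct V3 { int x, y, z; };

// Fixed neighbour order; encoder and decoder must agree on it.
static const V3 kDirs[6] = {{1,0,0},{-1,0,0},{0,1,0},{0,-1,0},{0,0,1},{0,0,-1}};

// Flood-fill the connected solid region from 'seed', emitting one
// 6-bit neighbour mask per voxel in BFS order. Only the seed position
// is stored explicitly; everything else is implied by the masks.
std::vector<uint8_t> encodeSurface(const V3& seed,
                                   const std::function<bool(const V3&)>& isSolid) {
    auto key = [](const V3& v) {
        // Pack into 63 bits; assumes each coordinate fits in 21 bits.
        return (int64_t(v.x & 0x1FFFFF) << 42)
             | (int64_t(v.y & 0x1FFFFF) << 21)
             |  int64_t(v.z & 0x1FFFFF);
    };
    std::unordered_set<int64_t> visited;
    std::queue<V3> frontier;
    std::vector<uint8_t> masks;
    frontier.push(seed);
    visited.insert(key(seed));
    while (!frontier.empty()) {
        V3 v = frontier.front(); frontier.pop();
        uint8_t mask = 0;
        for (int i = 0; i < 6; ++i) {
            V3 n{v.x + kDirs[i].x, v.y + kDirs[i].y, v.z + kDirs[i].z};
            if (isSolid(n)) {
                mask |= uint8_t(1u << i);
                if (visited.insert(key(n)).second)
                    frontier.push(n);
            }
        }
        masks.push_back(mask); // 6 meaningful bits per voxel
    }
    return masks;
}
```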
There's some stuff on voxels here worth a look:
http://docs.google.c...sm-in-games.pdf

Basically, you do the neighbour/child-style storage of the voxels, then Huffman-encode the masks, and that'll get you to about 3 bits per voxel of positional data, I think, and that's more like it!

My prediction is that a 65536x65536 mountainside will take about 2 gigs to store, colours and everything (65536^2 is roughly 4.3 billion surface voxels, so 2 GB works out to about 4 bits per voxel). So it's still not cheap, but it's at least a manageable size and quite a large world. Jon Olick says he can beat this by a factor of two though, so he'd say 1 gig.
>>3 bits a voxel (positional data)
I think the harder part is compressing the color data down to that. A first step, before using JPEG-like algorithms, could be to use the YCbCr color space (http://en.wikipedia.org/wiki/YCbCr) and store the Y values in the child nodes while CbCr is stored in the parent. In efficient octrees, a small color palette, like 4 colors per 8x8x8 block, was used, as I remember.
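A minimal sketch of that split, assuming BT.601 full-range conversion (as in the linked article) and a node layout I've made up for illustration:

```cpp
#include <cstdint>

// Clamp a double to [0,255] and round to the nearest byte.
static uint8_t clamp255(double v) {
    if (v < 0.0)   return 0;
    if (v > 255.0) return 255;
    return uint8_t(v + 0.5);
}

struct YCbCr { uint8_t y, cb, cr; };

// BT.601 full-range RGB -> YCbCr, as described in the Wikipedia article.
YCbCr rgbToYCbCr(uint8_t r, uint8_t g, uint8_t b) {
    double y  =         0.299    * r + 0.587    * g + 0.114    * b;
    double cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b;
    double cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b;
    return { clamp255(y), clamp255(cb), clamp255(cr) };
}

// Per octree node: full-rate luma (one Y per child voxel) plus a single
// chroma pair shared by all eight children.
struct NodeColor {
    uint8_t childY[8];
    uint8_t cb, cr; // e.g. averaged over the node's children
};
```

Eight RGB voxels would cost 24 bytes; this layout costs 10, before any further palette or entropy coding.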

Another big problem I found with streamed voxel data (and polygon data too): it's not sufficient to load just the visible voxels. Quite a lot needs to be uncompressed in main memory for:
* collision detection: if other players/bots hit something or fire a shot, all the voxels involved have to be streamed in for full accuracy
* shadowing: if there are several light sources, voxels visible from the light frustums need to be streamed in
Articles I read years ago used a form of wavelet compression, storing voxels as a field equation, something like how spherical harmonic lighting works; since they were wavelet-based, they scaled progressively, so you could solve to whichever level of detail you desired. Maybe I'm remembering wrong, but from a quick search of the state of the art it seems people are storing voxels in higher-dimensional spaces using exotic geometric constructs like 4D polytopes. Maybe some sort of fractal compression or simple dictionary-based compression could work here? Preprocess your voxelized data, find repeated 3D patterns, and store them as a dictionary on the GPU; you save bandwidth transferring the 3D data over, and the GPU can recompose the true 3D surface on the fly. Are you making a Minecraft clone by any chance? :)
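If it helps, the core of the wavelet idea is small. A sketch of one level of a 1D Haar transform; applied along x, then y, then z of a density brick, with small detail coefficients dropped, it gives exactly that lossy, progressive level-of-detail behaviour:

```cpp
#include <cstddef>
#include <vector>

// One level of a 1D Haar transform: averages go in the front half,
// details in the back. Repeating on the front half builds the full
// multiresolution pyramid; truncating small details gives lossy,
// progressive LOD. Assumes data.size() is even (pad if not).
void haarStep(std::vector<float>& data) {
    size_t half = data.size() / 2;
    std::vector<float> tmp(data.size());
    for (size_t i = 0; i < half; ++i) {
        tmp[i]        = 0.5f * (data[2 * i] + data[2 * i + 1]); // average
        tmp[half + i] = 0.5f * (data[2 * i] - data[2 * i + 1]); // detail
    }
    data = tmp;
}
```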

Good Luck!

-ddn
