I'm attempting to write a density voxel system using marching cubes for terrain. For large scenes, each N by N by N region of the scene (N = 64, for example) should correspond to a "chunk" of voxels, and there should be a way to query each chunk individually (e.g. each chunk has a file on disk, or you generate a single chunk at a time).
My issue is the following: in games like Minecraft, where the voxels are cubes, N by N by N voxels actually contain (and completely define) exactly N by N by N units of space. However, in systems where a mesh is extracted, it takes 8 voxels (centered on the corners of a cube) to define one "patch" of the mesh. This means that to completely generate the region covered by N by N by N voxels, (N+1) by (N+1) by (N+1) voxels are actually required. So if each chunk is stored or generated in isolation, a duplicated slice of voxels is stored along each shared face.

Furthermore, it is nice to use a power of 2 for N, since that works well with LOD data structures (e.g. chunked LOD) or octrees. But then the actual stored voxel volume won't be a power of 2 and won't have nice cache-alignment properties. It also means that updating voxels isn't as simple: you might need to update the data in multiple places, potentially in up to 8 chunks if the voxel sits on a corner!
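To make that duplication concrete, here is a small sketch (the names are mine, not from any particular engine) of which chunks store a copy of a given global sample, assuming each chunk owns the N^3 cells starting at its origin and stores the (N+1)^3 samples covering them:

```python
N = 64  # cells per chunk edge; each chunk stores (N+1)^3 samples

def owning_chunks_1d(g, n=N):
    """Chunk indices along one axis that store a copy of global sample g."""
    owners = [g // n]
    if g % n == 0 and g > 0:        # the sample lies on a shared slice
        owners.append(g // n - 1)
    return owners

def owning_chunks(x, y, z, n=N):
    """All chunks that duplicate the sample at (x, y, z): up to 8 at a corner."""
    return [(cx, cy, cz)
            for cx in owning_chunks_1d(x, n)
            for cy in owning_chunks_1d(y, n)
            for cz in owning_chunks_1d(z, n)]
```

An interior sample lives in exactly one chunk, a sample on a shared face in two, and a sample on a shared corner in eight, which is exactly the update-fan-out problem described above.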
Actually, the problem doesn't stop there: to generate smooth normals for lighting, you need to examine the surrounding voxel density values, so you need an extra slice of voxels on each side. This means you actually need (N + 3)^3 voxels!
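For reference, that extra slice comes from central differences: a gradient-based normal at a sample needs its six axis neighbors, so every sample on the (N+1)^3 surface needs one more voxel of padding. A minimal sketch, assuming `density` is any callable returning the scalar field:

```python
def normal_at(density, x, y, z):
    """Normal from the density gradient via central differences.
    Needs samples at +/-1 around (x, y, z), which is why a chunk must be
    padded one voxel beyond its (N+1)^3 surface samples."""
    gx = density(x + 1, y, z) - density(x - 1, y, z)
    gy = density(x, y + 1, z) - density(x, y - 1, z)
    gz = density(x, y, z + 1) - density(x, y, z - 1)
    length = (gx * gx + gy * gy + gz * gz) ** 0.5 or 1.0  # guard zero gradient
    # negate: the normal points from higher density (solid) toward lower (air)
    return (-gx / length, -gy / length, -gz / length)
```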
A few possible approaches I've thought about:

1) Just deal with the extra storage. I really don't like this solution though; it seems messy and probably slow.

2) Require that a 3x3x3 neighborhood of chunks be present in memory before generating the center chunk. I don't really like this either, because it seems very wasteful and adds weird dependencies between chunks that shouldn't be there.

3) When generating the mesh for a chunk, leave off one slice; that is, only generate a mesh filling (N-1)^3 units of space. When two adjacent chunks are present, generate the strip connecting them separately.
I think solution 3 is the best, and it also handles LOD stitching between chunks very nicely: if a chunk's LOD level changes, only the strips connecting chunks need to be regenerated, and algorithms like the one at http://www.terathon.com/voxels/ could be implemented easily. However, it still does not solve the problem of needing an extra slice of voxels for normal generation! I can think of 2 possible solutions for this: either regenerate the normals on the edges of meshes when the adjacent chunk is loaded, or make the connecting "strips" 3 units wide instead of 1, and leave an extra slice off the "main" chunk meshes.

Neither of these seems that good, though. The first solution would require re-uploading the entire mesh (or portions of it) to the GPU, and would require keeping track of which vertices need their normals recomputed. With the second solution, an "isolated" chunk would be smaller, and the strips connecting chunks would be larger and take longer to generate, but this seems like the better of the two.
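As a sketch of how solution 3 partitions the work (my own naming, assuming cells are indexed 0..N-1 per axis within a chunk), the chunk's own mesh covers the (N-1)^3 cells whose samples are all local, and everything on the far faces is deferred to the stitching pass:

```python
N = 64  # cells per chunk edge

def cell_region(cx, cy, cz, n=N):
    """'interior' cells go into the chunk's own (N-1)^3 mesh; 'strip' cells
    touch the +X/+Y/+Z faces and are meshed later, once the neighboring
    chunk's samples are available."""
    if cx == n - 1 or cy == n - 1 or cz == n - 1:
        return 'strip'
    return 'interior'
```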
So, how is this usually handled? Is solution 3 the way to go? And how are normals dealt with?
In PolyVox we separate the storage of the volume data from the way surface extraction is performed. That is, the volume may or may not consist of a set of blocks, and even if it does, the size of these blocks may or may not be the same as the size of the extracted meshes. For example, you might decide that 32x32x32 is the ideal size for storing blocks in memory, but 64x64x64 is better for the rendered meshes. Or maybe you want the rendered blocks to be 16x16x128 but you don't want the memory to be broken into blocks at all (perhaps you'd rather use an octree?).
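A toy illustration of that separation (just my sketch, not PolyVox's actual API): storage picks its own brick size internally, and any algorithm that only calls `sample()` is free to operate on whatever region size suits it.

```python
BLOCK = 32  # storage granularity; extraction can use 64^3 regions regardless

class BlockedVolume:
    """Densities stored in sparse BLOCK^3 bricks. Consumers never see the
    bricks: they only call set()/sample(), so surface extraction,
    raycasting, etc. are independent of the storage layout."""

    def __init__(self, default=0):
        self.default = default
        self.bricks = {}  # (bx, by, bz) -> {local (x, y, z) -> density}

    def set(self, x, y, z, d):
        key = (x // BLOCK, y // BLOCK, z // BLOCK)
        self.bricks.setdefault(key, {})[(x % BLOCK, y % BLOCK, z % BLOCK)] = d

    def sample(self, x, y, z):
        key = (x // BLOCK, y // BLOCK, z // BLOCK)
        return self.bricks.get(key, {}).get(
            (x % BLOCK, y % BLOCK, z % BLOCK), self.default)
```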
So basically we focus on just providing fast volume data structures which are independent of the algorithm that is executed on them. Surface extraction is just one task you need to perform, and raycasting (for example) can have a different set of characteristics.
That doesn't exactly answer your question... but maybe it's something to think about.
I use solution 2 with fixed-size chunks of 16x16x16 voxels. Each chunk is independently stored in a compressed form, and 3x3x3 chunks are decompressed whenever the center chunk needs to be polygonalized. I don't really think there's anything weird about it. Sure, it's not an absolutely optimal use of memory, but we're only talking about 108 kB here for 27 chunks with one byte per voxel, and it gets the job done in a very practical way without having to worry about data availability at the center chunk's boundaries.
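For what it's worth, once the 3x3x3 neighborhood is decompressed, boundary lookups during polygonalization reduce to remapping out-of-range local coordinates into the right neighbor. A sketch, where `get_chunk` is a hypothetical accessor returning a decompressed 16^3 voxel array:

```python
CHUNK = 16  # voxels per chunk edge, as in the post above

def padded_sample(get_chunk, cx, cy, cz, lx, ly, lz):
    """Sample local coordinate (lx, ly, lz) of chunk (cx, cy, cz).
    Local coordinates may run past [0, CHUNK) in either direction;
    out-of-range lookups fall through to the appropriate neighbor.
    `get_chunk(i, j, k)` is a hypothetical accessor for the decompressed
    CHUNK^3 array of chunk (i, j, k)."""
    gx, gy, gz = cx * CHUNK + lx, cy * CHUNK + ly, cz * CHUNK + lz
    # Python's floor division and modulo handle negative coordinates correctly
    chunk = get_chunk(gx // CHUNK, gy // CHUNK, gz // CHUNK)
    return chunk[gx % CHUNK][gy % CHUNK][gz % CHUNK]
```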
Thanks for the replies. PolyVox, that is a good idea. I am thinking about doing something similar, where I have an underlying voxel manager that works at the "chunk" level, and on top of that a voxel renderer which makes requests to load regions of voxels and allows the voxel manager to decide which chunks those are.
Eric Lengyel - you're right, 16x16x16 doesn't sound too bad; I was thinking about chunks as being larger, but perhaps that isn't optimal (though obviously that's up to profiling to decide). My idea was actually to have a mipmap/chunked-LOD system, where a single mipmap pyramid would correspond to a single chunk (my reasoning for larger chunks). But maybe smaller chunks would work with this as well (the lower detail levels would just span multiple chunks). Are you using a LOD system?