# How to make a 1Demensional Array Loose Octree?

## Recommended Posts

wizardpc    282

So My current octree is node-pointer based, where each node has 8 child nodes.  I was reading that you can achieve O(1) insertion/deletion by using a 1Demensional Array.  Unless I'm wrong, the 1D array is not optimal for searching the tree when frustum culling?  So I was thinking of having both, where I use the 1D array for insert/delete objects, and the parent/child nodes when traversing the tree.

I've been trying to think how to map my parent/child data structure to a more flat 1Demensional array.  I wanted to know if anyone had done this and can give me any tips/pointers.

I was thinking maybe making an array for each depth?

My Children nodes are in a 3D array

mChild[2][2][2]

##### Share on other sites
Juliean    7068

Given through your title that you already have a "Loose Octree", insertion time should already be pretty damn fast. I'd be interested in this technique here too, but there shouldn't be that much gain, if you really have a loose octree, since your method will look pretty much like this:

   //call to insert node
void InsertNode(BoundingVolume& volume, T data){
//acccess volume and octree max size
float volumeSize = volume.MaxExtents(), treeSize = m_size;
int depth = 0;

//until node box is smaller than the volume
while(treeSize > volumeSize)
{
treeSize /= 2;
depth += 1;
//break on max depth
if(depth > Node<T>::MAX_DEPTH)
break;
}
depth -= 1;
treeSize *= 2;

//find node
Node<T>* pNode = FindFittingNode(volume, treeSize, depth);
//insert data into node
pNode->InsertData(volume, data);

//link volume and node for update
m_mNodeData[&volume] = pNode;
};

Node<T>* FindFittingNode(const BoundingVolume& volume, float treeSize, unsigned int depth)
{
D3DXVECTOR3 vCenter(volume.m_vCenter.x + m_size, volume.m_vCenter.y + m_size, volume.m_vCenter.z + m_size);
//calculate absolut position
unsigned int x = (unsigned int)(vCenter.x / treeSize);
unsigned int y = (unsigned int)(vCenter.y / treeSize);
unsigned int z = (unsigned int)(vCenter.z / treeSize);
Node<T>* pNode = &m_node;
//iteratively choose next child node
for(unsigned int i = 0; i != depth; i++)
{
//pick node if it isn't split
if(!pNode->IsSplit())
break;

unsigned int currentDepth = depth - i;
//calculate position at current depth
//todo: check for invalid positions
unsigned int currentDepthX = x >> currentDepth;
unsigned int currentDepthY = y >> currentDepth;
unsigned int currentDepthZ = z >> currentDepth;
//calculate childs index
unsigned int childIndex = currentDepthX + (currentDepthZ << 1) + (currentDepthY << 2);
//get new node
pNode = pNode->Children()[childIndex];
//modify absolut positions to new child
x -= currentDepthX << currentDepth;
y -= currentDepthY << currentDepth;
z -= currentDepthZ << currentDepth;
}

return pNode;
}


Now tell me where you'd think that this one-linear array might speed things up? Maybe there is a completely different implementation, but for mine, there wouldn't be too much gain (given that it is already fast, 3 ms for 1.000.000 object insertion, and there is probably place for optimization).

Edited by Juliean

##### Share on other sites
wizardpc    282

This is my Current way of building the Octree.  The actual function takes a depth parameter and recursively calls it until the desired depth is reached.

psudeocode:

Function BuildNodes() {
for x
for y
for z
mChild[x][y][z] BuildNodes
}


So then I was thinking If I Indexed all the nodes at a certain depth layer by iterating z, then y, then x.  So node[0] is front-bottom-left, and node[n] is back-top-right.  The volume would index by iterating z, then Y, so the entire left side, and then repeating that as you go right (x axis).  I was thinking given the depth level I can calculate offsets I need to figure out where my node should be placed in this array.  I will be storing pointers basically for fast node retrieval on insert/delete operations.

m = 2 ^ depth
xOffset = m^2
yOffset =  m
z = 0  //No offset needed for z


So, Using width, CenterPosition, and Depth, I can calculate these values:
xIndex
yIndex
zIndex

So then I could update the array and point to the correct node.
depthNodes[ xIndex * xOffset + yIndex * yOffset + zIndex]  = this;  // This being the child node when BuildingNodes()

And that was for the sake of this question.  I would probably make depthNodes a 2D array where the actual call would look like this
depthNodes [depth] [ xIndex * xOffset + yIndex * yOffset + zIndex]  = this;

So would this work? am I over-thinking it? critiques?

##### Share on other sites
Juliean    7068

You are not quite off-track, but it is probably a little too complicated. The idea with calculating the index is pretty good, but I'd advice you to rather have a look at how my implementation handles this. Recursion is inevadable, if you wan't your tree to have optimizations like "don't add a node to the 5th depth, if even the first child node doesn't have any other objects" (in this case you would want to insert the node in the 1st depth, of course).

        //acccess volume and octree max size
float volumeSize = volume.MaxExtents(), treeSize = m_size;
int depth = 0;

//until node box is smaller than the volume
while(treeSize > volumeSize)
{
treeSize /= 2;
depth += 1;
//break on max depth
if(depth > Node<T>::MAX_DEPTH)
break;
}
depth -= 1;
treeSize *= 2;


This calculates the desired depth. We have the roots size, so by deviding this size by 2 repeadately, we can determine the maximum depth the child may go. But as it would be contra-productive to, as said before, insert a tiny node into the 6th depth if e.g. the fourth parent of that node doesn't even have any nodes (nor their children), we recursively determine the indizes for x, y and z - just instead only for the final depth, for each depth from the top down:

		for(unsigned int i = 0; i != depth; i++)
{
//pick node if it isn't split
if(!pNode->IsSplit())
break;

unsigned int currentDepth = depth - i;
//calculate position at current depth
//todo: check for invalid positions
unsigned int currentDepthX = x >> currentDepth;
unsigned int currentDepthY = y >> currentDepth;
unsigned int currentDepthZ = z >> currentDepth;
//calculate childs index
unsigned int childIndex = currentDepthX + (currentDepthZ << 1) + (currentDepthY << 2);
//get new node
pNode = pNode->Children()[childIndex];
//modify absolut positions to new child
x -= currentDepthX << currentDepth;
y -= currentDepthY << currentDepth;
z -= currentDepthZ << currentDepth;
}


Note how this is almost brancheless - and we don't need to compare any boxes etc.. at all. Now there where only two possible ways of optimization that would come from your idea (as far as I can see):

- if you drop the " pNode->IsSplit() " - check, you might indeed save some work, namely the whole recursive calculation - you can just calculate the deepest index right away. But this comes at the cost of very deep tree hierachies, specially for branches with very few, small objects. This will potentially lead to a far worse culling time - since you need to do many redundand checks for a loose octree anyway, this will quickly outweight the slighly increased insert time.

- Otherwise, such a singe array would only save the "pNode->Children()" call. The compiler will almost certainly inline this, so this doesn't matter anyway. If you insert a whole lot of objects in a row, this might actually be better for the cache - but thats just assumptions.

That being said, I think if you have an implementation similar to mine, you aren't already too off-track. Think about it. 3 ms for insertion one million objects, increasing only linearly - unless you want to insert 10 million objects per second at 60+ FPS, this should be pretty sufficient.

Edited by Juliean

##### Share on other sites
wizardpc    282

Juliean,

This is my First Octree, obviously, and also my first attempt at a scene manager.

Do you remake your loose octree each frame? or only for objects that have moved?

##### Share on other sites
Juliean    7068

Do you remake your loose octree each frame? or only for objects that have moved?

No, while I'm also looking still for a better solution for this, I currently notify the octree via a specifiy component in my Entity/Component system when an object has moved, and from there on, I remove the object entirely from the tree and insert it back again. As I said, I'm pretty sure there is a better solution to this, but it there are more important things for me to do right now - and as previously, there is already space for an enourmous number of objects to being moved per frame. and its way better than any none-loose octree!

Remaking the octree every frame is generally a bad idea. With the loose octree, we don't even have to reconstruct it when an object moved. I prefer to construct the whole octree node-wise up to the maximum depth at load time, and keep track of whether the node is split via a boolean flag. This way, there is no memory allocation/deallocation needed at runtime - which makes things even faster. Like I said before, only split a node when eigther itself has at least one object in it, or at least one of its child nodes is split. On object removal, you check that constraint - if there is no more objects are left in a node (and its child nodes), it is marked as "not split" - and later one, we don't need to check that node anyway.

Edited by Juliean

##### Share on other sites
wizardpc    282

For a tree with depth of 7

I guess I"m talking about arrays of pointers with the following sizes:

Depth 0 = 1

Depth 1 = 4

Depth 2 = 64

Depth 3 = 256

Depth 4 = 1024

Depth 5 = 4096

Depth 6 = 16384

Depth 7 = 65536

Is that too much memory to allocate for fast O(1) insert speed?  Also, Now I'm thinking about your "split/Non Split" idea, and I think if I were to insert using the 1D array, I could then recursvily call it's parent to keep setting "isSplit = true".  Having a flag to help speed up Frustum calls which WILL need to recursivly traverse the tree, is a good idea, so thanks for mentioning that.

Also, thanks for the responses.

##### Share on other sites
Juliean    7068

For a tree with depth of 7

I guess I"m talking about arrays of pointers with the following sizes:

Since an octree has 8 nodes each, I'm not totally sure what you did there, but the formular should go 8^depth, so thats at least:

Depth 0 = 1

Depth 1 = 8

Depth 2 = 64

Depth 3 = 512

Depth 4 = 4096

Depth 5 = 32768

Depth 6 = 262144

...

Without counting the you have to reserver memory for the parent notes too, so you actually need to add all the depths before, too.

Note that even my extremely strong 16 gb, Intel Core i7 couldn't handle a depth beyond 8 properly. If you want to go the "one array" route you basically got only a few sensical choices, and if you want to go O(0), you will need actually need to allocate all depths. So if you need more than depth 7-8, you will need to go with an approach that allows creation and destruction of nodes in memory. But it shouldn't be necessary, unless you want to support extreme large scenes, etc... . I'm still not so sure how this one buffer approach will turn out since, as I already established, will give you no real means of speed in comparison to the method I implemented, it will furthermore restrain you to a certain implemention (with my method I would e.g. be able to kill "dead" nodes if I decided to use the octree on static geometry or single modules - I don't see much chances for your method, there) - but I'd be glad if you could tell us about the results after you are done.

##### Share on other sites
wizardpc    282

Opps, Yea my math was off.  I will try your method when I get home off work tonight.

##### Share on other sites
OneEyed    143

I tried building an octree using a 1D array, but as Julian said, memory problems arise with depths of 8 or more.

I'm curious as to how Julian was indexing his 1D array of octants?  My implementation was breaking a 32-bit integer into groups of 3-bits that represented each depth, which maxed out at 10 depths with about 4GB of memory minimum with no data.

Key Format:
32-bit Unsigned Integer Max Octree Size = 10 depths (8 octants per depth = 2^3 = 3 bits)

00           000 000 000 000     000
(unused) ... (0) (1) (2) (3) ... (N)

[N = largest octree, max = 10]
[0 = smallest octree]

3-bit Octree position of octants:

TOP                BOTTOM
+-----------+    +-----------+
| 011 | 001 |    | 111 | 101 |
|-----+-----|    |-----+-----|
| 010 | 000 |    | 110 | 100 |
+-----------+    +-----------+


I found that indexing whether with 1D, 2D, or 3D arrays will still have to traverse exactly like octrees with nodes or it wouldn't be an Octree.  The natural indexing method of octrees is by moving through depths.  "Loose" octree would only be viable if one could calculate the index in one go, rather than a recursive nature.   It is possible you will save 1 to 5 instructions using a 1D array for insertion, but space is just as valuable as runtime and as a standard practice you should never be maxing out the users memory.

I really wished 1D octree would of worked, but my findings have shown that I rather have a 20-30 depth octree with little memory impact!

Edited by OneEyed

##### Share on other sites
Lassini    101

it is already fast, 3 ms for 1.000.000 object insertion

Are you sure about these numbers? They look incredibly high. It's like ~12 cpu cycles per object. Can you give more details about your benchmark? It's really interesting.

##### Share on other sites
Juliean    7068

Are you sure about these numbers? They look incredibly high. It's like ~12 cpu cycles per object. Can you give more details about your benchmark? It's really interesting.

Uh, I'm not so sure, seeing what I wrote in another thread, I probably meant 30 ms, which should sound more reasonable. Here you can also find the implementation of my octree. As I said there, the "max extents" function for the AABB-class is unoptimized, this makes the insert time finally be about 20 ms, for 1 mio objects.

I'm curious as to how Julian was indexing his 1D array of octants? My implementation was breaking a 32-bit integer into groups of 3-bits that represented each depth, which maxed out at 10 depths with about 4GB of memory minimum with no data.

I didn't use the 1D-array approach at all, as you can see in the thread I linked to above, I have each node store its child nodes. 10 depths is completely unreasonable IMHO, like you said it will take about 4 GB of RAM just for the array of 8^10 entries. Such a tree would subdivide an area of 1 km size into 0,97 m sized cubes on the lowest level. You aren't likely going to need that much precision, or that large of a scene anyway. Also, with such a high depth, the tree will easily become a bottleneck, since you have to traverse all 10 levels. So you should probably don't go deeper than 6 levels.

##### Share on other sites
Lassini    101
Juliean, you're doing insertion into std::map on every InsertData() method call in your implementation. That alone will severely limit performance of your octree insertions. In fact, putting 1000000 sequential integers (from loop counter) to std::map<int, int> takes about 220 ms on my i5 2400 (both in VS2012 and MinGW, release builds under 32-bit Win 7). And putting 1000000 random integers (as keys and values, from array that is prepared beforehand) takes about 500 ms. So i'm very interested about how you got your numbers.