You are not quite off-track, but it is probably a little too complicated. The idea with calculating the index is pretty good, but I'd advice you to rather have a look at how my implementation handles this. Recursion is inevadable, if you wan't your tree to have optimizations like "don't add a node to the 5th depth, if even the first child node doesn't have any other objects" (in this case you would want to insert the node in the 1st depth, of course).
//acccess volume and octree max size
float volumeSize = volume.MaxExtents(), treeSize = m_size;
int depth = 0;
//until node box is smaller than the volume
while(treeSize > volumeSize)
{
treeSize /= 2;
depth += 1;
//break on max depth
if(depth > Node<T>::MAX_DEPTH)
break;
}
//adjust depth and size
depth -= 1;
treeSize *= 2;
This calculates the desired depth. We have the roots size, so by deviding this size by 2 repeadately, we can determine the maximum depth the child may go. But as it would be contra-productive to, as said before, insert a tiny node into the 6th depth if e.g. the fourth parent of that node doesn't even have any nodes (nor their children), we recursively determine the indizes for x, y and z - just instead only for the final depth, for each depth from the top down:
for(unsigned int i = 0; i != depth; i++)
{
//pick node if it isn't split
if(!pNode->IsSplit())
break;
unsigned int currentDepth = depth - i;
//calculate position at current depth
//todo: check for invalid positions
unsigned int currentDepthX = x >> currentDepth;
unsigned int currentDepthY = y >> currentDepth;
unsigned int currentDepthZ = z >> currentDepth;
//calculate childs index
unsigned int childIndex = currentDepthX + (currentDepthZ << 1) + (currentDepthY << 2);
//get new node
pNode = pNode->Children()[childIndex];
//modify absolut positions to new child
x -= currentDepthX << currentDepth;
y -= currentDepthY << currentDepth;
z -= currentDepthZ << currentDepth;
}
Note how this is almost brancheless - and we don't need to compare any boxes etc.. at all. Now there where only two possible ways of optimization that would come from your idea (as far as I can see):
- if you drop the " pNode->IsSplit() " - check, you might indeed save some work, namely the whole recursive calculation - you can just calculate the deepest index right away. But this comes at the cost of very deep tree hierachies, specially for branches with very few, small objects. This will potentially lead to a far worse culling time - since you need to do many redundand checks for a loose octree anyway, this will quickly outweight the slighly increased insert time.
- Otherwise, such a singe array would only save the "pNode->Children()" call. The compiler will almost certainly inline this, so this doesn't matter anyway. If you insert a whole lot of objects in a row, this might actually be better for the cache - but thats just assumptions.
That being said, I think if you have an implementation similar to mine, you aren't already too off-track. Think about it. 3 ms for insertion one million objects, increasing only linearly - unless you want to insert 10 million objects per second at 60+ FPS, this should be pretty sufficient.