the main difference is that an oct-tree divides the space into 8 regions in each step, so requires less recursion, but has to do a few more checks at each step (since there are 8 possible regions to step into). dividing the space in a binary-fashion allows simpler logic, but a larger number of internal nodes.
That's not necessarily true, you can implement an "oct-tree" with a b-tree under the hood; your recursion step should cycle over each axis. Thus the number of recursive steps would be similar in complexity to a BSP-tree.
But I think my question was poorly worded. Simply put, I'm curious about if it's worth it to "choose" a splitting plane versus naively cutting it down some midpoint in the dynamic real-time situation you were describing. The obvious difference is that choosing the latter would be O(1), and the former would be (presumably) at least O(N). That of course would pay off as searching would be faster. If the tree were generated before-hand and kept static, then of course it'd be better to load the cost up front, but in the case of a real-time dynamically generated tree, I'm wondering if, generally, such an approach is still worth it.
It's a bit of a subjective question and it begs more for intuition/experience than a purely academic response, I suppose.
It's not about memory it's about how much CPU time doing a lot of ray casting takes really.
I'm wondering if there's a subtle point here that you're also describing. If you're just talking about memory allocations, then of course it's almost never an issue, but isn't memory bandwidth is a large bottleneck for speed these days? I don't work in the games industry, so I'm not aware of the current state, but isn't it a challenge to maintain cache coherency in collision detection systems, too? Or is cache coherency kind of a solved issue at this point?

Find content
Not Telling