Seeking Advice for Scripting Objects with Data Oriented Design
Most scripting languages and DoD don't mix really well. The typical thing to do when you start hitting performance problems is to just move the code out of scripts and into native code. You have much more control over how you iterate over data there.
Note that part of your problem may just be the calls from Lua back into C++ and vice versa. Using the `std::map` can be a lot less of a performance problem than the chain of C/C++ calls that go into the process of Lua calling a bound method, the C++ code getting invoked, the result being marshalled into the Lua stack, and then the native code returning back down the call chain into the Lua interpreter. A common fix is to use more bulk methods. Have one native method that grabs _all_ the info the script needs rather than a bunch of individual calls. You can do even better by having the C++ code collect all that data before even invoking the script so that there's a single C++-to-Lua transition at the start of the logic update and nothing else.
Lua is not an optimal language (nor are most other off-the-shelf scripting languages). Even when you get rid of that C++ `std::map`, the resulting data is likely going to be stored in a Lua table which is a key-value hash table and still not nearly efficient to access as a bunch of POD structs in an array in C/C++. Getting a language like Lua (or Java or Python or so on) to be nicely data-oriented often require really mangling your code into very non-idiomatic forms to use nothing but low-level primitives like ints and arrays (and no objects, or non-array tables in Lua).
You're just better off doing it in C++. Collect all your data upfront into a nicely packed contiguous array of structs. Iterate over it and process it.
For the stuff like your rabbits and wolves, there's no reason I can think of off the top of my head that you can't have a flat_multimap (or something similar) that maps all wolves to the rabbits they care about. This can be updated as entities move so static entities or those which don't move far (and don't need to trigger range checks) have no additional runtime overhead imposed on this structure. When things do move enough to need a change, you're likely best off recording the changes necessary into another ordered buffer, and then applying all the changes in bulk in one pass. So long as changes are relatively rare it means you're only manipulating (inserting/removing) elements from one small array frequently; you can use a "double buffer" on your main data to make alterations on it faster.
for each rabbit in rabbits:
move()
if moved enough:
for each wolf it was close to:
changes.insert(remove, wolf, rabbit)
for each wolf now nearby:
changes.insert(add, wolf, rabbit)
for each wolf in wolves:
same stuff as above
merge_changes_into(wolves_rabbits_current, changes, wolves_rabbits_next)
swap(wolves_rabbits_current, wolves_rabbits_next)
for each wolf, rabbit in wolves_rabbits_current
do game logic
The `merge_changes_into` can be a little complex to grok at first but it's a basic concept. Keep an iterator pointing to both the input flat map and the changes flat map. For each input, if it's equal to or greater than the next change, apply the change (either insert the change into the output flat map or skip over the input element and output nothing), otherwise just write the input to the output. Keep both the input and output around (just swap them every time) to avoid the need to reallocate a new memory buffer each frame; this is a double-buffering technique.
Something like the following totally untested algorithm:
merge (input, changes, output):
curInput = input.begin()
curChange = changes.begin()
while curInput is not at end
while curChange.pair <= curInput:
if curChange.type == add:
# there's something new to insert at or before this input item
output.append(curChange.pair)
curChange.next()
if curInput == curChange.pair and curChange.type == remove:
# we need to remove the current item, so just don't add it to the output
curChange.next()
curInput.next()
else:
# the current item has not changed, copy to output
output.append(curInput)
curInput.next()
# add the remaining changes that sort after our current input
while curChange:
if curChange.type == add:
output.append(curChange.pair)
You might get better performance with two changes array; one for adds and one for removes. You'd have to try it to know for sure, but it would reduce the size of the change data (more data per cache line) and remove some of the branches in those loops.
The final trick then is keeping your wolves and rabbits sorted. You don't need to search the whole wolves_to_rabbits array; you can skip all the parts in it that you've already used when processing wolves earlier in your wolf AI loop. Depending on how many holes the wolves_to_rabbits array is likely to have, you might even be better off with linear searches through it than a binary search, but you'd have to try it and profile the results to be sure.