Seeking Advice for Scripting Objects with Data Oriented Design

Started by
0 comments, last by SeanMiddleditch 10 years, 2 months ago
Ordinarily, when performing game logic on instances of a generic Entity class, after spacial partitioning and assembling a list of nearby, relevant entities, I would simply toss the list of entities and the relevant entities into a script (usually lua) and do something like: [source] --Below: A wolf hunts for rabbits to eat TYPES = { WOLF=0,RABBIT=1} ... function UpdateEntities(entities_to_update,relevant_entities) for entity in entities_to_update do if GetEntityType(entity)==TYPES.WOLF then DoWolfAi(entity,relevant_entities); end end end ACTIONS = {HUNT=0,FLEE=1,PRAY=1029}; function DoWolfAi(entity,relevant_entities) --Below: look for food local actions = {}; for other in relevant_entities do if other~=entity then if GetEntityType(other)==TYPES.RABBIT then table.insert(a,{action=ACTIONS.HUNT,target=other,priority=GetEntityHunger(entity)*GetDistanceBetween(entity,other)}); end ... end end --Ommited: sort actions by member.priority --Ommited: pick top action (hunt) and do hunt function end [/source] It's pretty straightforward, and works just fine for a small number of instances. In fact, it's served me well for the admitedly hackish projects I've thrown together. But, I'm moving on to larger projects and running into issues of scalability. After getting a massive wakeup call to DoD (Data Oriented Design) and things like cache misses (which didn't matter so much when it was 50 or so objects with simple scripts, but really matter when it's 100k complex objects), I've come to realize that my old way of doing things is grossly insufficient. The way this handles data seems attrocious, with what I know now. First there's all those "GetBlahBlah(entity)" calls, which usually refer back to an map (std::map or boost::container::flat_map) to fetch data from a generic Entity class. Each one is a lot of overhead, because it has to look up the map to find the Entity, then find Entity.BlahBlah, then pass it all the way back to the script. I shudder to think what's happening to the cache during all this. That wont work for hundreds or worse thousands of objects. It's also hard to envision a model that works in a cache-friendly manner. Each "wolf" entity needs to fetch data from every nearby "rabbit" entity (namely its location and type), and thus it's dependent on other instances. I also wont know which wolf will need which rabbit when the wolf and rabbit are allocated, so even though flat_map has contiguous memory, I'm forced to use random access rather than sequential (read: cache friendly). My current project has a bit of a challenge mixed in: It has to run decently on a specific machine with an Intel Atom N570 (read: crap). 4 cores, 1.42ghz, ancient technology. That means I have to dance carefully. So, are there any examples of a DoD (Data Oriented Design) approach to the same issue? Every book, every paper, every article I've read uses something more complex but fundamentally similar to the above. From gamedev to gamasutra, very little seems to use Data Oriented Design. I could really use some reference material, or just informative discussion. Thank you. Note: this forum (and only this forum) likes to erase whitspaces and newlines. Not sure why, and it seems unique to me. If it's illegible I appologize, I'll try to edit and fix it but the edits often times erase more...
Advertisement

Most scripting languages and DoD don't mix really well. The typical thing to do when you start hitting performance problems is to just move the code out of scripts and into native code. You have much more control over how you iterate over data there.

Note that part of your problem may just be the calls from Lua back into C++ and vice versa. Using the `std::map` can be a lot less of a performance problem than the chain of C/C++ calls that go into the process of Lua calling a bound method, the C++ code getting invoked, the result being marshalled into the Lua stack, and then the native code returning back down the call chain into the Lua interpreter. A common fix is to use more bulk methods. Have one native method that grabs _all_ the info the script needs rather than a bunch of individual calls. You can do even better by having the C++ code collect all that data before even invoking the script so that there's a single C++-to-Lua transition at the start of the logic update and nothing else.

Lua is not an optimal language (nor are most other off-the-shelf scripting languages). Even when you get rid of that C++ `std::map`, the resulting data is likely going to be stored in a Lua table which is a key-value hash table and still not nearly efficient to access as a bunch of POD structs in an array in C/C++. Getting a language like Lua (or Java or Python or so on) to be nicely data-oriented often require really mangling your code into very non-idiomatic forms to use nothing but low-level primitives like ints and arrays (and no objects, or non-array tables in Lua).

You're just better off doing it in C++. Collect all your data upfront into a nicely packed contiguous array of structs. Iterate over it and process it.

For the stuff like your rabbits and wolves, there's no reason I can think of off the top of my head that you can't have a flat_multimap (or something similar) that maps all wolves to the rabbits they care about. This can be updated as entities move so static entities or those which don't move far (and don't need to trigger range checks) have no additional runtime overhead imposed on this structure. When things do move enough to need a change, you're likely best off recording the changes necessary into another ordered buffer, and then applying all the changes in bulk in one pass. So long as changes are relatively rare it means you're only manipulating (inserting/removing) elements from one small array frequently; you can use a "double buffer" on your main data to make alterations on it faster.


for each rabbit in rabbits:
  move()
  if moved enough:
    for each wolf it was close to:
      changes.insert(remove, wolf, rabbit)
    for each wolf now nearby:
      changes.insert(add, wolf, rabbit)
for each wolf in wolves:
  same stuff as above

merge_changes_into(wolves_rabbits_current, changes, wolves_rabbits_next)
swap(wolves_rabbits_current, wolves_rabbits_next)

for each wolf, rabbit in wolves_rabbits_current
  do game logic

The `merge_changes_into` can be a little complex to grok at first but it's a basic concept. Keep an iterator pointing to both the input flat map and the changes flat map. For each input, if it's equal to or greater than the next change, apply the change (either insert the change into the output flat map or skip over the input element and output nothing), otherwise just write the input to the output. Keep both the input and output around (just swap them every time) to avoid the need to reallocate a new memory buffer each frame; this is a double-buffering technique.

Something like the following totally untested algorithm:


merge (input, changes, output):
  curInput = input.begin()
  curChange = changes.begin()

  while curInput is not at end
    while curChange.pair <= curInput:
      if curChange.type == add:
        # there's something new to insert at or before this input item
        output.append(curChange.pair)
        curChange.next()
    if curInput == curChange.pair and curChange.type == remove:
      # we need to remove the current item, so just don't add it to the output
      curChange.next()
      curInput.next()
    else:
      # the current item has not changed, copy to output
      output.append(curInput)
      curInput.next()
    
  # add the remaining changes that sort after our current input  
  while curChange:
    if curChange.type == add:
      output.append(curChange.pair)

You might get better performance with two changes array; one for adds and one for removes. You'd have to try it to know for sure, but it would reduce the size of the change data (more data per cache line) and remove some of the branches in those loops.

The final trick then is keeping your wolves and rabbits sorted. You don't need to search the whole wolves_to_rabbits array; you can skip all the parts in it that you've already used when processing wolves earlier in your wolf AI loop. Depending on how many holes the wolves_to_rabbits array is likely to have, you might even be better off with linear searches through it than a binary search, but you'd have to try it and profile the results to be sure.

Sean Middleditch – Game Systems Engineer – Join my team!

This topic is closed to new replies.

Advertisement