Quote:Original post by TayZrain
...consider the possibility of letting one core handle the graphical interpretations of what the game has stored in memory, while letting another core handle the act of running that world.
This approach (while better than single-threaded) is still not scalable. You're designing for a fixed number of cores instead of for any number of cores.
Quote:Original post by Kylotan
I hope you don't mind me questioning a few things about this system, Hodgman.
Not at all. Discussing my thoughts out loud helps me solidify them, making it easier to explain to other people later ;) (I guess that makes you my explanation-lab-rat, sorry!)
Quote:Why the arbitrary number of states? Just for the FPS networking model?
I'm writing this as a reusable collection of classes in my engine. I want to be able to use it for different kinds of games, so I'm using policy-based templates to control how states are stored. I.e. the number of states is arbitrary at compile time, and can be arbitrary or fixed at runtime.
There are various (and often unforeseen) reasons why you might require a different fixed or variable number of states, so I'm trying to design my classes with this in mind.
For example: in some cases only two states are required for the update cycle to run (the older of the two is simply recycled each update). However, if the renderer and the update cycle run at different frequencies, then a third state object is required: for as long as the renderer is running, the state objects it was told to use are essentially under a read-only lock and cannot be (safely) recycled by the update loop.
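As a rough illustration of why the third state helps, here's a hypothetical triple-buffer sketch. The class and method names (StateBuffer, acquireWrite, etc.) are my own for illustration, not from the engine being described; a real version would also need atomic index swaps if the renderer runs on another thread.

```cpp
#include <array>

struct State { int frame = 0; };

// Three slots: one being read by the renderer, one holding the latest
// completed update, and one free for the update loop to overwrite.
class StateBuffer {
public:
    // Returns a slot the update loop may overwrite: never the slot the
    // renderer is reading, and never the most recently completed one.
    State& acquireWrite() {
        for (int i = 0; i < 3; ++i)
            if (i != renderIdx && i != latestIdx) { writeIdx = i; break; }
        return states[writeIdx];
    }
    void commit() { latestIdx = writeIdx; }  // update finished this slot

    // Renderer pins the latest completed state for the whole frame.
    const State& acquireRead() { renderIdx = latestIdx; return states[renderIdx]; }
private:
    std::array<State, 3> states{};
    int latestIdx = 0, renderIdx = -1, writeIdx = -1;
};
```

With only two slots, acquireWrite would have nowhere to go while the renderer holds one slot and the other holds the last completed state.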
Quote:Quote:As well as maintaining states, each entity maintains two pending function-call queues (named Read/Write). Any non-const functions calls are not immediately executed, and are enqueued in the 'W' queue to be executed on the next update cycle (at which time a new W state is created for them to be executed on). Const functions on the other hand are executed immediately on the latest R state.
The first problem I see here is that there are many operations that require reads of other entities in order to write to another entity or entities. This doesn't fit a clean read/write divide - a write is dependent on one or more reads, often to be executed sequentially. How does your system handle it?
Reads are done on the previously completed state objects, while writes are done on new thread-private state objects.
R and W probably weren't the best names for the func-call queues... Let's just call them A and B. Also, over the weekend I've implemented wodinoneeye's suggestion for removing contention from these queues. So let me explain again:
There are 2 groups of function-call queues, A and B. Each group contains one queue per thread, so there is no need to lock these queues. At any one time, one of these groups is public (previously: W) and one is private (previously: R). While public, a queue group can be filled by any entity/thread (each thread pushes into its own queue within the group only!). While private, a queue group is only accessible by its parent entity and by the thread that is responsible for updating that entity.
At the prepare-commit stage (a serial stage), all entities flip A and B. If A was previously public, then it may contain pending function calls. If A was previously private, it will be empty (because it was consumed during the previous commit). So now the public queue group is empty (ready to collect changes generated by the coming commit stage) and the private queue group contains all of the changes requested during the previous commit stage.
Also, if a new thread-private state object was created on the previous commit, then this state object is made public as the current read-state of the entity.
During the commit stage (a parallel stage, up to one thread per entity), each entity generates a new thread-private state object and iterates through its private queue group, executing all of the function calls contained within on the new private state object.
This in turn will fill up the public queue groups of itself/other entities.
If entity X attempts to read entity Y (i.e. call a const function on it), that function call is executed on the last publicly known state object associated with entity Y.
If entity X attempts to modify entity Y (i.e. call a function on it), that function call is put into entity Y's public queue, to be executed *next* commit stage.
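The flip/commit machinery above can be sketched minimally in C++. This is my own illustrative code under simplifying assumptions (a fixed thread count, std::function calls, one entity updated by one thread); in real use, commit would run across a thread pool, one thread per entity.

```cpp
#include <vector>
#include <deque>
#include <functional>

struct EntityState { int hp = 100; };

// Hypothetical entity with two groups (A/B) of per-thread call queues.
class Entity {
public:
    explicit Entity(int numThreads)
        : groups{ Group(numThreads), Group(numThreads) } {}

    // Any thread may enqueue a modification into the *public* group,
    // but only into its own per-thread queue, so no locking is needed.
    void enqueue(int threadId, std::function<void(EntityState&)> call) {
        groups[publicIdx][threadId].push_back(std::move(call));
    }

    // Serial prepare-commit stage: flip which group is public.
    void flip() { publicIdx ^= 1; }

    // Parallel commit stage: a single thread drains the now-private
    // group onto a fresh copy of the state.
    void commit() {
        EntityState next = current;            // "clone" the last state
        for (auto& q : groups[publicIdx ^ 1]) {
            for (auto& call : q) call(next);
            q.clear();                         // empty for the next flip
        }
        current = next;                        // new public read-state
    }

    const EntityState& read() const { return current; }
private:
    using Group = std::vector<std::deque<std::function<void(EntityState&)>>>;
    Group groups[2];
    int publicIdx = 0;
    EntityState current;
};
```

Note how reads against `read()` during a commit would still see the previous state; the writes only become visible after `commit()` replaces `current`.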
Quote:Quote:At the pre-update stage (a serial stage), each entity that has been modified (i.e. has data in its 'W' queue) is told to swap over its read/write queues (i.e. the full W queue becomes the new R, and the empty R becomes the new W).
I don't understand why you are doing this. Isn't data in the W queue evidence that it Will be modified, rather than it Has been modified? Why are non-const functions now on what you call the Read queue? I'm afraid I really don't understand this part.
Yes, it is evidence that it is about to be modified. I probably worded it that way because internally I'm using an 'isModified' variable to describe this...
Non-const calls never go on a separate "read" queue - the badly-named Read queue is just a thread-private area of memory, not a separate queue for read-functions. While the public (W) queue group is filled by n threads, the private (R) queue group is consumed by a single thread.
At pre-commit the public (W) queue contains all requests made during the previous commit. I want to execute this during the next commit, but the next commit will also re-fill the public (W) queue.
Therefore, before the commit stage, I move the contents of the public (W) queue into a private (R) area (making the public queue empty).
Quote:Are you saying that the 'R' queue isn't really an 'R' queue at this point? And how can it query other entities - which state does it access? And isn't 2 queues per entity potentially quite a bit of overhead? I assume these are lockless...
Yeah, I hope I've explained it better this time. R is just the private area where W is moved to before W needs to be used again.
On my quad-core, it's now at 8 queues per entity! Two groups (A/B) of one queue per thread. Because each thread has its own queue, the queues don't require locks, and don't even need to be fancy lock-free ones: you can use a std::deque and the algorithm is wait-free and thread-safe.
Quote:Where does rendering fit into this?
After this whole update cycle runs, the game either runs it again (if it's not time to render yet, or the renderer is still running), or it starts the renderer (passing the current frame-id token so the appropriate state objects can be accessed) and then runs again.
Quote:Original post by Kylotan
Quote:Original post by Antheus
Quote:Question 2: Doesn't this dictate that the whole system is limited by the speed of the slowest/most busy entity?
In this way it's superior to single-threaded model, where update speed is limited by sum of all entities.
Sure, but the worst-case complexity is about the same, and I'd assume that entities would vary quite widely in this regard. I suppose it wouldn't matter too much if entities were distributed across processors in a reasonable manner, but any sort of system which has to synchronise all entities that regularly is going to be less efficient than a system which synchronises less often.
I don't think it's a problem that the worst-case scenario is just as slow as traditional methods. Most games are going to have hundreds of small entities, not one giant one, and in these average cases this threading system will probably perform much better.
There is no real entity synchronisation though, except for the pre-update stage, which must be done serially. The commit stage that follows is entirely parallel, with no synchronisation used at all.
Quote:Original post by JPatrick
Implementing this is still a bit tricky though, because detecting request collisions is subtly complex. Let's say we have requests A, B, and C again. A is non-commutative with B, but is commutative with C. Request B however is non-commutative with both A and C. If A and B are in the queue first, we'd detect their incompatibility and nuke both of them, leaving C which is fine by itself. However, if C and B were first, they'd both be nuked, leaving A. This requires a recursive walking of the queue to detect the full chain of non-commutativeness, which strikes me as inefficient, but then again, profiling would be required to be sure, and pathological cases will exist for any system.
Hmm... I hadn't actually thought about the situation where A and B are non-commutative; I had only thought about cases where A1 conflicts with A2, or B1 with B2.
Would it be possible to write code so that all A/B conflicts can be reduced to an A1/A2 conflict instead?
It would be much easier to manage if each function-call-type only had to be checked against others of the same type, and not against function-calls of all types.
For example:
A = TakeFireDamage( Player& from, int damage );
B = TakeColdDamage( Player& from, int damage );
In this case, A and B can conflict, but if we re-write it like this:
A = TakeFireDamage( Player& from, int dmg )
-> calls C( from, dmg*getFireResist() );
B = TakeColdDamage( Player& from, int dmg )
-> calls C( from, dmg*getColdResist() );
C = TakeDamage( Player& from, int damage );
Now A and B no longer conflict, and only C can conflict with itself.
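The refactoring above can be sketched directly. This is my own toy version: the resist values, the percentage scaling, and dropping the `from` parameter are all simplifications for illustration.

```cpp
// Both typed damage calls are reduced to a single TakeDamage call (the
// "C" function above), so only TakeDamage/TakeDamage conflicts remain,
// and those can be resolved because TakeDamage commutes with itself.
class Player {
public:
    void TakeDamage(int damage)  { hp -= damage; }
    void TakeFireDamage(int dmg) { TakeDamage(dmg * getFireResist() / 100); }
    void TakeColdDamage(int dmg) { TakeDamage(dmg * getColdResist() / 100); }
    int  getHp() const           { return hp; }
private:
    // Hypothetical resists: percentage of incoming damage actually taken.
    int getFireResist() const { return 50; }
    int getColdResist() const { return 25; }
    int hp = 100;
};
```

One caveat: the resist lookups are reads, so in the queued model they'd be evaluated against the last public state at enqueue time, not at commit time - which may or may not be the semantics you want.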
Can you come up with any examples that can't be reduced down like this?
Quote:I stil think cloning can be avoided, even with more than 2 stored states by using a rolling buffers approach. Let's say you wanted 30 back-states stored. You could have a circular list of 30 state buffers. Let's say buffer 0 is the initial game state, making buffer 1 the current "write" state. Once that loop is finished, buffer 2 becomes the new write state, leaving buffer 1 as the read state, and buffer 0 still intact as a previous state.
Yeah, when I said "clone" I didn't mean to imply a particular storage scheme behind the scenes - I'd still call your ring-buffer approach "cloning", because each time the write-cursor is incremented, don't you still have to copy buffer[oldWrite] into buffer[newWrite]?
[edit]Oh I see - if the buffer only stores states for a single variable, then no "cloning" is needed, because a new state is only created when writing to the buffer.
In my system, where I buffer a whole state-object, I need to "clone" it first in order to ensure that unmodified fields are propagated through the buffer.[/edit]
Quote:Each potentially mutable value of an entity could have its own buffer ring, each entry of which could be marked with the game-loop timestamp of when that value was written, which would allow you to figure out what value was in effect, say, 17 loops ago, and prevents needless cloning of unchanged values just to fill the next buffer.
I think tracking changes for each individual value will cause more overhead than the unnecessary copies caused by tracking changes for a whole object. This is just a guess though, serious investigation and profiling is needed here ;)
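For reference, a per-value timestamped history might look something like this. It's a hypothetical sketch of JPatrick's suggestion (names and the linear backwards search are mine); the search cost per read is exactly the kind of overhead that would need profiling against whole-object cloning.

```cpp
#include <array>

// Ring of (timestamp, value) entries for ONE mutable value. A write only
// happens when the value actually changes, so unchanged values are never
// cloned; reads search backwards for the latest write at or before 'frame'.
template <typename T, int N = 30>
class ValueHistory {
public:
    void write(int frame, const T& value) {
        head = (head + 1) % N;
        entries[head] = { frame, value };
        if (count < N) ++count;
    }
    const T& readAt(int frame) const {
        for (int i = 0; i < count; ++i) {
            int idx = (head - i + N) % N;
            if (entries[idx].frame <= frame) return entries[idx].value;
        }
        // Asked for a frame older than our history: return the oldest known.
        return entries[(head - count + 1 + N) % N].value;
    }
private:
    struct Entry { int frame = 0; T value{}; };
    std::array<Entry, N> entries{};
    int head = -1, count = 0;
};
```

The trade-off in a nutshell: whole-object states pay for copying unchanged fields every update; per-value histories pay a search (and a timestamp per entry) on every read.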
[edit2] Sorry for hijacking this thread with my long posts!
[Edited by - Hodgman on June 1, 2008 9:23:34 PM]