Upon request: My design.
Sorry about the big brick of text. And a quick warning: I tend to confuse myself a bit when I type single posts that are this huge. If anything seems kind of strange, that is why...
My design is split up into two components with respect to games. The main reason for this is as I described regarding my design phylosophy and task parallelism: it is better to design a single communication component first, and build everything else ontop of that. So for that reason, my design consisted of a communication framework, and ontop of that is built all the game specific stuff.
COMMUNICATION FRAMEWORK [nothing game specific]:
My communication framework consists of 4 main objects/interfaces/things.
Communication Port: a communication port is the exit point for update data to be sent to other objects. There is nothing that inherits from communication port. Each communication port relates to zero or more communication channels. Updates are published from the thread holding the communication PORT, not the communication channel.
Communication Channel: this is the receiving side of updates, and each communication channel relates to zero or more communication ports. There are lots of different kinds of communication channel for dealing with the different types of updates that will be discussed later.
Task: This is a job that must be done. They are expressed as three stages, and are treated in a way very similar to pure functions. They receive input from communication channels, compute the next step based on internal data and received input, and then publish their updates through their communication ports, in that explicit order, each interval. They do NOT access any outside data in other objects, and maintain internally all state that is required to operate effectively [yes, this means a lot of redundancy of data]. Due to this explicit ordering, this means that all updates that are sent out are the result of inputs that were completely received by the time the updates were published. For this reason, there are no partial updates, and there is never an instance in which an update is published to one task with data that is different from the data that is published to a different task. Updates are published once per iteration, and all updates are published in ONE call of 'send' per communication port [relevance discussed later].
Task Scheduler: A task scheduler is an abstract interface that is initialized with either a set of one or more tasks, or a set of one or more task schedulers. A task scheduler effectively doesn't do any of the actual work, but is mostly used to provide a tagging mechanism that I can use to coordinate communication between tasks. The real importance of a task scheduler is to determine the type of communication channel that will be instantiated between the various tasks based on a set of rules so as to maximize performance. There are many types of task schedulers. Some schedulers are sequential, in that they process contained tasks from start to finish. Others are lock-step, in that they process their tasks in parallel, but do so at a uniform frequency. Some are signaled schedulers that wait for communication, some run without a fixed frequency, and so on.
One specific type of task scheduler is called the 'master' task scheduler, and contains all the others. This acts as the entry point for creation, and has functions that initialize the tree of other task schedulers, and implements the various rules by which channels are created. It also presents all the different tasks to eachother, so that they can request appropriate communication channel creation. The master task scheduler analyzes the logical topology of tasks [as specified by the structure of contained schedulers] and works out the types of synchronization support that are needed at the various channels. The master task scheduler doesn't provide any restrictions on the speed at which it's immediate tasks run
An example of the rules that I was refering to are as follows :
If Task A and Task B are both in the same sequential scheduler [in that they are run in one thread, A-B-A-B-A-B], and B requests communication from A: Create a channel that does not provide any buffering of updates [relying on the characteristic that updates are posted at the end of a task, and thus an update will not become invalid at least until the start of the task on the next iteration].
If Task A and Task B are run in lock step, then provide buffering, but do not create a channel that handles multiple iterations of A for each of B.
If Task A and Task B are run in different threads, NOT in lockstep [cycle on different frequencies], then provide a communication channel that implements an 'update merge', that squeezes an update together with a previous update, to make one big update. The purpose of this is to not allow for fast tasks to flood the memory with updates that are queued up for slow tasks [very important. Actually, without this detail, the physics task in my current project is capable of maxing my memory with updates waiting to be processed by the long-term AI strategic planning task].
If Task B is fed only by Task A, then use a channel that doesn't perform any synchronization.
.... and other rules here for other special cases
When in doubt, act conservatively: Instantiate a buffered channel w/ merged update and full synchronization.
The end result is a sort of tree structure of schedulers as tree nodes and tasks as leafs. For example, a typical render split engine with asynchronous file IO and pathfinding might look something like this:
MasterScheduler+SignaledScheduler|+FileLoader+SignaledScheduler|+Pathfinding+SequentialScheduler|+LongTermPlanning+LockstepScheduler|+SequentialScheduler||+User updates||+GameLogic||+Steering||+Physics|+SequentialScheduler||+Render
And would look pretty much exactly like that in the code as well [only with a lot of 'new this' and 'new that' statements]
After the master scheduler is created, it has 'initializeCommunicationChannels' called, that asks each component who it wants to talk to, and establishes communication accordingly. The benefit of this is that if you find the physics step takes too long in the above example, and that you can take the latency introduced by making it it's own task, you can break it off, and the only change to your code is changing:
... [stuff] ...new LockstepScheduler( new SequentialScheduler( new UserUpdate(), new GameLogic(), new Steering(), new Physics()), new SequentialScheduler(new Render()))... [stuff] ...
to:
... [stuff] ...new LockstepScheduler( new SequentialScheduler( new UserUpdate(), new GameLogic(), new Steering()), new SequentialScheduler(new Physics()), new SequentialScheduler(new Render()))... [stuff] ...
Oooooh-kay, so communication is handled and spoken for. So then it goes into the game stuff.... which isn't quite as relevant to this discussion.