Object databases and data structures for concurrency

4 comments, last by intransigent-seal 13 years, 6 months ago
I'm working on a computer vision application and I'm trying to put the central data structure together. This is not totally trivial, because the data structure must be designed for concurrent access.

My expected use pattern has a lot in common with requirements in games:
- A UI thread that renders the current state of the data
- A tracking thread that updates a camera pose estimate in real time (and needs read-only access to the data state; could be analogous to running AI in another thread)
- A mapping thread that performs structural updates to the data (analogous to a thread running main game logic)
- An optimisation thread that runs an expensive non-linear optimisation process on the data to refine it (analogous to running physics simulation in another thread, except that unlike game physics it probably won't run 'at framerate'; in terms of data access, it needs to update values in bulk, but doesn't need to change the 'structure' of the data by adding or removing known entities or their relationships).

The three routes that I can think of for designing the management of this are:
1- A complex ad-hoc design direct from the best guess I can come up with for my requirements.
2- A simple design using a single multi-reader-single-writer lock on one shared copy of the data, and more care in the other parts of the design (eg, thread communication and buffering updates) to try to reduce contention on the lock.
3- First create a generic in-memory object database library, with snapshotting capability and probably object-level locking, and then use that.

Of the three, the first is quite undesirable (for fairly obvious reasons); the second puts me off because I'm concerned that I won't be able to effectively reduce lock contention without a large mess of hacks (putting it close to the first in complexity and debugging time); and the third puts me off because it feels like overkill.
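For what it's worth, option 2 is only a few lines with C++17's std::shared_mutex: readers (tracking, rendering) take a shared lock, the single writer takes an exclusive one. This is just a sketch; the MapData type and its contents are placeholders, not anything from the actual application:

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Placeholder for the real map data, which is application-specific.
struct MapData {
    std::vector<double> poses;
};

class SharedMap {
    mutable std::shared_mutex mutex_;
    MapData data_;
public:
    // Readers (tracking / rendering threads) take a shared lock,
    // so any number of them can read concurrently.
    std::size_t poseCount() const {
        std::shared_lock lock(mutex_);
        return data_.poses.size();
    }
    // The single writer (mapping thread) takes an exclusive lock.
    void addPose(double p) {
        std::unique_lock lock(mutex_);
        data_.poses.push_back(p);
    }
};
```

The contention worry is exactly what this sketch doesn't solve: every structural update blocks all readers for its duration, which is why the buffering/communication care mentioned above would still be needed.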

So, I'm looking for tips from anyone who has tackled this kind of design problem before.
Alternatively, recommendations for existing in-memory object database libraries that I could use would be welcome (object databases exist, but I haven't found one that is free and for C++).

John B
The best thing about the internet is the way people with no experience or qualifications can pretend to be completely superior to other people who have no experience or qualifications.
Why not try breaking each of the four large processes down into discrete steps that can work on chunks of the data? This is the route a lot of games take nowadays to distribute work across cores, and it's normally referred to as a task-based architecture.

This removes the need for locks around massive chunks of memory and allows a much higher degree of parallelization (the design you describe will max out at four cores, whereas a task architecture scales much better).

The basic idea would be that you run the high-level tasks synchronously (rendering, tracking, mapping, optimisation), but each task creates a whole bunch of jobs, each designed to work on a small part of the data. These jobs are picked up by worker threads running on each core. You just need to ensure each job operates independently of the others.

The sync points are then after each high level task and they simply wait for all the jobs to be completed.
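A minimal single-task sketch of that split-then-sync pattern, using std::async as a stand-in for a proper worker pool (the sumChunks function and its chunking are illustrative, not from the post):

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// A high-level "task" splits its data into independent jobs, runs them
// on worker threads, then hits the sync point by waiting on all of them.
// A real system would reuse a persistent thread pool instead of
// launching a thread per job.
double sumChunks(const std::vector<double>& data, std::size_t jobs) {
    std::vector<std::future<double>> futures;
    const std::size_t chunk = data.size() / jobs;
    for (std::size_t j = 0; j < jobs; ++j) {
        const std::size_t begin = j * chunk;
        const std::size_t end = (j + 1 == jobs) ? data.size() : begin + chunk;
        // Each job reads only its own disjoint range, so no locking is needed.
        futures.push_back(std::async(std::launch::async, [&data, begin, end] {
            return std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
        }));
    }
    // Sync point: wait for every job to complete before combining results.
    double total = 0.0;
    for (auto& f : futures) total += f.get();
    return total;
}
```

The summation stands in for whatever per-chunk work a real task would do (feature matching, residual evaluation, etc.); the shape of the code is the point.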
Hmmm... that might work. Presumably you mean the overseer thread has the definitive copy of the data, and packs copies (of parts) of that data into task objects, so the tasks themselves don't even touch the definitive data?

That's certainly possible for optimisation (which involves building a copy of the data into a large sparse matrix form anyway), and should work ok for several of the core vision tasks like feature matching, new feature detection, pose estimation and so on.

Rendering could still be slightly tricky, since the visualisation is likely to touch most of the data, but I think that could be handled with some care.

Sounds like, at its core, this is just an inversion of control compared with what I had in mind. My original plan had each task doing its thing under its own control, requiring effective encapsulation and management in the data structure to ensure correct and efficient concurrent access; this instead makes the access management the 'active' part, while the data-processing work is done by dumb servers. That inversion sounds like a useful trick to remember.

I shall have to think about its wider effects a bit -- more suggestions or comments are welcome.

John B
class Blob {
  // sends a change or control message to be applied
  void push(Object);

  // updates one logical time step of this blob;
  // usually creates a completely new internal state,
  // applying changes and processing data
  void process();

  // a system subscribes to changes made by this blob;
  // Type indicates whether to receive every delta produced by process()
  // or just a signal that process() has completed one step
  void subscribe(BlobSubscription, Type);

  int getVersion();

  // well...
  Object* getLatestState();
  Object* getVersionedState(int version);

  // indicates that one subscriber has consumed the specified version,
  // or that it has chosen to ignore everything but the most recent state;
  // the blob uses this to remove states that have been consumed by all systems
  void acknowledge(BlobSubscription, int version, bool all);
};

class BlobSubscription {
  // signals that the Blob has advanced to the given version
  void notify(Blob, int version);

  // delta state produced by the blob
  void delta(Blob, List<Object>, int version);
};


This is a rough design for a data-centric, fully versioned concurrent system. Systems (which may be blobs themselves) subscribe to concrete data blobs. There is no need for generic types; blobs can be concrete: an array or similar.

Each blob is inherently versioned. It should keep all necessary past versions of its state until all subscribers have consumed them.

A system that wants to maintain a local copy of another system's state, possibly modified or otherwise processed, will subscribe to deltas. The renderer will likely subscribe only to the latest state: if it completes a frame and hasn't received notification of a new version, it simply waits around for a bit, doing nothing.

Alternatively, the renderer could subscribe to delta changes and maintain its own scene graph or some spatial partition for picking or similar.


The usual gotchas:
- Logical UI response rate (not FPS, but the time from intent to execution) is limited by whatever dependencies exist between systems. Since input will be push()ed to some lower-level Blob, it will need to go through process() in all dependent systems. Using a concept of priority, or separating commands from data actions, can minimize this delay. This can be a big deal, even a showstopper.

- Dependencies can be used to build a DAG via topological sorting, which is then advanced synchronously. Some threads will likely still run independently (renderer, input).
- If some system falls too far behind, the backlog of past states may start accumulating. There needs to be a way for a blob or system to suspend processing if this happens.
- Each blob requires at least two buffers plus a delta log, so at least double the memory.
- Dependent blobs can be composited into a new one, which then advances predictably (calls process() on all contained blobs).
- There is no need for abstract types; each blob exposes its state and push() in the problem domain, using actual data structures.
- There are no locks as such: internal buffer flips can be done atomically, subscriptions are not time-critical, data changes are versioned, and data access between blobs is inherently shared-nothing or read-only.
- This design scales with bulk data. The bigger the chunks published to push(), the better. This is no problem: inside process(), all changes are accumulated for the current version (a completely thread-local operation) and then published once per time step.
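To make the push()/process()/acknowledge() cycle concrete, here is a deliberately single-threaded toy version of one blob over a vector of ints; the class name and the version-pruning policy are my own illustration, not code from the post:

```cpp
#include <map>
#include <vector>

// Toy versioned blob: push() accumulates pending changes, process()
// publishes them as a new immutable version, and past versions stay
// readable until subscribers acknowledge them.
class VersionedBlob {
    std::map<int, std::vector<int>> versions_ { {0, {}} };
    std::vector<int> pending_;   // changes accumulated for the next step
    int version_ = 0;
public:
    void push(int change) { pending_.push_back(change); }

    // One logical time step: apply pending changes to a copy of the
    // current state and publish it under a new version number.
    int process() {
        std::vector<int> next = versions_.at(version_);
        next.insert(next.end(), pending_.begin(), pending_.end());
        pending_.clear();
        versions_[++version_] = std::move(next);
        return version_;
    }

    const std::vector<int>& state(int version) const {
        return versions_.at(version);
    }

    // Subscribers report the oldest version they still need;
    // everything older can be discarded.
    void acknowledge(int oldestNeeded) {
        versions_.erase(versions_.begin(), versions_.lower_bound(oldestNeeded));
    }

    int version() const { return version_; }
};
```

In the real design, process() would run on the blob's own thread and the buffer flip to the new version would be the only atomic operation; the map of retained versions is what lets slow subscribers read a consistent old state without locks.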

It's all about data. One chunk of data is transformed into another. So the key is to recognize which distinct blobs there are and how they interact. And there is nothing saying that a blob may not be an SQL database, a scene graph or just a plain array, as long as accessing its past versions is a (relatively) cheap operation.

The primary use of such a design is when there is a lot of bulk processing to be done, but only relatively small parts are relevant to user interaction.
If you want to pack the data into the jobs then great - that'll probably work well in a distributed-memory multiprocessing system!

The overseer thread, as you call it, does not have to copy the data, though. You could keep the data in memory as read-only input to all the jobs, and use a second buffer for the outputs from the jobs. You would effectively assign each job a memory range for output, while it has full access to the read-only data as input. These buffers can flip-flop for each high-level task.
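The flip-flop itself is tiny; a sketch over two float buffers (the struct name and float payload are arbitrary choices for illustration):

```cpp
#include <cstddef>
#include <vector>

// Two buffers whose roles swap at each sync point: jobs read from one
// buffer and write into disjoint ranges of the other, so no job ever
// needs a lock. flip() is called once per high-level task, between jobs.
struct DoubleBuffer {
    std::vector<float> a, b;
    bool flipped = false;

    explicit DoubleBuffer(std::size_t n) : a(n), b(n) {}

    const std::vector<float>& input() const { return flipped ? b : a; }
    std::vector<float>& output() { return flipped ? a : b; }

    void flip() { flipped = !flipped; }  // done at the sync point only
};
```

After flip(), last step's output becomes this step's read-only input, which is exactly the flip-flop described above.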

This means the rendering should fit right into this scheme. You can parallelize that step as well, but the method depends on the target graphics hardware.

Alternatively, at a given sync point, just copy the data into a new rendering buffer and let the renderer work on that. This does introduce more latency, though.
Noggs: Right, the workers could get references to the original data, but then the overseer has to ensure it doesn't change that data while a worker is still processing it, which may be easy in some places and hard in others. The details can be fiddled with, though, I'm sure.

Antheus: I'm not quite clear on the relationship between process() and push(); process() updates the state of the Blob from the inside, and push() updates it from the outside? So, is it just that some concrete Blob types will have push()-type methods and others will have process()-type methods? I don't see when you'd have both...?

[I'm off for the night now, btw]

John B

