Project Darkstar

Quote:Original post by hplus0603
Quote:By allowing the system to forget some recent updates when there is a crash -- but still maintaining consistency -- the system is able to provide much better latency.


To make this clear (to me): Is what you're saying that, if I for reporting reasons mirror the state of the world out to a separate database system, and there was a crash, then my reporting system would be out of sync with the actual world state?


No, that's very much an overstatement.

There is a *small* chance that this could happen, limited to the window of time between when the transaction coordinator calls your external DB service to commit its transaction and before it completes its own commit to the Data Store.

Quote:
And if I have to mirror all commits out to a reporting system anyway, why wouldn't I just use that system for my datastore/back-end? Scaling reads is easy, compared to scaling writes...


Because such an external data store would be neither fast enough nor execution-aware enough to provide the data needed on the fly to handle events.

This is the same confusion Arbitus and I resolved above. The PDS Datastore is neither designed nor intended to replace a SQL database for statistical tracking.

What it replaces is the typical in memory model of game state. By doing so it makes that game state fault-tolerant, reliable, persistent and scalable across many cooperating computers. As you yourself pointed out, scaling writes is hard and the typical duty cycle on the in-memory model is 50% read/50% write.
That is why we needed our own technology, and why we had to cut anything that burned time we didn't need in making that replacement.

The datastore's transactional nature, combined with the event execution model, means that game code that is written as if it was running on a single CPU can be farmed out across all the processors in all the boxes in the back end in a race-proof and deadlock-proof manner without any explicit synchronization on the part of the game programmer.
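
As a rough illustration (the com.sun.sgs.app classes below are the real API; the Player class, its fields, and HealPlayerTask are made-up examples, not code from any shipping game), a task in this model reads like plain single-threaded Java, and the data store detects conflicting accesses and retries the task rather than requiring locks:

```java
import java.io.Serializable;

import com.sun.sgs.app.AppContext;
import com.sun.sgs.app.ManagedObject;
import com.sun.sgs.app.ManagedReference;
import com.sun.sgs.app.Task;

// A task that runs inside a transaction; if another task touches the same
// player concurrently, one of them is aborted and retried automatically.
public class HealPlayerTask implements Task, Serializable {
    private static final long serialVersionUID = 1L;

    // Hold a reference to the managed object, not the object itself.
    private final ManagedReference<Player> playerRef;

    public HealPlayerTask(Player player) {
        playerRef = AppContext.getDataManager().createReference(player);
    }

    public void run() {
        Player player = playerRef.get();
        player.setHitPoints(player.getHitPoints() + 10);
    }
}

// Game state is just a serializable object marked as managed.
class Player implements ManagedObject, Serializable {
    private static final long serialVersionUID = 1L;
    private int hitPoints;

    public int getHitPoints() { return hitPoints; }

    public void setHitPoints(int hp) {
        AppContext.getDataManager().markForUpdate(this);
        hitPoints = hp;
    }
}
```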

This is the most important thing about the system and was my original design goal when I started the project. It allows you to write highly parallelized and fault tolerant code without any knowledge of how it is being processed.

The rest of the advantages such as fail-over and durable persistence just "fell out" of the design.

Edit: If it helps any, try this... the PDS as a technology does not compete with MySQL. Its closest competitor is probably Terracotta, though each of those technologies does things the other doesn't.

Edit2: This might help too. Looking *just* at the Datastore and the functions it provides to the rest of the system, it is most like a "tuple-space." The PDS execution model can be categorized as a "flow of objects" model, but with some very unique twists. Unfortunately the existing tuple-space products we looked at did things we didn't need and didn't do things we did, or we would have used one of them.

[Edited by - jeffpk on September 14, 2009 1:18:05 PM]
Quote:Original post by hplus0603
Quote:By allowing the system to forget some recent updates when there is a crash -- but still maintaining consistency -- the system is able to provide much better latency.


To make this clear (to me): Is what you're saying that, if I for reporting reasons mirror the state of the world out to a separate database system, and there was a crash, then my reporting system would be out of sync with the actual world state?


Yes. That would only happen if Darkstar crashed after the external update had committed but before the change was flushed asynchronously to disk for the database backing Darkstar's data service. That certainly could happen, though.

Quote:Original post by hplus0603
And if I have to mirror all commits out to a reporting system anyway, why wouldn't I just use that system for my datastore/back-end? Scaling reads is easy, compared to scaling writes...


Committing every change made in a Darkstar transaction to an external database is likely to increase the latency of the system a lot, so that probably isn't a good idea.

If you really did need/want all your Darkstar data to be stored in a relational database, I could picture you using a data service implementation that did that. I have not done the experiment, so I don't know what the performance would be like. My concern is that the relational database would produce longer transaction latencies (latency, not throughput, is the worry), but I haven't actually tried it. I don't think that building a relational implementation of the DB layer underneath the data store would be too hard to do, though.
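
Very roughly, and purely as a sketch of shape rather than the actual data store SPI (the class name, table, and methods below are invented for illustration), such an implementation boils down to mapping object ids to serialized bytes in one table, with each Darkstar transaction paying the relational engine's round-trip and commit cost, which is exactly the latency concern above:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical lowest layer of a relational implementation: one table,
// "objects(oid BIGINT PRIMARY KEY, data BLOB)", holding serialized objects.
public class JdbcObjectTable {
    private final Connection conn;

    public JdbcObjectTable(Connection conn) {
        this.conn = conn;
    }

    public byte[] read(long oid) throws SQLException {
        try (PreparedStatement ps =
                conn.prepareStatement("SELECT data FROM objects WHERE oid = ?")) {
            ps.setLong(1, oid);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }

    public void write(long oid, byte[] data) throws SQLException {
        try (PreparedStatement upd = conn.prepareStatement(
                "UPDATE objects SET data = ? WHERE oid = ?")) {
            upd.setBytes(1, data);
            upd.setLong(2, oid);
            if (upd.executeUpdate() == 0) {
                try (PreparedStatement ins = conn.prepareStatement(
                        "INSERT INTO objects (oid, data) VALUES (?, ?)")) {
                    ins.setLong(1, oid);
                    ins.setBytes(2, data);
                    ins.executeUpdate();
                }
            }
        }
    }
}
```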

- Tim
Outside of performance concerns (which are always at the forefront of PDS decision making), the biggest issue I could see with a relational DB as a PDS Data Store is the ORM. It seems to me you would either need an ORM that worked "on the fly", or you would need to preprocess your app to produce the ORM for the DB to use and keep them in sync.
Quote:The datastore's transactional nature, combined with the event execution model, means that game code that is written as if it was running on a single CPU can be farmed out across all the processors in all the boxes in the back end in a race-proof and deadlock-proof manner without any explicit synchronization on the part of the game programmer.


When you say this, are you claiming that you can, TODAY, farm out the object update execution and object store code across multiple execution nodes? Because that's not what the roadmap seems to indicate.

If not, then what pieces, if any, can you farm out across all the boxes in the back end, TODAY?

On a related note, I find that separating very clearly between "now" and "in the future" is crucial when communicating. If you say something like "we can architecturally support ...", and that means you've thought about it but haven't written a single line of code yet, nor know exactly when you will, then that is part of what I think some in this thread call "misleading." That may very well be how I was misled about PDS claims of capabilities many years ago.


Second issue: When all object links are in the object database, and you introduce some new game element that needs a new kind of link established (say, an index of all players that have a bind point in Darkfell, or whatever), then you have to write code to manually rip through your entire datastore and update all the player objects. That's a simple example of schema migration.

Schema migration, however, can get really hairy, and when you don't have relational algebra to do it for you, it's been my experience that it's doubly so. But perhaps part of what you developed in the last few years includes something smart to make that better? If so, could you post a specific link?
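
To make that concrete, the kind of one-off migration code I'm talking about would look roughly like the sketch below (the Player/BindPoint classes and the "player." name-binding prefix are invented; the com.sun.sgs.app calls are the standard Darkstar API as I understand it). In practice you would also have to spread the walk across many scheduled tasks so a single transaction doesn't blow its time budget:

```java
import java.io.Serializable;

import com.sun.sgs.app.AppContext;
import com.sun.sgs.app.DataManager;
import com.sun.sgs.app.ManagedObject;
import com.sun.sgs.app.ManagedReference;
import com.sun.sgs.app.Task;

// One-off task that walks every bound player object and sets up the new link.
public class AddBindPointLinkTask implements Task, Serializable {
    private static final long serialVersionUID = 1L;

    public void run() throws Exception {
        DataManager dm = AppContext.getDataManager();
        String name = dm.nextBoundName("player.");
        while (name != null && name.startsWith("player.")) {
            Player player = (Player) dm.getBinding(name);
            dm.markForUpdate(player);
            player.setBindPoint(null);          // new field starts out empty
            name = dm.nextBoundName(name);
        }
    }
}

// Minimal stand-ins for the game's own classes.
class Player implements ManagedObject, Serializable {
    private static final long serialVersionUID = 1L;
    private ManagedReference<BindPoint> bindPoint;   // the newly added link

    void setBindPoint(ManagedReference<BindPoint> bp) { bindPoint = bp; }
}

interface BindPoint extends ManagedObject {}
```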
enum Bool { True, False, FileNotFound };
Quote:Original post by hplus0603
Quote:The datastore's transactional nature, combined with the event execution model, means that game code that is written as if it was running on a single CPU can be farmed out across all the processors in all the boxes in the back end in a race-proof and deadlock-proof manner without any explicit synchronization on the part of the game programmer.


When you say this, are you claiming that you can, TODAY, farm out the object update execution and object store code across multiple execution nodes? Because that's not what the roadmap seems to indicate.


You asked the wrong question. So let me answer your question and then the *right* question.

Yes, you can build a multi-node backend TODAY and distribute load.

No, you would not WANT to do that today. That's because although multi-node technically works right now, its performance is crap. This is neither surprising nor atypical when building distributed processing systems. The first cut always has bottlenecks that impede performance. This is what the team is working hard on fixing now.

So if my language confuses you, I'm sorry. To be clear, when I talk about what the PDS "does", I'm talking about what the model was designed to do. The implementation isn't complete yet, and I'm happy to admit that. However, what is complete is in use at both large and small game companies, and WHEN multi-node is complete it will require no app-level code changes to run on.

Is that clearer?

Quote:
On a related note, I find that separating very clearly between "now" and "in the future" is crucial


Well, I hope that's clearer now.

Quote:
Second issue: When all object links are in the object database, and you introduce some new game element that needs a new kind of link established (say, an index of all players that have a bind point in Darkfell, or whatever), then you have to write code to manually rip through your entire datastore and update all the player objects. That's a simple example of schema migration.


To simply add a link, all you need to do is add a field to the ManagedObject. Such additions are "serialization compatible" changes, and the system handles them automatically.
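
For example (a sketch only, with an invented Player class, and assuming the plain Java serialization semantics the data store relies on), adding the link is literally just adding a field; instances stored before the field existed come back with it set to null:

```java
import java.io.Serializable;

import com.sun.sgs.app.ManagedObject;
import com.sun.sgs.app.ManagedReference;

class Player implements ManagedObject, Serializable {
    // Keep this fixed so instances written by the old version of the class
    // still deserialize against the new one.
    private static final long serialVersionUID = 1L;

    private String name;
    private int hitPoints;

    // Newly added link: a serialization-compatible change. Old stored
    // instances simply see this field as null until game code fills it in.
    private ManagedReference<BindPoint> bindPoint;
}

interface BindPoint extends ManagedObject {}
```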

Now there *are* such things as serialization-incompatible changes. I mentioned this before in this thread. It's a known area where the PDS world is less than ideal and something the community is well aware of. There are some workarounds. For instance, in the MW we wrote code to dump important data in an object-neutral format so we could read it back into a new codebase.
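
The dump/reload approach amounted to something like this sketch (not the actual MW code; the Properties format and the "player.*" keys are just one way to illustrate "object neutral"):

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.util.Properties;

public class NeutralDump {

    // Old codebase: write out only the fields that matter, as plain key/value text.
    public static void export(String name, int hitPoints, String file)
            throws Exception {
        Properties props = new Properties();
        props.setProperty("player.name", name);
        props.setProperty("player.hp", Integer.toString(hitPoints));
        try (FileWriter out = new FileWriter(file)) {
            props.store(out, "object-neutral player dump");
        }
    }

    // New codebase: read the values back and build whatever the new classes need.
    public static Properties importFrom(String file) throws Exception {
        Properties props = new Properties();
        try (FileReader in = new FileReader(file)) {
            props.load(in);
        }
        return props;
    }
}
```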

This is less than ideal, though, and I'm currently building what amounts to a "merge tool" at BFG that analyzes the old code base and the new one and finds serialization-incompatible changes. It then prompts the user to resolve how each change will be mapped. When all changes are mapped, it goes into the binary serialization data (which is a well-established standard) and rewrites it to comply.

I am HOPING that BFG will let me release this tool to the public, as this is actually a general problem in serialization that has needed a proper fix for a long time. If not, I will at least try to document exactly what I did so others can reproduce it.

I think we've actually rat-holed many times in this discussion. I just posted a 10,000 foot view answer to your whole question of game state over on the PDS boards. I hope it clears a lot up.
Quote:Original post by jeffpk
I am HOPING that BFG will let me release this tool to the public as this is actually a general problem in serialization thats needed a proper fix for a long time. If not I will at least try to document exactly what I did so others can reproduce it.


This is the general problem with this discussion that bothers me, and why I feel it's difficult to discuss anything that does not involve general hand-waving. I'm sorry I need to resort to this word again.

Schema migration is not a trivial problem, it's not a new problem, and it is a well-understood and well-researched one. There is no need for cutting-edge research on it, just tweaks to existing frameworks and their issues.

With PDS you make it sound like it's trivial and already supported, yet in the next paragraph there might be a tool available some time in the future.

I will only point to the ZeroC documentation (PDF), namely chapter 41 on FreezeScript. It clearly describes the problem, the solution, and the corner cases which can and cannot be handled.

There is no "it's automatic" unless it really is. There is no "sometime in the future" or "someone implemented it". It is solved, implemented, and available right now. Not only that, but it was already of production quality in 2006/2007, when we used it to great success.

The only "problem" with the above solution is that it is a solved problem, available either as open source or in commercial form, and it works. There is no need to argue about what it is or isn't, or why something is or isn't. It has been available for years, there are users well experienced in it, and at least as far as our project was concerned, it was fully usable out of the box. It remains in this form today.


I really wish this discussion were of a technical and not a marketing nature.

What does PDS solve that is different from ZeroC's solution, or how does it improve on CORBA persistence service? They are both widely used, stable, well understood platforms, and make for excellent apples-to-apples comparison.

Where can I read documentation of a quality that matches ZeroC's, including full disclosure on what is and isn't available, how everything is implemented (in documentation, not as source), and what trade-offs and gotchas there are?
Quote:when I talk about what the PDS "does" Im talking about what the model was designed to do


That is what's quite confusing, and in fact what seems to turn off most engineers I've talked to. That is marketeer-speak, not engineer-speak. If you want to make statements about what you hope the architecture will be able to accomplish, then they should generally be qualified as "the PDS architecture allows ..." or similar. Saying "PDS does ..." implicitly means that I can, today, get a deployment, push a button (or write a few hundred lines of code), and get the result that you claim.

A less discerning customer/engineer will read those statements and believe that he'll get that capability today. I've seen that kind of marketing within enterprise software a lot, but mostly it's clearly framed within a roadmap context. Part of my critique of the PDS documentation and communication is that such context is largely missing.

Regarding schema updates: It sounds like there's no good solution within PDS other than writing code right now, which is fair, and it sounds like there might be some tools to help support these cases in the future, which is also fair, but again, that is a forward-looking statement.
enum Bool { True, False, FileNotFound };
