Ryan Lepidi

how do servers work with databases?

With regard to a highly data-driven game such as an MMO, how does a server work with a database? Surely it doesn't read from and write to the database every single time it wants to read or write data, does it? Does it just store everything in memory, work from that, and then update the database every X minutes or so? I tried searching around but I couldn't find anything this specific. Also, how would this work in a multi-server environment? What if server #1 wants to read a value, but server #2 has an updated version of that value that it has not yet written to the database?

Mostly you work with data in memory, and push it out to the database periodically. You wouldn't generally write the whole world in one go, only the things that have changed, and typically staggered so you're not hitting the database with a massive number of queries all at once. Exceptions exist for important actions such as item or money transfer, which are typically saved to the database immediately to prevent exploits or data loss.
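A rough sketch of that pattern in Python (purely illustrative; the Entity class and save_to_db helper are invented for the example, not any particular engine's API):

```python
class Entity:
    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.data = {}
        self.dirty = False          # set whenever an attribute changes

def save_to_db(entity):
    # Placeholder for a real database write (e.g. an UPDATE statement).
    entity.dirty = False

def periodic_flush(entities, per_tick=10):
    # Staggered saves: only write a handful of dirty entities per tick,
    # so the database isn't hit with everything at once.
    dirty = [e for e in entities if e.dirty]
    for entity in dirty[:per_tick]:
        save_to_db(entity)

def transfer_money(sender, receiver, amount):
    # Critical action: persist immediately rather than waiting for the
    # next periodic flush, to avoid dupes or data loss on a crash.
    sender.data["money"] = sender.data.get("money", 0) - amount
    receiver.data["money"] = receiver.data.get("money", 0) + amount
    save_to_db(sender)
    save_to_db(receiver)
```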

In a multi-server environment, each server tends to know who the owner of a given piece of data is, and asks the authoritative source for a copy of that data (or sometimes to transfer ownership over). The flip side of that is that any information that needs to be visible across multiple systems can be broadcast to all interested parties when it changes. (Though you seek to minimise the requirement for such broadcasts, to increase your scalability.) Finally, it's generally accepted that in many cases reading out-of-date values is not necessarily a problem. (After all, we do this in the real world all the time - even light takes time to travel.)

Thanks for your reply. I know this will vary from situation to situation, but how often would we write to the database? Are we talking every X seconds? Minutes?

Also, what did you mean by it being generally accepted to read out-of-date values? I don't see how that could be true.

The duration between writes will vary depending on the game. Generally speaking, if you write all important transactions immediately, you might only have to write other data at log on and log off. Periodic auto-saves in the meantime might make sense but how often you do them will be the typical tradeoff between how important that data is to you and how good your database performance is.

All information is always out of date. When you look at the stars, you're seeing not what exists now, but what existed days, years, centuries ago. But for most purposes that doesn't matter. The effect is even more pronounced with sound, where quite often there is a noticeable latency between an event happening and you hearing that it happened. Given these fundamental limitations of our universe (i.e. that information transfer takes time), your choices are either to (a) wait for the latest data, or (b) work with old info.

In many cases, working with old info is just fine. Say an NPC dies on the server but that data hasn't made it to your client yet. Your local values are out of date. You can still click on that NPC, and send commands to the server about it. The server just has to be careful and notice that the NPC you're asking about is no longer there. Exactly the same thing happens between servers - it's no big deal if someone gets hit on Server 1 and has 50% health but on Server 2 they have 100% health. Someone connected to Server 2 might see the 'wrong' health for a few frames before it updates, but that's not a problem, because Server 2's view of the NPC is purely cosmetic - all actions on the NPC are performed by Server 1.

Depending on the database, you may be able to stream the attribute updates to the database through a memory cache. I.e., as soon as you update an attribute, you also send that attribute update to a database cache; that database cache keeps copies of all currently active objects and attributes, and round-robins commits of dirty objects to the datastore. The main idea is that even if your application goes down, already-streamed updates are still live on the cache server, and can be committed.

Of course, if power goes out on the cache, you're in more trouble, unless the cache in turn keeps a transaction log on disk for attribute updates. If you can come up with a way to partition cache servers based on object ID (for example), and they all talk to a single database back-end, you may be able to scale a single database instance in a manner similar to a federated data design. However, sequence ordering in transactions is much harder to achieve with a distributed caching system than with a true federated system.
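For what it's worth, a minimal sketch of that write-behind idea in Python (names are invented; a real system would use an actual cache process and a real datastore):

```python
NUM_CACHE_SERVERS = 4

def cache_server_for(object_id):
    # Simple partitioning by object ID: each cache server owns a slice
    # of the ID space, all talking to the same database back-end.
    return object_id % NUM_CACHE_SERVERS

class WriteBehindCache:
    def __init__(self, commit_fn):
        self.objects = {}       # object_id -> {attribute: value}
        self.dirty = set()      # object IDs with uncommitted changes
        self.commit_fn = commit_fn

    def update_attribute(self, object_id, key, value):
        # Streamed attribute update: survives the game server going down,
        # because the latest values now live on the cache.
        self.objects.setdefault(object_id, {})[key] = value
        self.dirty.add(object_id)

    def flush_some(self, max_objects=100):
        # Round-robin style: commit a bounded number of dirty objects per call.
        for object_id in list(self.dirty)[:max_objects]:
            self.commit_fn(object_id, self.objects[object_id])
            self.dirty.discard(object_id)
```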

Some databases implement this mechanism internally already (especially object-oriented databases, like Versant and friends).

Also, you probably want to treat different attributes as "persistent" vs "transient." For example, how many hitpoints I currently have, or what my current target is, may not necessarily have to be written back to the database (depending on game design). Same thing for avatar location. If the server crashes, you revert to full health, no target, at your last bind location (for example).
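A trivial way to express that split in code (the attribute names here are just examples and obviously game-specific):

```python
# Attributes worth persisting vs. attributes safe to rebuild after a crash.
PERSISTENT = {"money", "inventory", "experience", "bind_location"}
TRANSIENT = {"hitpoints", "current_target", "position"}

def attributes_to_save(player_state):
    # Only the persistent subset is ever written back to the database.
    return dict((k, v) for k, v in player_state.items() if k in PERSISTENT)
```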

Hmm... this was just something I was thinking of; let me know if it sounds feasible. Basically, I have the following servers:

Gateway/Login
Object Cache
Database Server
Game Server 1
Game Server 2
Game Server n..

Game objects are initially loaded into memory on the game servers as needed. When objects are modified from the original copy, they are sent over to the object cache, which then marks them for an update and updates them in the database routinely. When a game server wants to request an important object that needs to be current (like player money or something), it asks the object cache whether it has an updated version of it. If not, it queries the database.

Is this reasonable or are there too many objects being moved around?

Quote:
it asks the object cache if it has an updated version of it. If not, it queries the database.


I think the object cache should talk to the database, not the game servers.
This has two benefits:
1) The API for the game server is simple: it just talks to the object cache (reduced chance for bugs).
2) If the game server is compromised, there is no direct connection to the database to take advantage of (reduced attack surface).

Whether it's too many objects or not depends on three things:
1) How big the representation of each object is.
2) How often the objects change.
3) How efficiently you encode deltas in change.

In general, you'd "check out" the objects from the object cache to the simulation servers, so that only one simulation server is allowed to update a given object at a time. Also, you'd probably want to support transactions between multiple objects on the same server. For example, if I trade item A to you for item B from you, you want to make the exchange of objects a single atomic transaction, where either both you and I are updated and committed, or none of us are. For such transactions, you and me (the objects) will be pushed all the way to the database from the cache before the cache returns success to the game server.

And with this implementation, you can't trade between two users on separate servers, but that's probably OK for most cases. You can, of course, start implementing Transaction Monitor type functionality inside the game servers to support distributed transactions, but it becomes messy quickly at that point.

Quote:
Original post by hplus0603

Whether it's too many objects or not depends on three things:
1) How big the representation of each object is.
2) How often the objects change.
3) How efficiently you encode deltas in change.


Does it make sense to provide an object cache for data which is managed purely locally (such as the trades mentioned above, and dynamic data which is already handled by a single server)?

It seems to me that if some state is managed locally already, then adding a cache to that would just add an extra step without much benefit. On migration, servers can serialize their local state, and an external cache doesn't allow multiple authoritative servers anyway. And then there's the extra point of failure.

I would, however, consider it viable for data that can be considered global: guild rosters, leaderboards, various stats, auction houses and the like, as well as static world data (if stored in the database).

ArenaNet published their design making heavy use of object caches with some absurd uptimes, but in that case players migrate between very transient instances, so the chance of a migration failing is high. They also reported all data being stored in a blob, so database queries were all-or-nothing, which adds no benefit from actually being able to query the database.

Quote:
adding cache to that would just add an extra step without much benefit


I think we already covered the main benefits:

1) the game server doesn't need to talk to the database (which means security and simplicity)
2) only a single cache talks to the database (which means it can aggregate updates for performance, for example)
3) asynchronous checkpointing from the point of view of the game server

A regular database server does some memory caching, to be sure, but it still has to commit each write to disk; an in-memory cache can coalesce multiple updates before committing. When it comes to single point of failure, your database is already a single point of failure in the proposed design. Presumably, the job of caching object properties and flushing back to disk iteratively is simple enough that you can debug it once and then just keep it running without failure other than hardware wearing out.

This is just one tool of many in the toolbox of someone building a large, distributed, real-time, interactive application.

I've been thinking about the implementation of this more, and I don't get how it's possible to cache the results of a database query, locally or otherwise.

How would I translate a database query to an object retrieval in memory or vice versa?

Let's say my game server tells the object cache server it wants all users who have over $500 in their account. It's obvious how I would write that as a database query, but how could I ask the object cache for this? It seems like I would have to write some kind of custom querying syntax for retrieving objects and then somehow translate that into an SQL statement in the event that it isn't cached. Is there some kind of easier way to do this that I'm not thinking of?

I mean, even looking at a few object databases, it seems like there is no querying syntax for them. They are set up as dictionary objects and you provide the index. But what if you want to do a search for everyone with over $500? Wouldn't that mean retrieving all user objects just to do the search?

Quote:
Original post by Ryan Lepidi

Wouldn't that mean retrieving all user objects just to do the search?


It might. Object databases operate on objects - relational databases on, well, relational queries.

This is why the choice of data representation matters. Consider that you want to retrieve "all that have over $500". First determine what "all" means, then what "have over $500" means.

In SQL-based syntax, you might select all IDs from the player table where the Money column has a value over 500. The result of such a query would be an array of IDs, or better yet, a set of rows each containing an ID.

In an object database, there would be a Player object, which would have a Money property. So if you somehow query for (Player.Money > 500), you receive a set of Player objects.

The most trivial form of database caching can be achieved via memcached. That cache is basically a map of (SQL_query, Query_result). When you wish to perform an SQL query, you first pass the query to the cache. If the cache contains such a query, it returns the previous result; if not, you query the database directly. Obvious issues such as stale data and concurrency considerations apply.
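Something like this, in plain Python instead of memcached, just to show the lookup-then-fall-through shape (invalidation and expiry are deliberately left out):

```python
class QueryCache:
    def __init__(self, conn):
        self.conn = conn
        self.results = {}   # (sql, params) -> cached rows

    def query(self, sql, params=()):
        key = (sql, params)
        if key in self.results:
            return self.results[key]     # cache hit: possibly stale data
        rows = self.conn.execute(sql, params).fetchall()
        self.results[key] = rows         # cache miss: ask the database
        return rows
```

A real memcached deployment would also evict entries and invalidate them when the underlying rows change, which is exactly where the stale-data and concurrency issues mentioned above come in.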

It should be obvious that each type of database is suitable for a particular task. While object databases do support queries, they are better suited to an OO approach, which an ad-hoc query like (money > 500) isn't.


For MMO state, the ability to generate delta updates is often desirable. Instead of writing the entire object when something changes, such as money, you generate a delta update, something in the form of an (ID, Money_key, 500) tuple.

The object cache in this case could listen to delta updates and know which properties of an object changed. Then, after X seconds or after Y updates, it would perform the relevant SQL queries to update those values.

The benefit of such an approach is that at any given time, only a small subset of properties will be changing. If two players are in combat, their health will be changing twice per second. With such a cache, this would result in only an occasional SQL query, as the cache sees fit.

This approach achieves two things: it relieves the SQL server of many small queries, and it additionally allows all changes to be committed as a single transaction, rather than many small transactions.
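A small sketch of that delta batching (the properties table and its columns are hypothetical, conn is assumed to be a DB-API connection usable as a context manager such as sqlite3, and a real cache would also flush on a timer):

```python
from collections import defaultdict

class DeltaCache:
    def __init__(self, conn, flush_every=100):
        self.conn = conn
        self.flush_every = flush_every
        self.pending = defaultdict(dict)  # object_id -> {key: latest value}
        self.count = 0

    def apply_delta(self, object_id, key, value):
        # Later deltas for the same key overwrite earlier ones, so a value
        # changing twice per second still collapses into one write.
        self.pending[object_id][key] = value
        self.count += 1
        if self.count >= self.flush_every:
            self.flush()

    def flush(self):
        with self.conn:  # commit the whole batch as one transaction
            for object_id, changes in self.pending.items():
                for key, value in changes.items():
                    self.conn.execute(
                        "UPDATE properties SET value = ? "
                        "WHERE object_id = ? AND property = ?",
                        (value, object_id, key))
        self.pending.clear()
        self.count = 0
```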

The object database (or cache) should keep an index of properties. If you index the "money" property, then finding the set of objects with the "money" property greater than $500 isn't very hard.

You can view the general backing store of an object as a simple table:

Object ID, Property, Value

The "object" is then the self-union of that table for all tuples of Object Id == your id.

However, databases don't like N-way self-joins like that (a 9-way self-join of ours basically killed Oracle a few years ago...), so you might want to create a table per property instead. Whether an object has a given property or not is then determined by selecting that object's ID out of that property's table.

As long as you cache the joined object in RAM, and only write changes back to the database, either of those approaches can be both efficient and flexible.
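To make those two layouts concrete, here's roughly what they might look like in SQLite-flavoured SQL (table, column and index names are invented for the example):

```python
# Generic backing store: one row per (object, property) pair.
GENERIC_STORE = """
CREATE TABLE IF NOT EXISTS properties (
    object_id INTEGER NOT NULL,
    property  TEXT NOT NULL,
    value     TEXT,
    PRIMARY KEY (object_id, property)
);
"""

# Table-per-property alternative: avoids the wide self-join when
# rebuilding an object, and makes indexing a single property cheap.
MONEY_TABLE = """
CREATE TABLE IF NOT EXISTS prop_money (
    object_id INTEGER PRIMARY KEY,
    value     INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_prop_money_value ON prop_money (value);
"""

# With the index on value, the "over $500" lookup is a cheap range scan.
RICH_PLAYERS = "SELECT object_id FROM prop_money WHERE value > 500;"
```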

OK, I'm trying to figure out how I can implement this. Let me know what you guys think - is there anything I'm doing inefficiently or wrong? Am I forgetting anything that is going to kill me later? I am using Stackless Python 2.6.1 (3.0 isn't out yet).

If you aren't familiar with Python, a nice feature is that you can essentially "hook" the function that is called when a specific field of a class is accessed. Basically, that means I can use this hook to check whether a specific field has been initialized when it is accessed; if it has, I return it, and if not, I grab it from the database, set it, then return it. That is how I plan on putting an object frontend over the database backend.
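Assuming that hook is __getattr__ (which Python only calls when normal attribute lookup fails), a minimal sketch might look like this; the players table and its columns are hypothetical:

```python
class LazyPlayer(object):
    def __init__(self, conn, player_id):
        self._conn = conn          # any DB-API connection (e.g. sqlite3)
        self._id = player_id

    def __getattr__(self, name):
        # Only reached when 'name' isn't already set on the instance.
        # Column names can't be bound as query parameters, so real code
        # should check 'name' against a whitelist of known columns.
        row = self._conn.execute(
            "SELECT %s FROM players WHERE id = ?" % name,
            (self._id,)).fetchone()
        if row is None:
            raise AttributeError(name)
        setattr(self, name, row[0])   # cache it so later reads skip the DB
        return row[0]
```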

Given what I said above, objects are only loaded into memory as needed. This means that I can't do a "select all players with money > 500" query against the cache, since not all of the objects are even in memory to query. In this case, I just skip the cache and call the database.

When I update a value in an object, I hook the attribute access again and keep a dictionary (like a map in C++) that has all the variables as keys and a boolean value for each, set to True if the variable has been changed and False otherwise. When it's time to send the updates, only the values that have changed are written, and all the flags are set back to False.
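And the dirty-flag side might look something like this sketch, using __setattr__ (again, the players table and columns are placeholders, and a set of changed names stands in for the True/False dictionary described above):

```python
class TrackedPlayer(object):
    def __init__(self, conn, player_id, **fields):
        # Bypass __setattr__ for bookkeeping and the initial load, so the
        # freshly loaded values don't start out marked as changed.
        object.__setattr__(self, "_conn", conn)
        object.__setattr__(self, "_id", player_id)
        object.__setattr__(self, "_dirty", set())
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        self._dirty.add(name)             # remember which fields changed

    def flush(self):
        # Write back only the changed fields, then reset the flags.
        for name in self._dirty:
            self._conn.execute(
                "UPDATE players SET %s = ? WHERE id = ?" % name,
                (getattr(self, name), self._id))
        self._conn.commit()
        self._dirty.clear()
```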

Any thoughts?

Quote:
that is how I plan on using an object frontend with the database backend


Your scheme sounds like an implementation of lazy updates. However, lazy object inflation is going to introduce lag the first time you access each property of an object. This will likely lead to poor user-visible performance. I would inflate the entire object as soon as any part of the object is needed, because running one query for all properties from the database at once is likely to be significantly more efficient than running a separate query for each property.
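I.e., something along these lines (the players table is hypothetical; cursor.description gives the column names so the whole row becomes a dict in one query):

```python
def load_player(conn, player_id):
    # One query inflates the entire object, so later attribute reads
    # never have to touch the database at all.
    cur = conn.execute("SELECT * FROM players WHERE id = ?", (player_id,))
    row = cur.fetchone()
    if row is None:
        return None
    columns = [desc[0] for desc in cur.description]
    return dict(zip(columns, row))   # e.g. {"id": 1, "money": 750, ...}
```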


Quote:
Original post by Ryan Lepidi

This means that I can't do a "Select all players with money > 500" query since not even all of the objects are there to query.


Why would you want your server to run such a query in the first place? For an RPG/RTS, the most frequently touched properties will probably be position/orientation and health/energy/mana.

How many objects will be on one server? Tens of thousands shouldn't be a problem as such, at least not memory-wise.

And what kind of simulation will it be?

