AgentShiva

Database Structure for MMO's

Recommended Posts

AgentShiva    128

Let me preface this post with the obligatory. This is a hobby project not seeking to be the next "wow-killer" Just something I enjoy because I am inquisitive.

 

That being said, I have been following a series of tutorials by Christian Richards on YouTube. He uses Photon Server and Unity to do some neat things. I am trying to expand upon his architecture and database models.

 

I am at the point of creating characters and I got sidetracked by a bit. I've been reading up on database techniques and I think I have an idea of where I want to go but am not sure if its the correct direction to go.

 

Basically, I have a table that stores the player characters containing an ID, name, gender, and other character related things. I wanted to add a junction table to store the players skills as well as the xp they have earned in that skill. It seems that a Many-to-many relationship would be correct for this. As in Skills can be earned by many players, and players can have many skills. I think this is at least one way to go about this issue but am curious of other ways. I have looked at Planeshifts (open sourced :) ) database and they do something similar, although it seems it really depends on your gameplay how your database is laid out.

 

Would this method also be reasonable for storing items in players inventories, achievements players have earned, and other information for things that all players can have?

 

Also, I was planning on having a SQLite database client-side to store nothing but visual information. Like item stats, achievement info, quest text info and much more. The idea being that the server can send an id and the client can query the database for the proper way to display the object in question. I suppose this is more of a asset management question, but I like the idea of minimizing server bandwidth by sending id's only.

 

Thanks for your time and responses!

Share this post


Link to post
Share on other sites
NLScotty    954

I did something like this for a project on uni. I will post some pseudo code here, I hope it helps :)

Table Character {
int CharacterID PRIMARY KEY,//More info depending on the game,
int Gen,
//More info depending on the game
}

Table Skill {
int SkillID PRIMARY KEY,
string Name
//More info depending on the game
}

Table Item {
int ItemId,
string Name
//More info depending on the game
}

Table SkillInstance {
int CharacterID,
int SkillID,
int level //If you need this?
//More info depending on the game
FOREIGN KEY(CharacterID) REFERENCES Character(CharacterID),
FOREIGN KEY(SkillID) REFERENCES Skill(SkillID)
}

Table Inventory {
int InventoryID PRIMARY KEY,
int CharacterID,
int Size,
//More info depending on the game
FOREIGN KEY(CharacterID) REFERENCES Character(CharacterID)
}

Table ItemInstance {
int InventoryID,
int ItemId,
int Spot,
FOREIGN KEY(InventoryID) REFERENCES Inventory(InventoryID),
FOREIGN KEY(ItemId) REFERENCES Item(ItemId)

}

I don't know if this is the best, but it works fairly well. We also had quite a high mark for it, so I am not sure if that means anything.

Share this post


Link to post
Share on other sites
Alberth    9528

I didn't touch data bases for about 20 years, but there are normal forms for them, ie https://en.wikipedia.org/wiki/Database_normalization

Each level strips away redundancy, and the rules are fairly simple afaik, so just make a set of tables and relations, and then verify for being in each level of normal form, and if not, change it.

From what I remember, you need to go to level 4, but euhm, it's 20 years ago :P

 

You client-side information seems static information. Why would you put that in a data base? A simple file or hard-coded table would work too right? Don't use a data base because they exist, use them because they are the right tool for the job.

If the information is not static, how does that information arrive at the client, and how does it stay up-to-date? Is the additional tracking of what the client has worth the reduction in network bandwidth?

(if you send a network packet, sending a slightly longer one isn't much of a deal, especially if you stay within the packet size.)

 

Share this post


Link to post
Share on other sites
AgentShiva    128

@NLScotty

That is exactly what I was talking about! That's an amazing way of showing the relationships. If I was at home I would have thrown together the SQL commands for the tables. Also, the comments "depending on the game" I would love to see more stuff like that. Just a general guideline.

The only other method I can think of would be serializing the information and save it to character table. Not a fan of that idea for several reasons though.

 

@Alberth

Thanks for that link, I'll be sure to read it this evening! As far as client-side databases it was more for ease of content creation. I know I could just as "easily" build tools to edit a csv or xml file, and that doesn't sound like a bad idea. Perhaps I was so wrapped up in the server side stuff that I got a bit excited and ahead of myself.

I guess my main goal with the client side database is ease of inserts. I could live deploy a new item to players by sending a simple server response on login. Not that I couldn't do this with flat files. Again I may have been a bit database happy :).

Share this post


Link to post
Share on other sites
hplus0603    11356

There are three ways of doing this, each with benefits/draw-backs.

The most straightforward way is to create a table, indexed by "character ID," which contains the skill level for each skill -- a column per skill. This has good performance, makes it easy to analyze, and is straightforward to marshal to/from internal data structures. The draw-back is that each game design change (new skill, remove skill, etc) requires a schema change (add/remove/change columns.)

The second is to create a multi-join table; character ID -> skill ID -> skill level. This makes it easy to add more skills, without a schema change. It makes it slightly harder to marshal to/from game structures (you may or may not get all the skills, may get unknown skills, etc.) Indices get taller, performance slightly slower, etc. But it's quite flexible.

The third is to create a table with "character ID -> JSON blob" and leave schema management to your code marshaling to/from your JSON data. This is super flexible, and has only one row per character, but loses all the other benefits of a database -- you can't easily query "what's the average archery skill level of all players" and such, at least not without a table scan and expensive JSON parsing expression.

Which one do you go with? Depends on your goals. But don't underestimate the simplicity of one-row-per-character, one-column-per-skill!

 

Share this post


Link to post
Share on other sites
AgentShiva    128

@hplus0603

I have given all three of these some thought.

I agree with every point you made, and they represent mostly how I feel on each.

I want my solution to be as flexible as possible to allow me to play with many different kinds of game systems, while maintaining a standard. The first solution of column-per-skill seems pretty far from 1st normalized form as I understand it (thanks @Alberth!). Although it would be easy to implement, I agree with your comment about updating column schema's.

The difference between the second and third options (junction tables and serialized data) aside from what you've stated, seems to be do I want the database and ORM to handle the conversion, or my server logic. I think as a hobbyist programmer I would much rather have software written by others do the 'heavy lifting'.

After reading through the link (and diving down the wiki-hole if you will) I think that the junction tables are the closest to 1NF. So it seems I have made up my mind.

 

As for client side data storage I am still on the fence. I like the idea that a SQLite database abstracts the data by at least one level. I realize that it wouldn't protect it against someone hell bent on changing it, but it certainly wouldn't be as simple as editing a CSV. Also, in terms of maintainability, it seems like a database would be ideal. Tables with clear names, and the ability to easily find, insert, or remove data. But I must admit this would all be possible using a flat file (thanks cntl+f haha). And realistically both solutions would need to at least partially be loaded into memory for the duration of the applications process. Still really not sure on this one.

Share this post


Link to post
Share on other sites
hplus0603    11356
Quote

 

The first solution of column-per-skill seems pretty far from 1st normalized form as I understand it


 

It's totally normal, if you consider a character as being an entity with a well-defined set of skills.

If you believe that skills are separate entities unto themselves, and they come and go (rather than the character just raising/lowering/enabling/disabling them) then, yes, you need a separate table to join characters to skills.

I would recommend that you do NOT go for flexibility right now, because making just your first game is going to be hard work enough -- you're not going to have lots of time to try many different things, at least not to the point where you will put them into the database. That being said, if you really believe this is an area of quick iteration, then just slam 'em all into a single JSON blob and call it good!

Separately, I would advise NOT to use an ORM. Ever. For any application. Objects in RAM behave differently from rows in a database, and the impedance mis-match that happens when you try to join them, ends up killing projects. And, what's most insidious, is that this kind of death only happens towards the end of the project, where scale makes itself known. Projects that manage to pull out of this, typically re-code their ORM integration in raw SQL, and exactly how painful that is, depends on how heavily they relied on the ORM "magic" before then.

 

http://blogs.tedneward.com/post/the-vietnam-of-computer-science/

https://blog.codinghorror.com/object-relational-mapping-is-the-vietnam-of-computer-science/

http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern

Programming with relations in memory is actually, in general, a better model than programming in "rich" objects, anyway -- this is the learning from the "data oriented design" movement that's basically taking over high-performance gamedev in the last 15 years or so.

Share this post


Link to post
Share on other sites
Kylotan    10010

Advice from someone who had to write the database schema and data migration tool for a game that did not plan for database schema migration - plan for schema migration. :)

The data you choose to store will change. Extra rows of data are mostly easy as the only significant clashes are your primary keys. Make sure you don't have tables that gain new rows during design and play - for example, if players can invent new object types, but you/your designers can too, you may have problems merging those 2 tables when you deploy the new item types to the live database. So when using a relational DB for MMO data I designate some tables as read-only during play - these are the ones designers can add data to. Often the data model needs adjusting accordingly.

Changing table columns typically requires some sort of explicit migration strategy. Adding a column is simple enough, providing you have a default value specified. Deleting a column is also fairly simple as long as nothing refers to it. Changing the type of a column, or expanding a column out into multiple columns, or maybe changing it into a reference to a new table - that gets awkward. Look into tools that will manage this for you. Usually you have to write an upgrade script (sometimes aided by some automated process that fills in the basics for you) and an accompanying downgrade script so that you have a one-command approach to changing database versions, and reverting in the event of problems.

Another way to bypass all of this is by using BLOBs or JSON fields, and have the schema entirely inside the application, writing the whole object into one column. Any migration can be performed when the object is loaded, and a full migration is just a case of reading and writing every object. Personally I like this method because game database schema change a lot during development and being able to manage it in one place within the code is more amenable to fast iteration of new features. The downsides are that you lose a lot of the usual benefits of relational DBs, such as a well-known query language, plenty of mature tooling, etc.

SQLite is a reasonable format for client-side data if you know it fits the relational model. Given that you're usually using it as a read-only data source you don't have to worry about the migration issues.

Share this post


Link to post
Share on other sites
hplus0603    11356

Adding a column is simple enough

The simple tools to "alter table add column" will lock the table for the duration of updating the data on disk, at least with MySQL. For heavily used tables, this is a non-starter, unless you're OK with hours of downtime.

There are tools that alter the schema "online" by re-writing the data in the background; we have tables that are big enough that this work never completes and finally declares failure after days of trying.

So, adding columns on small tables (say, below ten million rows and below a gigabyte in size) is quite doable; once you get bigger than that, nothing becomes simple anymore :-)

Then the second question is: How much do you need to worry about this? 99.99% of games never get that big. Solving that problem too early, means you're wasting time solving a problem that you don't need to, which is time that you could have used to work on more immediate problems, like "is the game fun?" and "can you actually get players?"

(Almost) Every successful system is full of shortcuts, that have to be solved at some time in the future. Knowing where a little bit of extra engineering will save you lots of time later, and ideally also make you faster while building that future, is the trick. Over-engineering is not the solution!

 

Share this post


Link to post
Share on other sites
AgentShiva    128

This is exactly why I came here for help! Thank you @hplus0603 and @Kylotan

I don't know how I feel about going forward without using NHIbernate. For me its about ease of use, and I really never plan for this project to have the Massively part of MMO. I do wonder if I'm overthinking this, as I often do.

I don't want to seem like I am discounting the ease of multi-column tables, or JSON blobs. I am just not sold on them.

I think the bigger issue for me know is going from ORM to no-ORM. I read the articles your posted @hplus0603 and they're hard to argue with.

The real lesson here is I need to learn how/why to make these decisions lol.

Share this post


Link to post
Share on other sites
hplus0603    11356

You know what they say:

"Good decisions come from experience. Experience comes from bad decisions."

If you just want to write it as a learning exercise, feel free to use whatever ORM you want. It'll probably run fine at small scales. There are plenty of other things you also need to learn to build an MMO-like system, so perhaps this isn't the time when you learn to talk SQL directly, and you can save that for some other project.

Good luck!

Share this post


Link to post
Share on other sites
AgentShiva    128

After diving down the rabbit hole a little further I am starting to think I may need to drastically rethink this.

My current design is to have Skills that are earned by doing tasks associated with those skills (eg. WoW's original weapons leveling system, or SWG's skill system) And as you gain XP for that skill you can buy and unlock other skills. Each skill will grant new abilities, and each ability can have any number of effects.

I don't know if creating junction tables for all these things is a good idea. I can see needing a query for the player character, their skills, their abilities, and finally the effects related to those abilities. It seems excessive. Perhaps storing the Skills, Abilities, and Effects in a flat file and loading it at runtime is a solution. My only concern is memory usage. Once I start adding more to each of those how soon before it gets to out of hand? I think some tests are in order.

Share this post


Link to post
Share on other sites
Kylotan    10010

First, doing a complex query is not a problem providing you understand how to perform that query and it's not being performed super-often. Whatever DB structure you can come up with for your game, there are 100s of businesses that do more complex queries 10x as often as you would, and it works fine.

Second, you don't need to be doing most database queries during gameplay anyway. You can do them at load time. The chance of you having too many skills and abilities to fit in memory is zero.

Share this post


Link to post
Share on other sites
ApochPiQ    23064

One thing that hasn't been mentioned much is indexes. A good index setup will make normalized data very easy to join together, so your queries across junctions won't be particularly slow, even with thousands of skills per character.

Share this post


Link to post
Share on other sites
hplus0603    11356

A good index setup will make normalized data very easy to join together, so your queries across junctions won't be particularly slow, even with thousands of skills per character.

 

A thousand-way self-join, versus fetching a single row with a thousand columns, is probably a measurable difference. The main reason being that the index for the table of a thousand skills per player will be a lot "taller" and thus blow out more block cache. (I presume that the primary key on that table would be playerid, skillid, so it would be a contiguous read at least -- if you get that wrong, such as skillid, payerid, then you're in for a world of hurt as you grow.)

 

Share this post


Link to post
Share on other sites
ApochPiQ    23064

At the point where you legitimately have thousands of skills allocated to a given player, I'd actually recommend denormalization and just stashing the IDs of skills you have in a bit vector or something.

Share this post


Link to post
Share on other sites
frob    44974

The key there is to be mindful of your tools and tech.  

Too many people equate "data" with "RDBMS". There are many other ways to store and process data.  Some data is a great fit with relationships and queries that work with a relationship database, others are a terrible fit and need a different data system. 

Share this post


Link to post
Share on other sites
AgentShiva    128

I don't plan on having thousands of skills per player. Something closer to 100 at the most.

Each skill costs points as well as XP related to that skill. There will be a fixed number of points available, with higher ranks costing more.

I may end up serializing the player skills every X milliseconds and storing it to the DB.

Share this post


Link to post
Share on other sites
hplus0603    11356

I may end up serializing the player skills every X milliseconds and storing it to the DB.

Presumably, skills only increment at discrete times, not continually during gameplay. You only really need to do that when skills actually change. You could use a dirty-bit on your in-memory skill values, and a dirty-bit tree for all pieces of your character, to know whether the character needs to be saved, and if so, what parts.

Share this post


Link to post
Share on other sites
flodihn    281

At one point, I think it would be healthy to experiment with a different approach.

Just store accounts and characters in a key/value store, currenlty my favorite storage is Riak.

Give new accounts and character a unique id to use as keys, uuid is a good option (https://en.wikipedia.org/wiki/Universally_unique_identifier). Then store all information bound to the account or character as the value in the structured way. My favourite format is JSON.

You can link accounts to characters by having a list of characters uuids in the account data.

Then, instead of making queries to the database, just load up the JSON into the memory of your server, make alterations in memory and occation dump JSON to the database.

It would look something like this:

 

{
    "account": {
        "uid": "123e4567-e89b-12d3-a456-426655440000",
        "phash": "JDUkTW5mc1E0aU4kWk1UcHBLTjE2eS90SXNVWXMvb2JIbGhkUC5PczgweVhoVHVycEJNVWJBNQ==",
        "email_encrypted": "UkNyOXFaajYvanBkaVJHNXVmZnV6THpHR0pSMDVUU2U=",
        "email_hash": "17300987ba9482d229e6bc9f3463b4efbdf2564d",
        "character_uids": ["522e4867-f89e-13d6-b672-422615122561", "612f2357-d21c-23e4-c152-122665727512"],
        "subscription": "bronze"
    }
}

Note that texts that contains weird characters is in most cases best to base64 encode, otherwise jsons parsers might not parse correctly. In this example I would recommend passwords to be hashed with SHA-256 or even SHA-512 and encode to base64 encode before storing in the db (I use bcrypt to generate password hashes).
You probably don't want to store people's emails in clear text in your db, so encrypt using a well known method like AES. But now we have the problem that emails probably need to be unique (one email limited to one account), so we can hash it the email and put a secondary index on it (email_hash). This means we could quickly find if an email is already in use in our system.

{
    "character": {
        "uid": "522e4867-f89e-13d6-b672-422615122561",
        "name": "foobar",
        "skills": [{
            "combat": [{
                "1h_sword": 32,
                "2h_sword": 12,
                "mace": 82,
                "shield": 54
            }],
            "cooking": [{
                "pizza": 99,
                "soups": 23,
                "grilling": 23
            }],
            "survival": [{
                "acrobatics": 56,
                "swimming": 23,
                "skinning": 52,
                "traps": 13
            }]
        }]
    }
}

 

Edited by flodihn

Share this post


Link to post
Share on other sites
hplus0603    11356

If you feel that RDBMSes are not a good idea for an online game, what data and experience has led to that?

There are tons of successful games that use RDBMSes correctly, and get great performance. There are tons of instances where schema-less key/value stores end up costing more than they help, because you can't easily use secondary indices (unless you turn your "schema-less" storage into a poor man's "pseudo-schema-ed" storage.)

And vice versa -- use a RDBMS incorrectly, and you will be slow.

Regarding storing acquired skills as a bitfield (didn't see that two months ago) -- if your table is (characterId not null, skillArchery default null, skillSneaking null, skillSmithing null, ...) then most RDBMS-es will turn the NULL columns into a bit field under the covers, and you still get to marshal/de-marshal as if it were a full struct. You get "null" back for the skills you haven't unlocked, and non-null values for the skills you have a rank in. Seems like a great match to me, especially for a first iteration.

 

Share this post


Link to post
Share on other sites
Suvitruf    107
1 hour ago, hplus0603 said:

If you feel that RDBMSes are not a good idea for an online game, what data and experience has led to that?

There are tons of successful games that use RDBMSes correctly, and get great performance. There are tons of instances where schema-less key/value stores end up costing more than they help, because you can't easily use secondary indices (unless you turn your "schema-less" storage into a poor man's "pseudo-schema-ed" storage.)

And vice versa -- use a RDBMS incorrectly, and you will be slow.

 

With SQL db you spend a lot of time and resources when your schema changes. That's my point. I'm not sure about performance. It depends on cluster size.

Btw, RethinkDB supports secondary indexes (:

 

Share this post


Link to post
Share on other sites
hplus0603    11356

With SQL db you spend a lot of time and resources when your schema changes.

Do you have specific experience with this that you can share?

With schemaless implementations, you instead spend a lot of time debugging bugs that happen because some code puts in data in a format different than what some other code reading the data expects. The bigger your game gets, the more likely those bugs are. I'd rather take the small cost of schema updates than the lack of type checking and "have to be able to read any old version of data" problems of schemaless. That's based on specific experience I have with both, so yours may be different :-)

 

.

Share this post


Link to post
Share on other sites
Suvitruf    107
6 hours ago, hplus0603 said:

 

 

Do you have specific experience with this that you can share?

With schemaless implementations, you instead spend a lot of time debugging bugs that happen because some code puts in data in a format different than what some other code reading the data expects. The bigger your game gets, the more likely those bugs are. I'd rather take the small cost of schema updates than the lack of type checking and "have to be able to read any old version of data" problems of schemaless. That's based on specific experience I have with both, so yours may be different :-)

 

.

I've worked with SQL in CRM field. Usually data structure in this field doesn't change a lot.

Yeh, I understand. For us...We have chosen NoSQL 'cause we can iterate and prototype faster.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now