Unduli

Database optimization


Hello there,

 

First of all, I hope the title isn't misleading.

 

In a browser game (especially a persistent one), or even a simple social networking tool, the amount of data that needs to be stored grows significantly over time.

 

For example, on Twitter people keep tweeting and the data piles up. That's fine, but here is what I wonder:

 

People rarely look at old tweets; they mostly read new ones. Is there any method or trick to serve newer data quickly, in exchange for not serving everything at the same speed?

 

I've read that Google (probably) puts old data on slower but cheaper (lower-power) servers to reduce expenses. Considering I won't have a server farm in most scenarios, I wondered whether there are other options besides simply paying to lower costs.

 

Thanks in advance

 

 

 

 


That or you just throw old data away. If you aren't actively deriving revenue from having it accessible, why design your game such that you have to store it?

 

How many people actually appreciate that Facebook has made all their posts searchable back to the beginning of time? Most of the data is noise, so you really only need a trailing window...


Throwing away old data is a good option as mentioned by swiftcoder. Knowing that a dead player built a swordsman in 2004 isn't very useful.

 

If you can't (or don't want to) delete old data, you can limit your queries to return only the useful information. For example, you could return only the last two weeks' worth of data by default, or limit the data to things that are relevant to you, like your 10 closest neighbors. The only "trick" here is providing the parameters you need to the queries and writing them with that in mind.

 

...
where 
    newsItem.CreatedDate >= @beginDate
...

 

And the general SQL optimizations apply: good data model design, indexes on foreign keys and commonly searched columns, limiting your result set, etc.

 

SQL is pretty darn awesome at digging through truly MASSIVE datasets provided you do it right. :)
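As a concrete sketch of the date-filter-plus-index advice above, here is a small self-contained example using Python's sqlite3 as a stand-in for whatever SQL database you use; the table and column names are made up for illustration:

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE news_item (
        id INTEGER PRIMARY KEY,
        body TEXT,
        created_date TEXT
    )
""")
# Index the column that every "recent data" query filters and sorts by.
conn.execute("CREATE INDEX idx_news_created ON news_item (created_date)")

# One old row and one recent row (ISO date strings compare correctly).
now = datetime(2014, 6, 1)
conn.execute("INSERT INTO news_item (body, created_date) VALUES (?, ?)",
             ("ancient", (now - timedelta(days=400)).isoformat()))
conn.execute("INSERT INTO news_item (body, created_date) VALUES (?, ?)",
             ("fresh", (now - timedelta(days=1)).isoformat()))

# Only return the last two weeks, newest first, with a capped result set.
begin_date = (now - timedelta(days=14)).isoformat()
rows = conn.execute(
    "SELECT body FROM news_item "
    "WHERE created_date >= ? "
    "ORDER BY created_date DESC LIMIT 50",
    (begin_date,),
).fetchall()
print([r[0] for r in rows])  # only the recent row comes back
```

The key point is that the `WHERE created_date >= ?` filter together with the index keeps the engine from scanning the archaic rows at all.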

 

- Eck



Actually, I'm planning a "brief" timeline plus a communication feature like Facebook posts, so throwing data away isn't my first option. Would moving archaic data to a separate database make things better?

 


I'll use LIMIT for sure, but if a player wants exactly that archaic data, the options seem limited.

 

Actually, the real question is: is there a cache-like mechanism (an in-memory DB?) that makes this worthwhile?

While throwing data away might be an option, from the perspective of games this usually falls into the category of splitting your data into hot and cold records (i.e. who's actually playing? which items are actively in use? etc.). Basically you want an LRU (least recently used) cache layer between you and your database. High-traffic web sites tend to use memcached, or something related, in front of the SQL servers as an effective way of implementing this without a lot of work.

To implement it yourself, you provide an API in front of the data retrieval which first checks for items in the cache layer and uses those instead of sending off an SQL query. With a decent division of information in the DB tables, this can easily reduce the load on the SQL server a hundredfold, so the occasional query against really old cold data is not a problem in the middle of regular hot-data accesses. As with anything there are more details, but this is a starting point.
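The cache-aside pattern described above can be sketched in a few lines. This is a minimal illustration (Python with sqlite3 standing in for the SQL server, a tiny hand-rolled LRU standing in for memcached; all names are made up):

```python
import sqlite3
from collections import OrderedDict

class LRUCache:
    """Tiny least-recently-used cache: hot records stay, cold ones fall out."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the coldest entry

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO player VALUES (?, ?)",
               [(i, "player%d" % i) for i in range(100)])

cache = LRUCache(capacity=10)
db_hits = 0

def load_player(player_id):
    """Cache-aside: check the cache first, fall back to SQL only on a miss."""
    global db_hits
    row = cache.get(player_id)
    if row is None:
        db_hits += 1
        row = db.execute("SELECT name FROM player WHERE id = ?",
                         (player_id,)).fetchone()
        cache.put(player_id, row)
    return row

# A hot player is fetched repeatedly but queries the database only once.
for _ in range(5):
    load_player(7)
print(db_hits)  # 1
```

In production you would swap the `LRUCache` class for a memcached client, but the call pattern (check cache, fall back to SQL, populate cache) stays the same.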

In terms of scaling when applied to games (it should apply to just about anything), this scales very well, and even dynamically: by simply adjusting the memory size of the cache you can increase or decrease the load on SQL to the point that makes the most sense for you. If things get really big it also scales horizontally: you can move the cache onto a standalone box with basically nothing more than massive amounts of memory. If that ends up bottlenecking, you can split it over multiple machines by breaking up which caches contain different portions of the access API. At the point the DB starts slowing down even with the cache on top, you can subdivide the SQL servers into primary table access for each cache separately, etc. This is basically an algorithmic optimization, instead of arbitrarily throwing data away (which may still be valid) or just throwing bigger boxes at the problem and hoping vertical scale can keep up.


Actually, the real question is: is there a cache-like mechanism (an in-memory DB?) that makes this worthwhile?


You have to remember that SQL Server, or whatever DBMS you use, will have its trade-offs; more importantly, you have to think about how your application uses and serves this data to its clients to truly nail down what those trade-offs will be.

At my current job we use a two-layer stack: the first, more permanent storage consists of SQL Server database instances built on a CQRS (Command Query Responsibility Segregation) model, and the most recently/frequently accessed data is cached in RavenDB.

The cool thing we implemented recently was segregating the data into "components" to get a higher probability of hitting something in our RavenDB store. For example, instead of storing a player's entire save in the cache, you store pieces of it (their items, pieces of their skill tree, etc.) separately, so when you go to fetch a player's save, most of that information may already be cached and the save can be rebuilt without even hitting the database. Combinatorics play in your favor this way, allowing you to load 90% or more of players' information from the fast NoSQL cache instead of running costly queries against the SQL DB.
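A rough sketch of that component-splitting idea, with a plain dict standing in for the NoSQL cache (RavenDB in the post above); every name here is a hypothetical illustration, not the poster's actual code:

```python
# Plain dict standing in for a NoSQL document cache.
cache = {}

def cache_player_save(player_id, save):
    # Store each save component under its own key rather than one blob per player.
    for component, data in save.items():
        cache[(component, player_id)] = data

def load_player_save(player_id, components):
    """Rebuild a save from whatever components are already cached."""
    found, missing = {}, []
    for component in components:
        key = (component, player_id)
        if key in cache:
            found[component] = cache[key]
        else:
            missing.append(component)  # only these need a database round trip
    return found, missing

cache_player_save(1, {"items": ["sword"], "skills": {"smith": 3}})
cache_player_save(2, {"items": ["axe"]})  # player 2's skills not cached yet

found, missing = load_player_save(2, ["items", "skills"])
print(found, missing)  # items come from the cache; only skills hit the DB
```

Because each fetch only has to fill in the components that missed, the expensive database queries shrink to the uncached pieces rather than whole player saves.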

CQRS is basically a way to scale your database reads independently of your writes/updates. Yes, tweets come in at large volumes, but 90% of the time a client is just fetching data. This goes into more depth: http://martinfowler.com/bliki/CQRS.html
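The read/write split can be sketched in miniature as follows. This is a toy illustration of the general CQRS idea only (all names hypothetical): commands mutate the write model, and queries only ever touch a separately maintained read model.

```python
write_model = []             # authoritative record store (command side)
read_model = {"latest": []}  # denormalised view optimised for the common query

def post_tweet(user, text):
    """Command side: record the write, then project it into the read model."""
    write_model.append((user, text))
    # Projection could run asynchronously in a real system.
    read_model["latest"] = [t for _, t in write_model[-10:]][::-1]

def latest_tweets():
    """Query side: no joins, no scans, just the precomputed view."""
    return read_model["latest"]

post_tweet("unduli", "first")
post_tweet("unduli", "second")
print(latest_tweets())  # ['second', 'first']
```

Since most traffic hits `latest_tweets`, the read side can be cached and replicated aggressively without touching the write path at all.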

Hope this helps!

Edited by M6dEEp


Well, for someone who hasn't gone far beyond simple CRUD and has only a vague idea of NoSQL, this seems slightly out of my scope at the moment (not unachievable, though), so I'll keep it in mind but postpone it until it becomes a real issue, i.e. worth the trouble.

 

Thanks for taking the time to help, though :)


Good attitude. An important part of building a scalable system is knowing what is necessary now, and what you can improve later. None of the services that you mentioned started out needing to work at their current scale, and none of them would be here now if they tried to solve that problem initially. Generally, the biggest risk for newer projects is lack of players, not too many.

Edited by rip-off


That or you just throw old data away. If you aren't actively deriving revenue from having it accessible, why design your game such that you have to store it?


I deeply disagree with this. Keep everything. There are a million things you might use the data for and once it's gone it's gone forever. You can always just ignore or archive old data you don't need. You can't fabricate data you threw away or never collected.

You can use this to collect important gameplay information on levels that's difficult to collect manually. You can use this to detect misbehaving players (and after, identify behaviors that help you detect them earlier by mining their history). You can analyze trends to further improve the game or marketing.

I highly suggest you read a text on data mining. It is essential in the games market; many game companies have entire teams of people whose job is just managing their data-mining needs.
