Kylotan

Member Since 08 Mar 2000
Offline Last Active Sep 09 2014 06:43 AM

#5159891 MMOs and modern scaling techniques

Posted by Kylotan on 11 June 2014 - 05:30 PM


If a client wants to send 10 gold to someone and sends a request to do that, the client has no way of knowing if the request was processed correctly without an ack. But the ack can be lost, so the situation where you might have to resend the same request is present in all networked applications.

 

I think we're crossing wires a bit here. Reliable messaging is a trivial problem to solve (TCP, or a layer over UDP), and thus it is easy to know that either (a) the request was processed correctly, or will be at some time in the very near future, or (b) the other process has terminated, and thus all bets are off. It's not clear why you need application-level re-transmission. But even that's assuming a multiple-server approach - in a single game server approach, this issue never arises at all - there is a variable in memory that contains the current quantity of gold and you just increment that variable with a 100% success rate. Multiple objects? No problem - just modify each of them before your routine is done.

 

What you're saying is that you willingly forgo those simple guarantees in order to pursue a different approach, one which scales better to higher throughput. That's fine, but these are new problems, unique to that way of working, not intrinsic to the 'business logic' at all. With 2 objects co-located in one process you get atomicity, consistency, and isolation for free, and delegate durability to your DB as a high-latency background task.




#5159889 MMOs and modern scaling techniques

Posted by Kylotan on 11 June 2014 - 05:08 PM


1. Successful MMO developers know a lot more about distributed scale than the "hrrr drrr web-scale" crowd tends to realize.
2. Successful MMO developers rarely divulge all the secrets to their success. This feeds into Point 1.

 

And yet, pretty much every published example of MMO scaling seems to focus on the old-school methods. You'd think, given how much has been said on the matter, that there would be at least one instance of people talking about using different methods, but I've not seen one. I was hoping someone on this thread would be able to point me in the right direction. Instead, I'm in much the same position as I was before I posted - people insist that newer methods are being used, but provide no citations. :)

 


Yes, you need to make virtually all messages and transactions idempotent.

 

I'd love to see an example of how to do this, given that many operations are most naturally expressed as changes relative to a previous state (which may not be known). I assume there is literature on this but I can't find it.

 


You only need to write three-phase commit once, and then apply that routine to your transactions going forward; and so on.

 

I suspected this might be the case but I am sceptical about the overhead in both latency and complexity. But then I don't have any firm evidence for either. :)




#5159834 MMOs and modern scaling techniques

Posted by Kylotan on 11 June 2014 - 12:47 PM

This is basically the discussion that led me to post here - two sides both basically saying "I've done it successfully, and you can't really do it the way that the other people do it". Obviously this can't be entirely true. :) I suspect there is more to it.

 

Let me ask some more concrete questions.

 

 

 


No shared state, messages are immutable, and there is no reliability at the message or networking level.

 

Ok - but there is so much in game development that I can't imagine trying to code in this way. Say a player wants to buy an item from a store. The shared state model works like this:

BuyItem(player, store, itemID):
    Check player has enough funds, abort if not
    Check itemID exists in store inventory, abort if not
    Deduct funds from player
    Add funds to store
    Add instance of itemID to player inventory
    Commit player funds and inventory to DB
    Notify player client and any observing clients of purchase

This is literally a 7-line function if you have decent routines already set up. Let's say you have to do it in a message-passing way, where the store and player are potentially in different processes. What I see - assuming you have coroutines or some other syntactic sugar to make this look reasonably sequential rather than callback hell - is something like this:

BuyItem(player, store, itemID):
    Check player has enough funds, abort if not
    Ask store if it has itemID in store inventory. Wait for response.
    If store replied that it did not have the item in inventory:
        abort.
    Check player STILL has enough funds, abort if not
    Deduct funds from player
    Tell store to add funds. Wait for response.
    If store replied that it could not accept the funds (eg. the item is no longer in inventory):
        add funds to player
        abort
    Add instance of itemID to player inventory
    Commit player funds and inventory to DB
    Notify player client and any observing clients of purchase

This is at least 30% longer (not counting any extra code for the store) and has to include various extra error-checks, which are going to make things error-prone. I suspect it gets even more complex when you try and trade items in both directions because you need both sides to be able to place items in escrow before the exchange (whereas here, it was just the money).
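
As a rough illustration only, the message-passing version above might look something like this in Python using asyncio coroutines. The Player and Store classes, their method names, and the explicit price parameter are all assumptions made for the sketch; each awaited call stands in for a network round trip, and the DB commit and client notification steps are left out.

```python
import asyncio


class Store:
    """Hypothetical store actor; each async method stands in for a message round trip."""

    def __init__(self, inventory, funds=0):
        self.inventory = dict(inventory)  # item_id -> price
        self.funds = funds

    async def has_item(self, item_id):
        await asyncio.sleep(0)  # yield control, as a real network wait would
        return item_id in self.inventory

    async def add_funds(self, item_id, amount):
        await asyncio.sleep(0)
        if item_id not in self.inventory:
            return False  # state changed since the first query; refuse the sale
        self.funds += amount
        return True


class Player:
    """Hypothetical player actor living in the calling process."""

    def __init__(self, funds):
        self.funds = funds
        self.items = []


async def buy_item(player, store, item_id, price):
    if player.funds < price:
        return False
    if not await store.has_item(item_id):
        return False
    # Other coroutines may have run while we waited, so check funds again.
    if player.funds < price:
        return False
    player.funds -= price
    if not await store.add_funds(item_id, price):
        player.funds += price  # roll back the deduction
        return False
    player.items.append(item_id)
    return True
```

Even in this small form, the re-check after each await and the manual rollback are the extra error paths that the shared-state version does not need.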

 

So... is there an easier or safer way I could have written this? I wouldn't even attempt this in C++ - without coroutines it would be hard to maintain the state through the routine. I suppose some system that allows me to perform a rollback of an actor would simplify the error-handling but there are still more potential error cases than if you had access to both sides of the trade and could perform it atomically.

 

You talk about "using an actor with a FSM", but I can't imagine having to write an FSM for each of the states in the above interaction. Again, comparing that to a 7-line function, it's hard to justify in programmer time, even if it undoubtedly scales further. I appreciate something like Akka simplifies both the message-passing and the state machine aspects, so there is that - but it's still a fair bit of extra complexity, right? (Writing a message handler for each state, swapping the message handler each time, stashing messages for other states while you do so, etc.)

 

Maybe you can generalise a bit - eg. make all your buying/selling/giving/stealing into one single 'trade' operation? Then at least you're not writing unique code in each case.

 

As for "writing the code so all messages are idempotent" - is that truly practical? I mean, beyond the trivial but worthless case of attaching a unique ID to every message and checking that the message hasn't been already executed, of course. For example, take the trading code above - if one actor wants to send 10 gold pieces to another, how do you handle that in an idempotent way? You can't send "add 10 gold" because that will give you 20 if the message arrives twice. You can't send "set gold to 50" because you didn't know the actor had 40 gold in the first place.
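
For reference, the unique-ID approach mentioned above can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the Wallet class, the message layout, and the in-memory set of seen IDs are hypothetical, and a real system would need to bound or persist that set.

```python
import uuid


class Wallet:
    """Hypothetical actor holding gold; remembers which message IDs it has applied."""

    def __init__(self, gold=0):
        self.gold = gold
        self._seen = set()

    def handle_add_gold(self, message_id, amount):
        # A repeated delivery of the same message is acknowledged but not re-applied.
        if message_id in self._seen:
            return self.gold
        self._seen.add(message_id)
        self.gold += amount
        return self.gold


def make_add_gold_message(amount):
    # The sender picks the ID once and reuses it on every resend.
    return {"id": str(uuid.uuid4()), "amount": amount}
```

Delivering the same message twice leaves the wallet at the same total, which is the property being asked for; the cost is the bookkeeping needed to remember every ID that has been applied.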

 

Perhaps that is not the sort of operation you want to make idempotent, and instead have the persistent store treat it as a transaction. Fair enough, and the latency wouldn't matter if you only do this for things that don't occur hundreds of times per second and if your language makes it practical. (But maybe there aren't all that many such routines? The most common one is movement, and that is easily handled in an idempotent way, certainly.)

 

Forgive my ignorance if there is a simple and well-known answer to this problem; it's been a while since I examined distributed systems on an academic level.




#5159617 MMOs and modern scaling techniques

Posted by Kylotan on 10 June 2014 - 04:10 PM


Some types of game-play are just inherently bad for scalability.

 

Sure. My hypothesis is that the traditional MMORPG is bad for scalability. Lots of actions depend on being able to query, and conditionally modify, more than one entity simultaneously. If I want to trade gold for an NPC's sword, how do we do that without being able to lock both characters? It's not an intractable problem - there are algorithms for coordinating trades between 2 distributed agents - but they are 10x more complex to implement than if a single process had exclusive access to them both.

 

(The flippant answer is usually to delegate this sort of problem to the database; but while this can make the exchange atomic, it doesn't help you much with ensuring that what is in memory is consistent, unless you opt for basically not storing this information in memory, which brings back the latency problem... etc.)

 


every action is recorded to persistent storage(Riak), and then other players are essentially reading other players writes in a sense. With the memcache or application layer managing it.

 

This sounds interesting, but I would love to hear some insights into how complex multi-player interactions are implemented. Queries that can be high performance one-liners when the data is in memory are slow queries when you call out to memcache. And aren't there still potentially race conditions in these cases?




#5159613 MMOs and modern scaling techniques

Posted by Kylotan on 10 June 2014 - 03:59 PM

Your best bet is to just look at documentation and GDC presentations from companies already doing single-shard MMOs, like CCP (with EVE Online).

 

EVE's architecture - as far as I can tell - is not really any different from most of the others, in that it's geographically partitioned (specifically, one process per Solar System). They have some pretty hefty hardware on the back-end for persistence, which presumably is why they don't need to run multiple shards. I'm guessing that they don't have a great need for low latency either, which helps.

 

I don't know any other single-shard MMOs that are of a significant size; I'd be interested to learn of them (and especially of their architecture).




#5159543 MMOs and modern scaling techniques

Posted by Kylotan on 10 June 2014 - 11:01 AM


If I remember correctly, City of Heroes actually would spin up additional instances of "the same area" when player counts got too high. We also did that in There.com, to support large parties. This is good from a player point of view, too; you'd rather see all the players you can interact with, than having 1000 players in the same area and you can only see the nearest three meters...

 

Yeah, I alluded to that at the end of my 3rd paragraph. To be honest I hate the 'instancing' approach, but it does help you scale and some players prefer it (eg. so that they and their friends get a dungeon all to themselves).

 

 

 


Web applications do not have nearly the level of coupling between different business objects that games do.

 

I am inclined to agree. However I met someone yesterday who vehemently claimed that modern online games are already using web scaling approaches - though, when pressed, he was unable or unwilling to name a game that does this and which has a large amount of shared state, citing only that League of Legends has 500K concurrent players. But as we know, that is actually 50K concurrent games each with 10 players - quite a different problem.

 

 

 


[...]the best option I could come up with would be one where objects are only allowed to affect other objects one tick into the future[...]

I've seen similar approaches used in threaded systems to reduce contention, and it seems to work well, if you are able to get all the logic right. I am still concerned about algorithms such as trading, however. Is there an easy way to conduct atomic trades in such a system that is impossible to exploit? I am aware of (though not too familiar with) concepts like three-phase commits for such purposes, but I suspect that writing all events that affect 2 or more entities using such a system would be too complex.
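
To make the quoted "one tick into the future" idea concrete, here is a minimal Python sketch. The Entity class, the effect callables and the two-phase run_tick function are assumptions for illustration, not a description of any particular engine.

```python
class Entity:
    """Hypothetical entity whose state is only changed by effects queued earlier."""

    def __init__(self, name, health):
        self.name = name
        self.health = health
        self._pending = []  # effects queued by others during the current tick

    def queue_effect(self, effect):
        # Other entities never touch our state directly; they only enqueue.
        self._pending.append(effect)

    def apply_pending(self):
        effects, self._pending = self._pending, []
        for effect in effects:
            effect(self)


def run_tick(entities, actions):
    # Phase 1: apply everything that was queued during the previous tick.
    for entity in entities:
        entity.apply_pending()
    # Phase 2: run this tick's logic, which may read others but only queue effects.
    for action in actions:
        action()
```

An effect queued during one tick becomes visible at the start of the next. That removes contention on writes, but it does nothing by itself to make a two-sided trade atomic.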

 

I guess I am generally interested in whether we're missing any tricks from web and app development, by being stuck in our ways and developing servers in much the same way we did 10 years ago. For example, some people are suggesting holding all data in persistent storage and manipulating it via memcached, and on a similar line Cryptic once insisted that they needed every change to characters to hit the DB (resulting in them writing their own database to make this feasible), rather than the usual "change in memory, serialise to DB later" approach. But do these methods pay off?




#5159497 MMOs and modern scaling techniques

Posted by Kylotan on 10 June 2014 - 07:26 AM

(NB. I am using MMO in the traditional sense of the term, ie. a shared persistent world running in real-time, not in the modern broader sense, where games like Farmville or DOTA may have a 'massive' number of concurrent players but there is little or no data that is shared AND persistent AND updating in real-time.)

 

In recent discussions with web and app developers one thing has become quite clear to me - the way they tend to approach scalability these days is somewhat different to how game developers do it. They are generally using a purer form of horizontal scaling - fire up a bunch of processes, each mostly isolated, communicating occasionally via message passing or via a database. This plays nicely with new technologies such as Amazon EC2, and is capable of handling 'web-scale' amounts of traffic - eg. clients numbering in the tens or hundreds of thousands - without problem. And because the processes only communicate asynchronously, you might start up 8 separate processes on an 8-core server to make best use of the hardware.

 

In my experience of MMO development, this is not how it works. There is a lot of horizontal scaling, but instead of firing up servers on demand, we pre-allocate them and tend to divide them geographically - both in terms of real world location so as to be closer to players, and in terms of in-game locations, so that characters that are co-located also share the same game process. This would seem to require more effort on the game developer's part but also imposes several extra limitations, such as making it harder to play with friends located overseas on different shards, requiring each game server to have different configuration and data, etc. Then there is the idea of 'instancing' a zone, which could be thought of as another geographical partition except in an invisible 4th dimension (and that is how I have implemented it in the past).
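
One way to picture instancing as an extra coordinate is the short Python sketch below. The Location type and the can_see function are hypothetical names chosen for illustration; the only point is that the instance number is compared alongside the spatial position.

```python
import math
from typing import NamedTuple


class Location(NamedTuple):
    x: float
    y: float
    z: float
    instance: int  # the "invisible 4th dimension"


def can_see(a, b, view_distance):
    # Entities in different instances never see each other, however close they are.
    if a.instance != b.instance:
        return False
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z)) <= view_distance
```

Two characters standing on the same spot in different instances are then simply "far apart" along an axis the player never sees.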

 

MMOs do have a second trick up their sleeves, in terms of it being common to farm out certain tasks to various heterogeneous servers. A typical web app might just have many instances of the front-end server and one database (possibly with some cache servers in between), but in my experience MMOs will often have specific servers for handling authentication, chat and communications, accounts and transactions, etc. It's almost like extreme refactoring; if a piece of functionality can run asynchronously from the gameplay then it can be siphoned out into a new server and messaging to and from the game server set up accordingly.

 

But in general, MMO game servers are limited in their capacity, so that you can typically only get 500-1500 players in one place. You can change the definition of 'place' by adding instancing and shards, you can make the world seem to hold more characters by seamlessly linking servers together at the boundaries, and you can increase concurrency a bit more via farming out tasks to special servers.

 

So I wonder; are we doing it wrong? And more specifically, can we move to a system of homogeneous server nodes, created on demand, communicating via message passing, to achieve a larger single-shard world?

 

Partly, the current MMO server architecture seems to be born out of habit. What started off as servers designed to accommodate a small number of people grew and grew until we have what we see today - but the underlying assumption is that a game server should (in most cases) be able to take a request from a client, process it atomically and synchronously, and alter the game state instantly, often replying at the same time. We keep all game information in RAM because that is the only way we can effectively handle the request synchronously. And we keep all co-located entities in the same RAM because that's the only way we can easily handle multiple-entity transactions (eg. trading gold for items). But does this need to be the case?

 

My guess is that the main reason we can't move to a more distributed architecture comes partly down to latency but mostly down to complexity. If characters exist across an arbitrary number of servers, any action involving multiple characters is going to require passing messages to those other processes and getting all the responses back before proceeding. This turns behaviour that used to be a single function into either a coroutine (awkward in C++) or some sort of callback chain, also requiring error-detection (eg. if one entity no longer exists by the time the messages get processed) and synchronisation (eg. if one entity is no longer in a valid state for the behaviour once all the data is collected). This seems somewhat intractable to me - if what used to be a simple piece of functionality is now 3 or 4 times as complex, you're unlikely to get the game finished. And will the latency be too high? For many actions, I expect not, but for others, I fear it would.

 

But am I wrong? Outside of games people are writing large and complex applications using message queues and asynchronous behaviour. My suspicion is that they can do this because they don't have a large amount of shared state (eg. world and character data). But maybe it's because they know ways to accomplish these tasks that somehow the game development community has either not become aware of or simply not been able to implement yet.

 

Obviously there have been attempts to mix the two ideas, by running many homogeneous servers but attempting to co-locate all relevant data on demand so that the actual work can be done in the traditional way, by operating atomically on entities in RAM. On paper this looks like a great solution, with the only problem being that it doesn't seem to work in practice. (eg. Project Darkstar and various offshoots.) Sending the entities across the network so that they can be operated on appears to be like trying to send the mountain to Mohammed rather than him going to the mountain (ie. sending the message to the entity). What you gain in programming simplicity you lose in serialisation costs and network latency. A weaker version of this would be automatic geographical load balancing, I suppose.

 

So, I'd like to hear any thoughts on this. Can we make online games more amenable to an async message-passing approach? Or are there fundamental limitations at play?




#5159480 Your most valuable debugging techniques for Networked games?

Posted by Kylotan on 10 June 2014 - 06:16 AM

I may as well chime in with some late observations.

  • sending the wrong packet data to the server.
  • the server unpackaging this data incorrectly.

As already mentioned, these 2 are perfect candidates for unit testing. Every single part of your serialisation and deserialisation code should be trivial to write tests for and therefore you can be in a position where you are strongly confident that there are no bugs here.

 

Another idea I like to implement in debug builds is to immediately unpack any packet I'm about to send and verify, using an assert, that the unpacked version matches the data that was packed. This catches several types of serialisation bug before the data even hits the wire.
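
A minimal Python sketch of that debug-build round-trip check follows. The wire format (a 32-bit entity ID plus two 32-bit floats) and the function names are assumptions made up for the example; the `__debug__` guard plays the role of a debug build, since it is false when Python runs with -O.

```python
import math
import struct

# Hypothetical wire format: unsigned 32-bit entity ID followed by two 32-bit floats,
# little-endian with no padding.
MOVE_FORMAT = "<Iff"


def pack_move(entity_id, x, y):
    return struct.pack(MOVE_FORMAT, entity_id, x, y)


def unpack_move(data):
    return struct.unpack(MOVE_FORMAT, data)


def pack_move_checked(entity_id, x, y):
    data = pack_move(entity_id, x, y)
    if __debug__:  # skipped under "python -O", much like a release build
        got_id, got_x, got_y = unpack_move(data)
        assert got_id == entity_id, "entity ID did not survive the round trip"
        # 32-bit floats lose precision, so compare with a tolerance.
        assert math.isclose(got_x, x, rel_tol=1e-6, abs_tol=1e-6)
        assert math.isclose(got_y, y, rel_tol=1e-6, abs_tol=1e-6)
    return data
```

The same pack and unpack functions are also natural targets for unit tests, since they take plain values in and return plain values out.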

  • the server sending the wrong data to the clients.

Obviously this would be a general logic error, but one way to reduce this problem is to limit the variety of data that can be sent. Some games create hundreds of different packets, one for every piece of functionality. If you can reduce these disparate messages into a much smaller set, then there is less scope for error. My MMO code has no more than roughly 10 distinct message types, and although each of those has some subtypes, they all pass through the same few basic functions and are easy to check for errors.

  • the server applying the data in the wrong way.

Again, this is a general logic error, no different from doing the wrong thing in any code. But if you have reduced the number of distinct messages then it would seem harder to do that.

  • the clients unpackaging the data incorrectly.
  • the clients applying the data in the wrong way.

Ideally your clients and servers use the same code, so if you fix the problem at one end, you'll fix it at both. But if you can't share the same code - and I know what this is like, having worked on an MMO with a Python server but clients written in C# and C++ - just try to keep the code as similar as possible and unit test as much as you can.




#5077165 Free / Open source multiplayer servers

Posted by Kylotan on 12 July 2013 - 12:27 PM

I get the impression you are looking for something higher-level than the solutions being offered to you. Something like Smartfox Server or Photon would serve, but unfortunately they aren't free.




#5077112 How to implement game music?

Posted by Kylotan on 12 July 2013 - 09:53 AM

Usually the final game will use an mp3 or an ogg but you should still be creating your music in a lossless format like a wav initially. If they ask for a different format, you can convert it to the format of their choosing. The loss in sound quality is usually negligible but that's the user's decision to make.




#5077110 PyGame, Cocos2d or PyGlet?

Posted by Kylotan on 12 July 2013 - 09:50 AM

Problem 1: others have answered this - most systems won't handle transitions automatically. You have to change the value yourself over time, or find someone else's code to do it for you.

 

Problem 2: you can calculate delta time yourself by calling pygame.time.get_ticks each frame. Cache the value of the previous call and check the difference each time. In pyglet, you choose how often you want an update function to get called, which means you will know the delta time value in advance. Here's the way to schedule an update - http://www.pyglet.org/doc/programming_guide/calling_functions_periodically.html
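
The "cache the previous value and check the difference" step for Problem 2 could be sketched like this. DeltaClock is a hypothetical helper name; it takes any function returning milliseconds, so it can be tried without Pygame installed.

```python
class DeltaClock:
    """Tracks time elapsed between frames, given any millisecond tick source."""

    def __init__(self, get_ticks):
        self._get_ticks = get_ticks
        self._previous = get_ticks()

    def tick(self):
        # Return seconds since the previous call, then remember the new timestamp.
        now = self._get_ticks()
        delta_ms = now - self._previous
        self._previous = now
        return delta_ms / 1000.0
```

With Pygame you would construct it as DeltaClock(pygame.time.get_ticks) after calling pygame.init(), then call tick() once per frame and scale movement by the result.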

 

Problem 3: in Pygame, it's http://www.pygame.org/docs/ref/key.html#pygame.key.get_pressed and in pyglet it's http://www.pyglet.org/doc/programming_guide/keyboard_events.html#remembering-key-state




#5077105 Organizing Source Files

Posted by Kylotan on 12 July 2013 - 09:36 AM

nvm I figured it out. Never heard of $(SolutionDir) before. And after some playing around over in Additional Include Directories, this did the trick so I can put my headers in whatever folders I want within my solution directory:

$(SolutionDir)Source Code\Header Files

Thats all I was trying to ask for, and it was a pain to find in Google for this answer. But thanks for trying guys.

 

The reason that's hard to find is that it's usually a bad idea to do that.

 

Firstly, if you organise your code into several different directories, you usually want those directories to form part of the #include line. Otherwise you'll run into trouble if you ever end up with similarly-named headers in different directories.

 

Secondly, you usually want files to be found relative to the project, not the solution - because projects are designed to be shared across multiple solutions. If you use $(SolutionDir) then that is going to break if you attempt to reuse that project in a different solution. You'd have to copy everything across, which means you lose many of the benefits of sharing code.




#5076746 Organizing Source Files

Posted by Kylotan on 10 July 2013 - 06:13 PM

Visual Studio lets you put the normal source files (eg. .cpp) wherever you want. If the project contains the file, it will be built.

 

If however it says that it can't find an include file (eg. .h), that's more complex. There is a process that the compiler follows in order to resolve #include lines and you need to adhere to this. Generally speaking, if your header file is not in the same directory as whichever file is #including it, you need to either spell out the relative path in the #include line, or you need to add the directory to the project settings (under 'Additional Include Directories', if I remember correctly).




#5074996 In-game purchasing systems better then 'pay for download' approach?

Posted by Kylotan on 03 July 2013 - 04:55 AM

Not sure what sort of discussion you are hoping for. This is a pretty well-covered topic. Of course, many people agree with you. That's why many games with microtransactions exist. And some people disagree with them being 'better' because they don't want to be hounded for cash during a game. It's obvious that there are pros and cons to both, surely.

 

This is also wrongly made out to look like a new invention, when in fact games have done it for decades. Doom, released in 1993, was 'free to play', but you had to pay to unlock the last two thirds of the game. There wasn't an in-app purchase (because that wasn't practical back then), but the principle of making it easy to play before you pay is neither new nor unusual.

 

Some voices in favour of more free-to-play games:

http://www.gamesbrief.com/2013/05/why-i-havent-bought-frozen-synapse-on-ipad-for-4-99-yet/

https://plus.google.com/105363132599081141035/posts/Cyi2Am8gqGq ("Coercive Pay-2-Play techniques")

 

And some voices in favour of a traditional buy-once model:

http://www.gamesbrief.com/2013/05/why-frozen-synapse-costs-money/

http://www.computerandvideogames.com/356036/features/the-five-biggest-problems-with-free-to-play-gaming/




#5074750 Unity Network.Destoy problem.

Posted by Kylotan on 02 July 2013 - 09:07 AM

It does appear to be a significant flaw in Unity's networking model, to be honest.

 

http://answers.unity3d.com/questions/227723/prevent-players-from-using-networkdestroy.html

http://forum.unity3d.com/threads/138412-Clients-can-call-Network-Destroy

http://forum.unity3d.com/threads/138437-What-do-security-conscious-people-do-for-multiplayer-networking-x-post-r-Unity3D

 

Switching to an external solution is probably the best bet.





