MMOs and modern scaling techniques

Started by Kylotan

VFe    120

I'm in a somewhat similar situation: I'm currently involved in designing scalable game systems. I come from a "big data" DevOps background, and have been facing many of the same issues. I have the same impression you do - that what those of us writing "large applications" do is completely alien to most developers who only have game dev backgrounds.

 

 

My take-aways so far:

 


  • Designing the game itself to be scalable is probably the biggest thing. GIGO: some types of gameplay are just inherently bad for scalability, and with those you're left with a trade-off between latency and scalability.

  • In many cases there are ways to break events apart, at the cost of more complexity, to make them more scalable. Where that is asymptotically impractical or impossible, you need to design out the local machine state. That is to say, if events must be processed sequentially, design them so that the sequential steps can be processed on any server. Borrow from REST where you can. The trade-off here is that you're often increasing your processing or network overhead by as much as 3x... but honestly, it's often worth it. It's counter-intuitive, especially for game developers, to purposefully design wasteful systems. (A rough sketch of designing out local state follows this list.)

  • Another problem for game devs in particular: OOP (the design philosophy, not necessarily the languages) is the enemy. It requires an extreme paradigm shift, but the more state you can design away, the more scalability you get.
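A minimal sketch of that "design out the local state" idea, assuming a hypothetical shared store interface: everything the handler needs is carried in the event or fetched from shared storage, so the event can be processed on any server in the cluster.

# Hypothetical illustration: the handler keeps no local state, so any node can run it.
def handle_attack(event, store):
    # All inputs travel with the event or live in shared storage (e.g. a KV store).
    attacker = store.load(event["attacker_id"])
    target = store.load(event["target_id"])

    target["hp"] = max(0, target["hp"] - attacker["damage"])

    # Write the result back; nothing is cached in this process between events.
    store.save(event["target_id"], target)
    return {"target_id": event["target_id"], "hp": target["hp"]}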


 

We're using a system very similar to what you're describing - asynchronous message passing across geographically diverse clusters (implemented in Clojure and Erlang, if anyone's curious, with Riak as our back-end database) - for large parts of our game system. And it works really well.

 

I'd say you can use these methods for just about any game system outside of FPS, fighting games, and maybe racing. Anything that is super latency-sensitive is going to be a bad fit, but for everything else I think it's a major win.

 

 

I guess I am generally interested in whether we're missing any tricks from web and app development, by being stuck in our ways and developing servers in much the same way we did 10 years ago. For example, some people are suggesting holding all data in persistent storage and manipulating it via memcached, and on a similar line Cryptic once insisted that they needed every change to characters to hit the DB (resulting in them writing their own database to make this feasible), rather than the usual "change in memory, serialise to DB later" approach. But do these methods pay off?

 

 

This, IMHO, is actually a really powerful method for games, and it's similar to how we're doing things: every action is recorded to persistent storage (Riak), and then other players are essentially reading other players' writes, with the memcache or application layer managing it.

 

Depending on the DB setup, this can be fast enough for real time, but as with anything it's a trade-off, à la the CAP theorem. For us, we decided that a little more latency in exchange for more scalability was a worthwhile trade-off.
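As a rough sketch of that pattern (the class and the in-memory stand-ins for Riak and memcached here are hypothetical; a real system would use the actual clients): actions are appended to durable storage first, and other players read those writes through a cache kept by the application layer.

import time

class ActionStore:
    """Toy stand-in for a durable store (e.g. Riak) fronted by a cache layer (e.g. memcached)."""
    def __init__(self):
        self._durable = []     # append-only log of actions (the persistent store)
        self._cache = {}       # latest action per player, maintained by the app layer

    def record_action(self, player_id, action):
        entry = {"player": player_id, "action": action, "ts": time.time()}
        self._durable.append(entry)      # the write hits persistent storage first
        self._cache[player_id] = entry   # then the cache is updated for readers

    def latest_action(self, player_id):
        # Other players "read each other's writes" via the cache; fall back to the log on a miss.
        if player_id in self._cache:
            return self._cache[player_id]
        for entry in reversed(self._durable):
            if entry["player"] == player_id:
                return entry
        return None

store = ActionStore()
store.record_action("alice", {"type": "move", "x": 10, "y": 4})
print(store.latest_action("alice"))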


Kylotan    9875

Your best bet is to just look at documentation and GDC presentations from companies already doing single-shard MMOs, like CCP (with EVE Online).

 

EVE's architecture - as far as I can tell - is not really any different from most of the others, in that it's geographically partitioned (specifically, one process per Solar System). They have some pretty hefty hardware on the back-end for persistence, which presumably is why they don't need to run multiple shards. I'm guessing that they don't have a great need for low latency either, which helps.

 

I don't know any other single-shard MMOs that are of a significant size; I'd be interested to learn of them (and especially of their architecture).


Kylotan    9875


Some types of game-play are just inherently bad for scalability.

 

Sure. My hypothesis is that the traditional MMORPG is bad for scalability. Lots of actions depend on being able to query, and conditionally modify, more than one entity simultaneously. If I want to trade gold for an NPC's sword, how do we do that without being able to lock both characters? It's not an intractable problem - there are algorithms for coordinating trades between 2 distributed agents - but they are 10x more complex to implement than if a single process had exclusive access to them both.

 

(The flippant answer is usually to delegate this sort of problem to the database; but while that can make the exchange atomic, it doesn't help you much with ensuring that what is in memory is consistent, unless you opt for basically not storing this information in memory at all, which brings back the latency problem... etc.)

 


every action is recorded to persistent storage (Riak), and then other players are essentially reading other players' writes, with the memcache or application layer managing it.

 

This sounds interesting, but I would love to hear some insights into how complex multi-player interactions are implemented. Queries that can be high performance one-liners when the data is in memory are slow queries when you call out to memcache. And aren't there still potentially race conditions in these cases?

hplus0603    11348

modern online games are already using web scaling approaches


There really are two kinds of things going on in most games.
One is the real-time component, where many other players see my real-time movements and my real-time actions.
The other is the persistent game state component, which is not that different from any other web services application.
The trick is that, if you try to apply web services approaches to the real-time stuff, you will fail, and if you try to apply the real-time approach to the web services stuff, you will pay way too much to build whatever you want to build.

Some projects have tried to unify these two worlds in one way or another (for example, Project Darkstar) but it's never really worked out well. In my opinion, doing that is a bit like trying to unify real-time voice communication with one-way streamed HD video delivery -- the requirements are so drastically different, it just doesn't make any sense to do that.
That analogy isn't entirely bad: One-way streamed video is a lot more like "web apps" than real-time voice communication, which is more like interactive games, except without the N-squared interactions and physics rules.

So, one way of approaching the scalability problem is to dumb down the gameplay until it's no longer a real-time, interactive, shared world. That doesn't make for fun games for those people who want those things. The other way is to be careful about only building the bits that you really do need for your desired gameplay, and optimizing those parts to the best of the state of the art.
Getting the game developers and the web developers to understand each other, and work under the same roof, AND making sure that the right tool is used for the right problem, while delivering the infrastructure needed for gameplay development, is really hard to pull off; I've seen many projects die because they miss one of those "legs" and don't even know what they're missing. (A bit of the Dunning-Kruger effect, coupled with "I've got a great hammer!")

Washu    7829


EVE's architecture - as far as I can tell - is not really any different from most of the others, in that it's geographically partitioned (specifically, one process per Solar System).

Yes, one process per system; however, they have the ability to move individual systems around to provide extra processing power when a system needs it. When you've got 7.5k players participating in a single battle in a single system, it requires a lot of processing power; the physics system alone takes a huge amount of time.

 


They have some pretty hefty hardware on the back-end for persistence, which presumably is why they don't need to run multiple shards.

The reason they don't run multiple shards is that it doesn't fit the game. The economy, and the resources used to build everything, are the core; multiple shards would defeat that purpose. As such it was architected around the idea of a single unified system.

 


I'm guessing that they don't have a great need for low latency either, which helps.

Up until the introduction of time dilation, latency and system resources were actually the major factors in determining the winner of a fight. When they switched from standard blocking sockets to IOCP they reduced the system-resource issue, but latency was still a major factor in determining who won. Prior to TD and the IOCP switch, the dreaded black screen of death (as it was known) was the major issue facing players: in essence, whoever got the most players into a system first... won. Players with lower latency (mainly Brits and Europeans) had a distinct advantage here, due to the location of the EVE data center.

 

I do recommend reviewing their GDC presentations, especially on their back-end architecture; it's quite flexible. Now, it wouldn't necessarily work in a zoneless system, although I bet that with some clever trickery, using player locality, you could engineer a system of overlapping virtual boxes or spheres of players that allowed you to grow or shrink capacity on demand.

snacktime    452

Ok so I'll chime in here.

 

Distributed architectures are the right tool for the job. What I've seen is that the game industry is just not where most of the innovation in server-side architecture is happening, so it's a bit behind. That has been my observation from working in the industry and just looking around and talking to people.

 

When it comes to performance, it's all about architecture, and the tools available on the server side have been moving fast in recent years. Traditionally, large multiplayer games weren't really great with architecture anyway, and for a variety of reasons they just kind of stuck with what they had. This is changing, but slowly.

 

I think the game industry also had other bad influences, like trying to apply the lessons learned about client-side performance to the server.

 

The secret to current distributed systems is that almost all of them are based on message passing, usually using the actor model. No shared state, messages are immutable, and there is no reliability at the message or networking level. The threading models are also entirely different. For example, the platform I work with a lot, Akka, in simple terms passes actors between threads instead of locking each one to a specific thread. It can use amazingly small thread pools to achieve high levels of concurrency.

 

What you get out of all that is a system that scales with very deterministic performance, and you have a method to distribute almost any workload over a large number of servers.
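As a toy illustration of that scheduling idea (this is not Akka's API, just a minimal sketch of the concept): a small, fixed pool of worker threads drives an arbitrary number of actors by pulling them off a shared run queue, instead of dedicating a thread to each actor.

import queue
import threading
import time

class Actor:
    """Toy actor: a mailbox plus a handler; it owns no thread of its own."""
    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()
        self._running = threading.Lock()   # one worker runs a given actor at a time

    def receive(self, msg):
        print(f"{self.name} received {msg!r}")

class Dispatcher:
    """A handful of worker threads drive many actors; actors migrate between threads."""
    def __init__(self, workers=4):
        self._runnable = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def tell(self, actor, msg):
        actor.mailbox.put(msg)      # enqueue the immutable message
        self._runnable.put(actor)   # schedule the actor on any free worker

    def _worker(self):
        while True:
            actor = self._runnable.get()
            with actor._running:    # never run the same actor concurrently
                actor.receive(actor.mailbox.get())

dispatcher = Dispatcher()
player = Actor("player-42")
dispatcher.tell(player, {"type": "buy", "item": "sword"})
time.sleep(0.1)   # give the daemon workers a moment to run in this toy example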

 

Another thing to keep in mind is that absolute performance usually matters very little on the server side. This is just counter-intuitive for many game developers. For an average request to a server, the response-time difference between a server written in a slow language and one written in a fast language is under 1 ms; usually it's in microseconds. And when you factor in network and disk I/O latencies, it's white noise. That's why scaling on commodity hardware using productive languages is commonplace on the server side. The reason you don't see more productive languages used for highly concurrent work is not that they aren't performant enough; it's that almost all of them still have a GIL (global interpreter lock) that limits them to basically running on a single CPU in a single process. My favorite model now for being productive is to use the JVM, write in JVM languages such as JRuby or Clojure when possible, and drop down to Java only when I really need to.

 

For some of the specific techniques used in distributed systems, consistent hashing is a common tool. You can use it to spread workloads over a cluster; when a node goes down, its messages just get hashed to another node and things move on.
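A minimal sketch of a consistent hash ring (hand-rolled for illustration; a real deployment would lean on the platform's own sharding support): each node is placed at several points on a ring, a key maps to the next point clockwise, and removing a node only remaps the keys that lived on it.

import bisect
import hashlib

class HashRing:
    """Minimal consistent hash ring with virtual nodes."""
    def __init__(self, nodes, vnodes=64):
        self._ring = []   # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node, vnodes=64):
        for i in range(vnodes):
            self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        # Walk clockwise to the first point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["server-a", "server-b", "server-c"])
print(ring.node_for("player:1234"))   # route this player's messages to one node
ring.remove_node("server-b")          # only the keys that lived on server-b move
print(ring.node_for("player:1234"))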

 

Handling things like transactions is not difficult; I do it fairly often. I use an actor with an FSM, and it handles the negotiation between the parties. You write the code so all messages are idempotent, and from there it's straightforward.
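As a rough sketch of that shape (the message names and protocol here are hypothetical, and this is not a complete or fault-tolerant design): a dedicated trade actor walks a small state machine, asking each party to reserve its side of the exchange before committing, and releasing the reservations if anything fails.

from enum import Enum, auto

class TradeState(Enum):
    RESERVING_FUNDS = auto()
    RESERVING_ITEM = auto()
    DONE = auto()
    ABORTED = auto()

class TradeActor:
    """Toy FSM for a two-party trade; each send() would be an asynchronous actor message."""
    def __init__(self, trade_id, buyer, store, item_id, price, send):
        self.trade_id, self.buyer, self.store = trade_id, buyer, store
        self.item_id, self.price, self.send = item_id, price, send
        self.state = TradeState.RESERVING_FUNDS
        self.send(buyer, {"type": "reserve_funds", "trade": trade_id, "amount": price})

    def receive(self, msg):
        if msg.get("trade") != self.trade_id:
            return   # a duplicate or stale message for some other trade: ignore

        if self.state is TradeState.RESERVING_FUNDS and msg["type"] == "funds_reserved":
            self.state = TradeState.RESERVING_ITEM
            self.send(self.store, {"type": "reserve_item", "trade": self.trade_id,
                                   "item": self.item_id})
        elif self.state is TradeState.RESERVING_ITEM and msg["type"] == "item_reserved":
            # Both sides are reserved; commits (and releases) are idempotent, so resends are safe.
            self.send(self.buyer, {"type": "commit", "trade": self.trade_id})
            self.send(self.store, {"type": "commit", "trade": self.trade_id})
            self.state = TradeState.DONE
        elif msg["type"] in ("reserve_failed", "timeout"):
            self.send(self.buyer, {"type": "release", "trade": self.trade_id})
            self.send(self.store, {"type": "release", "trade": self.trade_id})
            self.state = TradeState.ABORTED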

 

Handling persistence in general is fairly straightforward in a distributed system. I use Akka a lot, and I basically have a large distributed memory store based on actors in a distributed hash ring, backed by NoSQL databases, with an optional write-behind cache in between. Because every unique key you store is mapped to a single actor, all updates to it are serialized. For atomic updates I use an approach that's similar to a stored procedure. Note that I didn't say this was necessarily easy: there are very few off-the-shelf solutions for stuff like this. You can find the tools to build it all, but you have to wire it up yourself.
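A minimal sketch of the write-behind part (the store interface is hypothetical): updates land in memory immediately and are flushed to the database asynchronously, so the hot path never waits on the database.

import threading
import time

class WriteBehindCache:
    """Toy write-behind cache: reads and writes hit memory; a background thread flushes dirty keys."""
    def __init__(self, db_save, flush_interval=0.5):
        self._data = {}
        self._dirty = set()
        self._lock = threading.Lock()
        self._db_save = db_save   # e.g. a function that writes one key to the backing database
        threading.Thread(target=self._flush_loop, args=(flush_interval,), daemon=True).start()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value
            self._dirty.add(key)          # mark for the next flush

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            with self._lock:
                dirty, self._dirty = self._dirty, set()
                snapshot = {k: self._data[k] for k in dirty}
            for key, value in snapshot.items():   # the DB write happens off the hot path
                self._db_save(key, value)

cache = WriteBehindCache(db_save=lambda k, v: print(f"flushing {k} -> {v}"))
cache.put("player:42:gold", 150)
time.sleep(1)   # give the background flusher a chance to run in this toy example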

 

Having worked on large games before, my recent experience with distributed systems has been very positive. A lot of it comes down to how concurrency is handled; having good abstractions for that in the actor model makes so many things simpler. That's not to say there are no challenges left. You hit walls with any system; I'm just hitting them much later now with almost everything.

hplus0603    11348

Distributed architectures are the right tool for the job. What I've seen is that the game industry is just not where most of the innovation in server-side architecture is happening, so it's a bit behind.


You are absolutely correct that game developers often don't even know that they don't know how to do business back ends effectively. Game technologies in general, and real-time simulations in particular, are like race cars running highly tuned engines on a highly controlled race track. Business systems are like trucks carrying freight over a large network of roads. Game developers, too often, try to carry freight on race cars.

Race cars are great at what they do, though. That's the bit that is fundamentally different from the kind of business-object, web architectures you are talking about. (And I extend that to not-just-HTTP; things like Storm or JMS or Scalding also fit in that mold.) When you say that "the optimization of a single server doesn't matter," it tells me that you've never tried to run a real physics simulation of thousands of entities, all of which can interact, in real time, at high frame rates. There are entire classes of gameplay and game feel that are available now, that were not available ten years ago, explicitly because you can cram more processing power and higher memory throughput into a single box. A network is, by comparison, a high-latency, low-bandwidth channel. If you're not interested in those kinds of games, then you wouldn't need to know about this, but the challenge we discuss here is games where this matters just as much as the ability to provide consistent long-term data services to lots of users.

I see, equally often, business-level engineers, and web engineers, who are very good at what they do, failing to realize the demands of game-type realtime interactive simulation. (Or, similarly, multimedia like audio -- check out the web audio spec for a real train wreck some time.)

A really successful, large-scale company has both truckers and race car drivers, and gets them to talk to each other and understand each other's unique challenges.

Kylotan    9875

This is basically the discussion that led me to post here - two sides both basically saying "I've done it successfully, and you can't really do it the way that the other people do it". Obviously this can't be entirely true. :) I suspect there is more to it.

 

Let me ask some more concrete questions.

 

 

 


No shared state, messages are immutable, and there is no reliability at the message or networking level.

 

Ok - but there is so much in game development that I can't imagine trying to code in this way. Say a player wants to buy an item from a store. The shared state model works like this:

BuyItem(player, store, itemID):
    Check player has enough funds, abort if not
    Check itemID exists in store inventory, abort if not
    Deduct funds from player
    Add funds to store
    Add instance of itemID to player inventory
    Commit player funds and inventory to DB
    Notify player client and any observing clients of purchase

This is literally a 7-line function if you have decent routines already set up. Let's say you have to do it in a message-passing way, where the store and player are potentially in different processes. What I see - assuming you have coroutines or some other syntactic sugar to make this look reasonably sequential rather than callback hell - is something like this:

BuyItem(player, store, itemID):
    Check player has enough funds, abort if not
    Ask store if it has itemID in store inventory. Wait for response.
    If store replied that it did not have the item in inventory:
        abort.
    Check player STILL has enough funds, abort if not
    Deduct funds from player
    Tell store to add funds. Wait for response.
    If store replied that it did not have the item in inventory:
        add funds to player
        abort
    Add instance of itemID to player inventory
    Commit player funds to DB
    Notify player client and any observing clients of purchase

This is at least 30% longer (not counting any extra code for the store) and has to include various extra error-checks, which are going to make things error-prone. I suspect it gets even more complex when you try and trade items in both directions because you need both sides to be able to place items in escrow before the exchange (whereas here, it was just the money).

 

So... is there an easier or safer way I could have written this? I wouldn't even attempt this in C++ - without coroutines it would be hard to maintain the state through the routine. I suppose some system that allows me to perform a rollback of an actor would simplify the error-handling but there are still more potential error cases than if you had access to both sides of the trade and could perform it atomically.

 

You talk about "using an actor with a FSM", but I can't imagine having to write an FSM for each of the states in the above interaction. Again, comparing that to a 7-line function, it's hard to justify in programmer time, even if it undoubtedly scales further. I appreciate something like Akka simplifies both the message-passing and the state machine aspects, so there is that - but it's still a fair bit of extra complexity, right? (Writing a message handler for each state, swapping the message handler each time, stashing messages for other states while you do so, etc.)

 

Maybe you can generalise a bit - eg. make all your buying/selling/giving/stealing into one single 'trade' operation? Then at least you're not writing unique code in each case.

 

As for "writing the code so all messages are idempotent" - is that truly practical? I mean, beyond the trivial but worthless case of attaching a unique ID to every message and checking that the message hasn't been already executed, of course. For example, take the trading code above - if one actor wants to send 10 gold pieces to another, how do you handle that in an idempotent way? You can't send "add 10 gold" because that will give you 20 if the message arrives twice. You can't send "set gold to 50" because you didn't know the actor had 40 gold in the first place.

 

Perhaps that is not the sort of operation you want to make idempotent, and instead have the persistent store treat it as a transaction. Fair enough, and the latency wouldn't matter if you only do this for things that don't occur hundreds of times per second and if your language makes it practical. (But maybe there aren't all that many such routines? The most common one is movement, and that is easily handled in an idempotent way, certainly.)

 

Forgive my ignorance if there is a simple and well-known answer to this problem; it's been a while since I examined distributed systems on an academic level.


ApochPiQ    23005
There's a lot of confusion and mistruth surrounding the way successful MMOs tend to be implemented. This can be attributed to any number of causes, but I think the bottom line is twofold:

1. Successful MMO developers know a lot more about distributed scale than the "hrrr drrr web-scale" crowd tends to realize.
2. Successful MMO developers rarely divulge all the secrets to their success. This feeds into Point 1.


These are hard problems, no doubt. It takes truly excellent engineering to solve them. People who claim to have found solutions but can't point to shipped and operational code are almost certainly in for a rude surprise when they actually try and put millions of players on their systems.

Scale is a very nonlinear thing. A lot of people intuit scale as being roughly linear... a hundred people is twice as many as fifty people, right? The fact is, you can get truly bizarre behavior with high-scalability systems. You might go from thousands of connections and 2% CPU usage and 90% free RAM to all of your hardware deadlocked and overheating by just adding a few dozen connections. You might get things that run faster by some metrics when loaded up with players. And so on.

Scalable distributed software is like quantum physics. If you think you can intuit what's going on, you're fucking delusional, period. This stuff is notoriously difficult and messy.


To (sort of, not really) answer the concrete questions that were posed earlier...

Yes, you need to make virtually all messages and transactions idempotent. The exceptions are fairly boring cruft, like pings and keep-alives and whatnot. But be careful if you assume that a ping/keep-alive isn't going to become a potential source of performance pain.
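One common way to get there for something like the gold-transfer case raised above (a sketch of the general technique, not a claim about any particular engine): the sender attaches a unique transaction ID to each transfer, and the receiving side remembers which IDs it has already applied, so a re-delivered message becomes a no-op.

import uuid

class Wallet:
    """Applies transfer messages idempotently by remembering processed transaction IDs."""
    def __init__(self, gold=0):
        self.gold = gold
        self._applied = set()   # a real system would expire or persist old IDs

    def apply_transfer(self, msg):
        # msg = {"txn_id": ..., "amount": ...}; duplicate deliveries are detected and ignored.
        if msg["txn_id"] in self._applied:
            return self.gold
        self._applied.add(msg["txn_id"])
        self.gold += msg["amount"]
        return self.gold

wallet = Wallet(gold=40)
transfer = {"txn_id": str(uuid.uuid4()), "amount": 10}
wallet.apply_transfer(transfer)
wallet.apply_transfer(transfer)   # duplicate delivery: still 50, not 60
print(wallet.gold)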

And no, you don't need to write a huge amount of extra code. Much of it can be abstracted and factored into reusable components fairly easily. You only need to write three-phase commit once, and then apply that routine to your transactions going forward; and so on. If you design carefully, you can eliminate a lot of the redundant cruft, and most of your code reads like a DSL that speaks in terms of messages and transactions.

That said, it does take writing it the hard way once or twice to learn where you can refactor effectively; I don't know of any shortcuts to that.

snacktime    452

So... is there an easier or safer way I could have written this? I wouldn't even attempt this in C++ - without coroutines it would be hard to maintain the state through the routine.

As for "writing the code so all messages are idempotent" - is that truly practical? For example, take the trading code above - if one actor wants to send 10 gold pieces to another, how do you handle that in an idempotent way? You can't send "add 10 gold" because that will give you 20 if the message arrives twice. You can't send "set gold to 50" because you didn't know the actor had 40 gold in the first place.

 

Handling something like a transaction is really not that different in a distributed system. All network applications deal with unreliable messaging; reliability and sequencing have to be added in somewhere. Modern approaches put them at the layer that defined the need in the first place, as opposed to putting them into a subsystem and relying on it for higher-level needs, which is just a leaky abstraction and an accident waiting to happen.

 

If a client wants to send 10 gold to someone and sends a request to do that, the client has no way of knowing if the request was processed correctly without an ack.  But the ack can be lost, so the situation where you might have to resend the same request is present in all networked applications.

 

Blocking vs. non-blocking is mostly an issue that comes up at scale. For things that don't happen 100,000 times per second, you don't need to go out of your way to ensure that nothing blocks.

 

As for FSMs, I use them a lot, but mostly because I can write them in Ruby, and I find Ruby DSLs easy to read and maintain. That Akka state-machine stuff I don't like and have never used; it seems a lot more awkward than it needs to be. Things like purchasing items are not in the hot path, so I can afford to use more expressive languages for stuff like that.

Kylotan    9875


1. Successful MMO developers know a lot more about distributed scale than the "hrrr drrr web-scale" crowd tends to realize.
2. Successful MMO developers rarely divulge all the secrets to their success. This feeds into Point 1.

 

And yet, pretty much every published example of MMO scaling seems to focus on the old-school methods. You'd think, given how much has been said on the matter, that there would be at least one instance of people talking about using different methods, but I've not seen one. I was hoping someone on this thread would be able to point me in the right direction. Instead, I'm in much the same position as I was before I posted - people insist that newer methods are being used, but provide no citations. :)

 


Yes, you need to make virtually all messages and transactions idempotent.

 

I'd love to see an example of how to do this, given that many operations are most naturally expressed as changes relative to a previous state (which may not be known). I assume there is literature on this but I can't find it.

 


You only need to write three-phase commit once, and then apply that routine to your transactions going forward; and so on.

 

I suspected this might be the case but I am sceptical about the overhead in both latency and complexity. But then I don't have any firm evidence for either. :)

Kylotan    9875

If a client wants to send 10 gold to someone and sends a request to do that, the client has no way of knowing if the request was processed correctly without an ack. But the ack can be lost, so the situation where you might have to resend the same request is present in all networked applications.

 

I think we're crossing wires a bit here. Reliable messaging is a trivial problem to solve (TCP, or a layer over UDP), and thus it is easy to know that either (a) the request was processed correctly, or will be at some time in the very near future, or (b) the other process has terminated, and thus all bets are off. It's not clear why you need application-level re-transmission. But even that's assuming a multiple-server approach - in a single game server approach, this issue never arises at all - there is a variable in memory that contains the current quantity of gold and you just increment that variable with a 100% success rate. Multiple objects? No problem - just modify each of them before your routine is done.

 

What you're saying is that you willingly forgo those simple guarantees in order to pursue a different approach, one which scales to higher throughput. That's fine, but these are new problems, unique to that way of working, not intrinsic to the 'business logic' at all. With two objects co-located in one process you get atomicity, consistency, and isolation for free, and you delegate durability to your DB as a high-latency background task.


snacktime    452

 


Reliable messaging is a trivial problem to solve (TCP, or a layer over UDP), and thus it is easy to know that either (a) the request was processed correctly, or will be at some time in the very near future, or (b) the other process has terminated, and thus all bets are off. It's not clear why you need application-level re-transmission. [...] With two objects co-located in one process you get atomicity, consistency, and isolation for free, and you delegate durability to your DB as a high-latency background task.

 

 

So this is an interesting topic, actually. The trend is to move reliability back up to the layer that defined the need in the first place, instead of relying on a subsystem to provide it.

 

Just because the network layer guarantees that the packets arrive doesn't mean they get delivered to the business logic correctly, or processed correctly. If you think 'reliable' UDP or TCP makes your system reliable, you are lying to yourself.

 

http://www.infoq.com/articles/no-reliable-messaging

http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.txt

http://doc.akka.io/docs/akka/2.3.3/general/message-delivery-reliability.html

hplus0603    11348
First: Try writing a robust real-time physics engine that can support even a small number like 200 players with vehicles on top of the web architecture that @snacktime and @VFe describe. I don't think that's the right tool for the job.

Second:

You'd think that, given how much has been said on the matter, that there would be at least one instance of people talking about using different methods, but I've not seen one.


Personally, I've actually talked a lot about this in this very forum over the last ten to fifteen years. For reference, the first one I worked on was There.com, which was a full-on single-instance, physically-based virtual world. It supported full client- and server-side physics rewind; a procedural-and-customized plane the size of Earth; fully customizable/composable avatars with user-generated-content commerce; in-world voice chat; vehicles ridable by multiple players; and a lot of other things; timeframe about 2001. The second one (where I still work) is IMVU.com, where we eschew physics in the current "room based" experience because it's so messy. IMVU.com is written almost entirely on top of web architecture for all the transactional stuff, and on top of a custom low-latency ephemeral message queue (written in Erlang) for the real-time stuff. Most of that is sporadically documented in the engineering blog: http://engineering.imvu.com/

fir    460

Cool thread, though I feel too incompetent to say much here.

Can someone tell me what ping times have to do with this? Are ping times the source of most of the trouble, or is the problem something different? Is there any prospect of ping times dropping noticeably in the future? Can someone say what their typical range is in today's world? And how responsive is an average connection between a player and a game server, i.e. the time for player -> server, plus processing, plus the time to send the response back?

hplus0603    11348

Ping times are a source of complexity and gameplay challenges, but they are not a source of scalability problems.

 

Ping times for wired connections will not drop dramatically in the future, because they are bound by the speed of light -- current internet is already within a factor of 50% of the speed of light, so the maximum possible gains are quite well bounded.
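A rough worked example of that bound (figures are illustrative): light in optical fibre travels at roughly two-thirds of c, about 200,000 km/s. London to New York is on the order of 5,600 km, so the one-way propagation floor is about 5,600 / 200,000 ≈ 28 ms, or roughly 56 ms round trip, before any routing, queuing, or processing delay is added; real-world pings on that route are already within a small factor of that figure.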

fir    460

Ping times are a source of complexity and gameplay challenges, but they are not a source of scalability problems.

 

Ping times for wired connections will not drop dramatically in the future, because they are bound by the speed of light -- current internet is already within a factor of 50% of the speed of light, so the maximum possible gains are quite well bounded.

 

Do relativity effects show up in the gameplay? ;/

I understand that the way these problems "show up" is in the response delay the server gives to the client, and the art is to keep it below some threshold. Does anyone know what that threshold is (is it the sum of the time to send info to the server + server processing time + the time to send the response back to the client?), and what values make a game feel really fine?

(Sorry for the basic questions in a more advanced thread, but since I have the opportunity I would like to understand things a bit; maybe some thoughts will come of it. ;/)

Waterlimon    4398

 

Does anyone know what that threshold is (is it the sum of the time to send info to the server + server processing time + the time to send the response back to the client?), and what values make a game feel really fine?

 

 

Since there is always a delay in information passing through the server (the ping), in order not to show it to the player, you need to either:

-Show the server's answer to the player after a set time X.

**Play some animation of length X before showing the player whether they succeeded in something or not (which is dictated by the server).

**The player cannot tell the lag, because the only way to tell lag is the time it takes for the server to answer, and we have hidden that information.

**This breaks down if the ping gets higher than X.

-Predict the server's answer accurately.

**This can probably never work to 100% accuracy, but it can be used to hide most of the lag resulting from the ping.

**E.g. predict the actions of other players at the current time, even though you cannot know them until X seconds later, because this info comes from the server (and thus has delay).

**You can also predict the result of your own actions if an authoritative response is required from the server. E.g. if you open a chest, the client can ASSUME that there's nothing in there (with the client's luck), which is most often correct, to hide the ping; but this might be wrong, and then it has to be corrected when the 'real' information is available.

So you can either hide the lag of obtaining information, or predict the information before obtaining it.
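A minimal sketch of the prediction-and-correction idea (a hypothetical structure, not any specific engine's API): the client applies its own inputs immediately, remembers them, and when an authoritative server state arrives it rewinds to that state and replays the inputs the server has not yet acknowledged.

class PredictingClient:
    """Toy client-side prediction with server reconciliation, for 1D movement."""
    def __init__(self):
        self.position = 0.0
        self.pending = []   # inputs sent to the server but not yet acknowledged

    def apply_input(self, seq, dx):
        # Predict locally right away so the player never waits a full round trip.
        self.position += dx
        self.pending.append((seq, dx))

    def on_server_state(self, last_acked_seq, server_position):
        # An authoritative (but slightly old) state arrives: rewind to it...
        self.position = server_position
        # ...drop the acknowledged inputs, and replay the rest on top.
        self.pending = [(s, dx) for s, dx in self.pending if s > last_acked_seq]
        for _, dx in self.pending:
            self.position += dx

client = PredictingClient()
client.apply_input(seq=1, dx=1.0)
client.apply_input(seq=2, dx=1.0)
client.on_server_state(last_acked_seq=1, server_position=0.9)   # server disagreed slightly
print(client.position)   # 1.9: corrected server position plus the unacknowledged input replayed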

Tribad    981

 
Do relativity effects show up in the gameplay? ;/

Absolutely. Mostly because the mass increase of the data packets starts to become significant at about one third of light speed. If you throw a stone at another player it will magically produce more hit points. This is why many people relocate to countries closer to the game servers.

fir    460

@up

I'm curious what the response times of today's internet infrastructure are.

Assume this model: a world with 1,000 players on it.

There is a NOW frame, where everything on the map is perfectly consistent. Each player moves and sends their new position to the server; these take varying times to arrive, say +10, +30, +70 ms (I don't know the real numbers). When the server has received the last position it can compute the FUTURE frame (also perfectly consistent), then send this future state back to the players.

Then we can do it again.

(This is a kind of imaginary model of mine, but it could probably be realized.)

This way of working will "pulse" at the frequency of the laggiest player, so let's take the approach that when the laggiest players fall behind some threshold, we throw them out of the game.

I wonder what threshold in milliseconds could be set that would keep at least half of the fastest connections alive? (I have no idea, as I don't program network code or even play network games.)

It would be interesting to estimate this time. I worry that if each sending time has some variation (I mean something like a Gaussian, 10 ± 300 ms), this connection-killing approach would kill off most of the players - but I would very much like to get some idea of how many players would survive such a test.

(I know games use techniques that mask delays and so on, but I'd be curious how long it would take in the raw, ideal state of things.)

Does anyone have, as they say, some "intuition" about how many players would stay alive at which delay threshold here?


