• Announcements

    • khawk

      Download the Game Design and Indie Game Marketing Freebook   07/19/17

      GameDev.net and CRC Press have teamed up to bring a free ebook of content curated from top titles published by CRC Press. The freebook, Practices of Game Design & Indie Game Marketing, includes chapters from The Art of Game Design: A Book of Lenses, A Practical Guide to Indie Game Marketing, and An Architectural Approach to Level Design. The GameDev.net FreeBook is relevant to game designers, developers, and those interested in learning more about the challenges in game development. We know game development can be a tough discipline and business, so we picked several chapters from CRC Press titles that we thought would be of interest to you, the GameDev.net audience, in your journey to design, develop, and market your next game. The free ebook is available through CRC Press by clicking here. The Curated Books The Art of Game Design: A Book of Lenses, Second Edition, by Jesse Schell Presents 100+ sets of questions, or different lenses, for viewing a game’s design, encompassing diverse fields such as psychology, architecture, music, film, software engineering, theme park design, mathematics, anthropology, and more. Written by one of the world's top game designers, this book describes the deepest and most fundamental principles of game design, demonstrating how tactics used in board, card, and athletic games also work in video games. It provides practical instruction on creating world-class games that will be played again and again. View it here. A Practical Guide to Indie Game Marketing, by Joel Dreskin Marketing is an essential but too frequently overlooked or minimized component of the release plan for indie games. A Practical Guide to Indie Game Marketing provides you with the tools needed to build visibility and sell your indie games. With special focus on those developers with small budgets and limited staff and resources, this book is packed with tangible recommendations and techniques that you can put to use immediately. As a seasoned professional of the indie game arena, author Joel Dreskin gives you insight into practical, real-world experiences of marketing numerous successful games and also provides stories of the failures. View it here. An Architectural Approach to Level Design This is one of the first books to integrate architectural and spatial design theory with the field of level design. The book presents architectural techniques and theories for level designers to use in their own work. It connects architecture and level design in different ways that address the practical elements of how designers construct space and the experiential elements of how and why humans interact with this space. Throughout the text, readers learn skills for spatial layout, evoking emotion through gamespaces, and creating better levels through architectural theory. View it here. Learn more and download the ebook by clicking here. Did you know? GameDev.net and CRC Press also recently teamed up to bring GDNet+ Members up to a 20% discount on all CRC Press books. Learn more about this and other benefits here.
Sign in to follow this  
Followers 0
pronpu

How Galaxy, an In-Memory Data Grid for MMO Games, Works

15 posts in this topic

Hi.
A [url="http://blog.paralleluniverse.co/post/28062434301/galaxy-internals-part-1"]new blog post[/url] explains the internal workings of [url="http://puniverse.github.com/galaxy/"]Galaxy[/url], an open-source, in-memory data grid for MMO games. Edited by pronpu
0

Share this post


Link to post
Share on other sites
Has to handle transaction rollback to prevent deadlock/livelock.

Failure protection via master-node redundancy and cohesive sync imaging are needed as well.

Row sizing policy for different data types to give the system a hint on how large to make the 'cache rows' also would be a useful addition.
0

Share this post


Link to post
Share on other sites
Maybe I'm getting something wrong, but I don't understand the scalability claims. The way the system relies on broadcasts both to enforce ordering and to fetch cache lines means that it won't scale better than your network. It looks to me as if you have moved the serialization bottleneck of a single CPU/memory bus, into a single Ethernet broadcast network.

Unfortunately, Ethernet networks are slower than memory busses. I would expect that, for data limited workloads, the system you propose will top out SOONER than an equivalent system built using unified memory.

This might be very similar to the problems that hampered and ultimately killed Project Darkstar, and that have prevented most Tuple Space implementations from gaining much traction.

I may be missing something though. Do you know of a real-world system using a similar mechanism that has actually scaled well? What previous experience made you build the system the way it is now?

0

Share this post


Link to post
Share on other sites
[quote name='wodinoneeye' timestamp='1343364708' post='4963512']
Has to handle transaction rollback to prevent deadlock/livelock.
[/quote]

Done. See [url="http://puniverse.github.com/galaxy/manual/api/api-store.html#transactions"]docs[/url].

[quote name='wodinoneeye' timestamp='1343364708' post='4963512']
Failure protection via master-node redundancy and cohesive sync imaging are needed as well.
[/quote]

Done. See this [url="http://blog.paralleluniverse.co/post/28635713418/how-galaxy-handles-failures"]blog post[/url].

[quote name='wodinoneeye' timestamp='1343364708' post='4963512']
Row sizing policy for different data types to give the system a hint on how large to make the 'cache rows' also would be a useful addition.
[/quote]

This is not required as cache rows are dynamically sized (each row contains one object).
0

Share this post


Link to post
Share on other sites
[quote name='hplus0603' timestamp='1343382753' post='4963583']
Maybe I'm getting something wrong, but I don't understand the scalability claims. The way the system relies on broadcasts both to enforce ordering and to fetch cache lines means that it won't scale better than your network. It looks to me as if you have moved the serialization bottleneck of a single CPU/memory bus, into a single Ethernet broadcast network.
[/quote]

That's not the way galaxy works. Broadcasts are very rare, and are only required the first time a line is requested.

[quote name='hplus0603' timestamp='1343382753' post='4963583']
This might be very similar to the problems that hampered and ultimately killed Project Darkstar, and that have prevented most Tuple Space implementations from gaining much traction.
[/quote]

Project Darkstar never got to the point of supporting multiple nodes. Galaxy was designed precisely to target issues with tuple-spaces. Tuple spaces use a distributed-hash-table requiring 1 network hop in the common case (and in the worst case). Galaxy requires 0 network hops in the common case (for the price of more than one network hops worst-case).

See [url="http://blog.paralleluniverse.co/post/26909672264/on-distributed-memory"]here[/url] for the rational behind Galaxy's architecture.

Note that Galaxy is just a distribution layer. Its real power comes from distributed data structures built on top of it. I will soon write a blog post at highscalability.com analyzing the efficiency of a well-behaving Galaxy data-structure.
0

Share this post


Link to post
Share on other sites
[quote name='pronpu' timestamp='1344504691' post='4967691']That's not the way galaxy works. Broadcasts are very rare, and are only required the first time a line is requested.
[/quote]

Evenso, Ethernet is slower than RAM.

[quote]Project Darkstar never got to the point of supporting multiple nodes.[/quote]

Actually, they did, but the scalability coefficient was negative. One of the reasons why I think trying to run "shared state" systems over networks is almost always the wrong solution.

0

Share this post


Link to post
Share on other sites
[quote name='hplus0603' timestamp='1344527262' post='4967820']
Evenso, Ethernet is slower than RAM.
[/quote]

I'm not quite sure what you mean. The whole point behind Galaxy is [i]reducing[/i] network access and making sure most operations access RAM only. Galaxy is meant to use the network [i]less[/i] than other IMDGs/centralized DBs, not more.

[quote name='hplus0603' timestamp='1344527262' post='4967820']
Actually, they did, but the scalability coefficient was negative.
[/quote]

I don't want this to turn into a factual argument, but this is simply not true. They did run a simple experiment where they used their single-node logic in a multi-node setting (I think simply by accessing a single BerkeleyDB instance without any caching) and got negative scalability as expected. Full multi-node operation, however, was never implemented. Here's a question asked on the Red Dwarf (Darkstar's successor after Oracle had shut down Darkstar) forums on Mar 29, 2011, and answered by cyberqat, Darkstar's lead developer:
Q: Is it possible with RedDwarf server part to connect multiple servers, so that they work as a single logical unit ? I think its about clustering and load balancing but I'm not sure...
A: What you are referring to is "multi-node" operation. This is designed and intended, but not yet fully implemented.

[quote name='hplus0603' timestamp='1344527262' post='4967820']
One of the reasons why I think trying to run "shared state" systems over networks is almost always the wrong solution.
[/quote]

That's the big question. The first thing to understand, though, is what is meant by "shared state". After all, all shared state solutions, even the shared RAM inside a single machine, are implemented using message passing. Messages are always sent as a result of contention, so contention always leads to communication which increases latency. If you have a conceptual shared state where no contention ever occurs (suppose you have a world divided into 4 quadrants distributed to 4 machines, but, by pure chance, no object ever crosses the quadrant borders and objects across quadrants don't see or affect each other), then no messages will be ever passed and you'll have perfect linear scalability. On the other hand, if you ever need to scale any data processing (be it a game or any other application) beyond the capabilities of a single node, than the problem domain may absolutely require message passing if more than one node would ever require accessing the same data item. So contention can sometimes never be eliminated fully, simply by the nature of your problem.
Now, whether you use "message passing" or "shared state", contention can and will occur, and will invariably increase latency. It may be true that if you use the shared-state concept to model your problem, you may be drawn to increase contention simply because it is better hidden by the model than in the message-passing idiom. However, I'd like to claim that for some problems, the concept of shared state is much more natural. If you have one big game-world with lots of objects, that simply cannot be simulated by a single machine, you will need to break it up into pieces somehow, and being able to still treat it as a single world will be quite natural. In order to scale, though, you will still need to reduce contention. If you build your distributed world-data-structure just right, Galaxy will help you reduce contention, and decrease the number of messages passed among nodes, so that nearly all operations are executed on a single node. Edited by pronpu
0

Share this post


Link to post
Share on other sites
[quote name='pronpu' timestamp='1344544255' post='4967904']
[quote name='hplus0603' timestamp='1344527262' post='4967820']
Evenso, Ethernet is slower than RAM.
[/quote]

I'm not quite sure what you mean. The whole point behind Galaxy is [i]reducing[/i] network access and making sure most operations access RAM only. Galaxy is meant to use the network [i]less[/i] than other IMDGs/centralized DBs, not more.
[/quote]

My point is that IMDGs and centralized DBs is the wrong approach. You want to physically shard in some smart way, and reduce interactions between shards, for actual scalability.

[quote][quote name='hplus0603' timestamp='1344527262' post='4967820']
Actually, they did, but the scalability coefficient was negative.
[/quote]

I don't want this to turn into a factual argument, but this is simply not true. They did run a simple experiment
[/quote]

Interesting! That goes against what Jeff what's-his-face told me in person at GDC a few years ago, and also said publicly in their forums. I'm not surprised at all, though, as "Darkstar" seems to have been 90% marketing fluff anyway.

[quote]If you build your distributed world-data-structure just right, Galaxy will help you reduce contention, and decrease the number of messages passed among nodes, so that nearly all operations are executed on a single node.
[/quote]

That sounds like a very reasonable and balanced statement that I can believe!
0

Share this post


Link to post
Share on other sites
[quote name='hplus0603' timestamp='1344586520' post='4968006']
You want to physically shard in some smart way, and reduce interactions between shards, for actual scalability.
[/quote]

Well, I believe that sufficiently advanced technology can both automate manual tasks, making them easier [i]and[/i] doing a better job in most cases, while also making what's been previously impossible or extremely difficult, quite possible if not easy. Otherwise, what's the use of computer science? Edited by pronpu
0

Share this post


Link to post
Share on other sites
The problem, as I see it, is that, with physical sharding, you end up forcing interactions onto a RAM bus, which is fast.
With "object soup" -- multiple overlying objects realms, or a big shared state domain -- at least some of the interactions will be over a network, and a network is 100x slower than a memory bus, or worse.
Thus, if only 1% of your interactions are across a network, you become network limited instead of RAM limited. For some topologies, the allowable interaction fraction is even smaller, such as 0.1%.

Now, consider a "radar" that wants to scan all objects within a particular geographic area. Suppose that each player has a radar. This means that you will, by necessity, transfer information about most players, to most players. This means that your network interaction fraction will quite possibly go over the 1% limit, and thus your overall networked system is more constrained than if you had just fit it all into RAM. There are many other use cases where you get similar interactions; radar is just the most obvious.

You can work around this, by building a "radar manager" that collects local radar information into a single data item, that is then shared with others, so you scale by O(radar managers) instead of O(players) -- in effect, you're micro-sharding the particular use case of radars. Then you end up with inefficiencies in the amount of data shared, because the memory-optimizing systems generally aren't distributed according to locality. After a while of doing this for each and every kind of subsystem, though, you start longing for the straightforward days of physical sharding :-)

A system for distributed, scalable interactions I like better, is one where you are only allowed to depend on the state of other objects from one step ago. This means that everyone simulates their next step based on the previous state of anything they interact with. Then, state is exchanged between entities that depend on other entities (this is a networked graph operation.) This still may become network limited on the amount of object state, but the limitation is very manageable, because you can manage exactly what goes where, when, and it's done at a very precise and well-defined point in time (and space).

Anyway, I look forward to Galaxy going down this discovery path, and coming up with "just right" data structures to minimize the cost of network limitations, and I look forward to seeing each step shared on this forum :-)
0

Share this post


Link to post
Share on other sites
[quote name='hplus0603' timestamp='1344684062' post='4968385']
Thus, if only 1% of your interactions are across a network, you become network limited instead of RAM limited. For some topologies, the allowable interaction fraction is even smaller, such as 0.1%.
[/quote]

I wrote a [url="http://highscalability.com/blog/2012/8/20/the-performance-of-distributed-data-structures-running-on-a.html"]blog post on highscalability.com[/url] today analyzing the performance of a data structure distributed on top of Galaxy. It proves that if you use a data structure that fits well with Galaxy's design, you'll get far, far fewer than 0.1% of the transactions would require any network hops.
0

Share this post


Link to post
Share on other sites
[quote] if you use a data structure that fits well with Galaxy's design[/quote]

I also have no argument with that. The question is whether you can implement the actual gameplay you want with that data structure.
0

Share this post


Link to post
Share on other sites
@pronpu, great post at HS. I have one question though - many scenarios would require dealing with border objects (items that are in one node, but they are spatially close to items in another node so as to cause interaction) so won't that require network IOs? Even if we assume those type of items are shared across the two nodes, updating them frequently would result in network IO isn't it? Edited by andyaustin
0

Share this post


Link to post
Share on other sites
[quote name='andyaustin' timestamp='1345582213' post='4971972']
updating them frequently would result in network IO isn't it?
[/quote]

Well, after an object is updated (once or multiple time), a read from any other node would require a network roundtrip. Just items close to one another "across the border" won't cause network IO, even if clients need to see items on both sides. You can just send the data to the client from both nodes.
It's only when the items interact that inter-node IO would be required. Now, if one node initiates a transaction, all items in that transactions are transferred to that node (to be "owned" by it), and then they stay there. So, for example, if a character stands right on the border and touches an object on side, that object will be transferred to the character's node; if the character then touches an object on the other side, this, too, will be migrated to the same node, and any further interaction with those two objects will no longer require network IO.

So, potentially you'll have a problem if two characters stand on either side of the border, kicking, say, a ball between them back and forth. But how fast could that happen? Would your game allow them to pass the ball hundreds of time a second?
0

Share this post


Link to post
Share on other sites
[quote name='pronpu' timestamp='1345683489' post='4972439']
[quote name='andyaustin' timestamp='1345582213' post='4971972']
updating them frequently would result in network IO isn't it?
[/quote]

Well, after an object is updated (once or multiple time), a read from any other node would require a network roundtrip. Just items close to one another "across the border" won't cause network IO, even if clients need to see items on both sides. You can just send the data to the client from both nodes.
It's only when the items interact that inter-node IO would be required. Now, if one node initiates a transaction, all items in that transactions are transferred to that node (to be "owned" by it), and then they stay there. So, for example, if a character stands right on the border and touches an object on side, that object will be transferred to the character's node; if the character then touches an object on the other side, this, too, will be migrated to the same node, and any further interaction with those two objects will no longer require network IO.

So, potentially you'll have a problem if two characters stand on either side of the border, kicking, say, a ball between them back and forth. But how fast could that happen? Would your game allow them to pass the ball hundreds of time a second?
[/quote]
Thanks for the explanation, pronpu! I'm thinking in the lines of a fast paced game (let's say it is similar to Quake, but not exactly a FPS). If for some reason two players are handled by two nodes, and they need to know about the other's position (they both are running around so their positions update many times per second, and the game's engine needs to do some processing on these objects), then with Galaxy the nodes would need to read data from each other everytime the position changes if I understand it correctly. I was wondering if there are some standard techniques to reduce this communication.
0

Share this post


Link to post
Share on other sites
[quote name='andyaustin' timestamp='1345694030' post='4972473']
If for some reason two players are handled by two nodes, and they need to know about the other's position... then with Galaxy the nodes would need to read data from each other everytime the position changes if I understand it correctly. I was wondering if there are some standard techniques to reduce this communication.
[/quote]

Well, if the two characters just [i]see[/i] each other, then there's no reason why each wouldn't get updates about the other from the other's node. If they [i]interact[/i] then the way Galaxy works, after the first interaction they would both be handled on the same node. So the scenario you describe could really only happen if the two interact indirectly, say by passing a ball between them back and forth. In that case, the ball would migrate from one node to the other and back. That would entail one network roundtrip for each pass. But how fast could the two characters pass the bal between them? Edited by pronpu
0

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0