ZeroMQ in games?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

9 replies to this topic

#1 magicstix   Members   -  Reputation: 191

Posted 11 December 2011 - 07:19 PM

Has anyone here ever worked with ZeroMQ as a networking layer in their game?

If so, I'm curious about your thoughts on the experience. We've been considering switching from MPI to 0MQ at work for our large-scale cluster simulations, but from reading up on the API I'm thinking it might also make a good networking layer for games.

#2 samoth   Crossbones+   -  Reputation: 5859

Posted 12 December 2011 - 02:58 AM

I found ZMQ very interesting at first, but apart from the annoyingly cool documentation (it's so cool, it's almost sub-zero), its rigid patterns also always happen to do almost exactly what you want, just never exactly. Your mileage may vary, but for me it's not particularly useful.

#3 AllEightUp   Moderators   -  Reputation: 4638

Posted 12 December 2011 - 07:18 AM

I have used ZMQ for internal tools which process millions of pieces of data, but that is as far as I would go with it. Unfortunately, with missing items such as the ability to determine when a connect or disconnect happens, the source of a connection, the reason for a disconnect, etc., there are simply too many limitations for a more dynamic environment. As a quick testbed for various ideas it is very useful though, and also for massive scaling of tools and such. For instance, it was a very effective solution for scaling up an asset dependency scanner which runs on a mammoth box. At one point it took 8 hours to do its job; it currently runs in under 6 minutes by bringing up multiple processes with multiple threads in each process, and ZMQ was quite effective in making this work. Though, to be honest, even there the rigid structure of ZMQ made some things which should have been very simple a pain in the butt.

So, for dynamic systems I don't really suggest it.

#4 fholm   Members   -  Reputation: 262

Posted 12 December 2011 - 02:17 PM

I've been investigating ZMQ for communicating between zones and nodes in my server. I just started on the implementation today, and so far the features seem great. For the client<->server communication I'm sticking with a standard UDP library, though.

#5 AllEightUp   Moderators   -  Reputation: 4638

Posted 12 December 2011 - 08:45 PM

Don't get me wrong, the basic design ideas are very solid. The problem is when you start adding game-related requirements such as bringing up unique instanced data, dealing with instance failures, kicking folks back to entry with an "oops, it crashed" message, etc. Due to the desire to hide connection details, ZMQ is unfortunately a real pain in the ass in these areas. If I wanted to write a network layer, I'd start with the ZMQ model at the low level, but I would add the things they are most definitely opposed to supporting. Long story short: love the concepts, hate the idealistically imposed limitations.

#6 fholm   Members   -  Reputation: 262

Posted 13 December 2011 - 04:22 AM


I can definitely see the problem of using ZMQ as a Client<->Server library, with its "connection-less" API. But I don't see the inherent problems with using it as a library for inter-server communication, like when transferring players from one server to another, etc. This doesn't really have to be aware of any connections: since you trust the environment, servers can just send their "id", or whatever is used to identify them, as the header of any message.

I also really like that ZMQ can "bind" and "connect" any socket, so you don't have to "bind" the "hosting" socket and "connect" the "connecting" socket. Basically, the model I have right now is insanely simple:
  • Each node in the cluster has two sockets, one SUB (incoming) and one PUB (outgoing), which are bound to two pre-defined ports based on the node number
  • When a node needs to send data to another node, it opens a PUB socket directly to that node's SUB socket
  • When a node needs to receive data from another node, it attaches its own SUB socket to that node's PUB socket.
I can't really see any inherent drawbacks in this, but I'm sure someone will point them out ;p
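
For readers who want to picture the port scheme in the list above, here is a minimal sketch in plain Python. The base port, the "two consecutive ports per node" rule, and the `NodeEndpoints` name are assumptions made for illustration; the post only says the ports are pre-defined and derived from the node number.

```python
from dataclasses import dataclass

BASE_PORT = 5000  # hypothetical starting port, not taken from the post


@dataclass(frozen=True)
class NodeEndpoints:
    """Derives the two well-known endpoints of a node from its number."""
    node_id: int
    host: str

    @property
    def pub(self) -> str:
        # Even offset: the node's outgoing (PUB) endpoint.
        return f"tcp://{self.host}:{BASE_PORT + 2 * self.node_id}"

    @property
    def sub(self) -> str:
        # Odd offset: the node's incoming (SUB) endpoint.
        return f"tcp://{self.host}:{BASE_PORT + 2 * self.node_id + 1}"


node3 = NodeEndpoints(node_id=3, host="10.0.0.3")
print(node3.pub)  # address a peer's SUB socket would attach to
print(node3.sub)  # address a peer's PUB socket would be opened against
```

Because every address can be computed from the node number alone, no node needs a directory service to find another; the cost is that the node numbering has to stay stable.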

#7 AllEightUp   Moderators   -  Reputation: 4638

Posted 19 December 2011 - 08:23 PM



It is a workable solution, but the downside is how ZMQ works. I believe that with a non-UDP PUB socket, if nothing is listening, eventually the write will block because nothing has been pulling data off the socket. I know this is the case for the PUSH/PULL sockets but could be wrong about PUB; I didn't find PUB as interesting for my needs. Anyway, this causes a single failure to start a chain reaction, and the entire system comes to a screeching halt as data gets backed up and the write operations start blocking. So the real problem lies in the underlying "hide the faults" methodology, because it doesn't allow for fault tolerance in the system very well. So, while I used this for a couple of items and it is fairly provably high performance, I can't really suggest it for larger systems until they stop hiding important events. For the systems they are tuning it for, it seems like a great solution; it just didn't work for my more dynamic desires, unfortunately.
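
The back-pressure effect described above can be sketched without any networking at all. The snippet below uses a bounded `queue.Queue` from the Python standard library as a stand-in for a socket's send buffer; it is not ZMQ code and makes no claim about how any particular ZMQ socket type behaves. The buffer size and the `try_send` helper are made up for the example.

```python
import queue


def try_send(outbox: queue.Queue, message: bytes) -> bool:
    """Non-blocking send: report failure instead of stalling the caller."""
    try:
        outbox.put_nowait(message)
        return True
    except queue.Full:
        # A blocking put() would hang here until a consumer drains the queue,
        # which is how one dead peer can stall everything upstream of it.
        return False


outbox = queue.Queue(maxsize=3)  # stand-in for a bounded send buffer
results = [try_send(outbox, b"tick %d" % i) for i in range(5)]
print(results)  # the first three sends fit, the rest are refused
```

The point of returning `False` rather than blocking is that the sender gets to decide what a full buffer means (drop, retry later, declare the peer dead) instead of having that decision hidden from it.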

#8 samoth   Crossbones+   -  Reputation: 5859

Posted 20 December 2011 - 05:24 AM

I've been investigating ZMQ for communicating between zones and nodes in my server [...]
But I don't see the inherent problems with using it as a library for doing inter-server communication


The problem I see is that it is sometimes too smart when you don't need it, and not smart enough when you do. For example, say you want to balance load over your server cluster to support many thousands of players. For that you divide your "world" into spatially independent chunks and process each one as a task.

You don't want to hardcode this, but want it to be flexible, so you can add more nodes if more computing power is needed. Fine with ZMQ so far.

However, you don't want it to be too flexible. You don't want to give tasks relating to the same chunk to a different node every time. This would mean you must replicate all immutable state on every node, or with every task. Bang you're dead. Unless you use 1-to-1 sockets and do your scheduling manually, in which case you could as well use BSD sockets, there is no way to hint that certain messages should go to the same nodes as certain previous ones.

You want it to be reliable, that is, you want to be sure messages are delivered and processed, and replies are received. ZMQ will reliably deliver messages, and then they are gone. If you use a pattern that requires a reply, it will require that reply.

If the other node receives the message and crashes, you're out of luck. Message gone and no way to recover. This may be acceptable for some applications where a lost transaction does not matter much or where a "try again" screen works. In other situations, it doesn't work. Think of someone losing a fight to some total n00b because the servers suck, or losing the Epic Frying Pan of Goblin Slaying after spending 3 hours waiting for people to get ready and half an hour killing Groomzash the Stinker. These are customer support and promotional nightmares. 20-page angry troll threads on a gamer forum and half a dozen support queries for you are not what you want to see on a regular basis just because messages are lost when a node goes down. On the other hand, if the other node doesn't want to send a reply because it has nothing to say, you're equally out of luck. It must, or ZMQ will keep a record of it forever.
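
One common way applications guard against the "received, then crashed" case is to keep each message on the sender's side until the receiver explicitly acknowledges having processed it. The sketch below shows only that bookkeeping, in plain Python; it is not something the thread's posters describe using, and `PendingStore`, its method names, and the timeout value are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class PendingStore:
    """Remembers sent messages until they are acknowledged."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._pending: Dict[int, Tuple[bytes, float]] = {}
        self._next_id = 0

    def track(self, payload: bytes) -> int:
        """Record a message at send time and return the id to attach to it."""
        msg_id = self._next_id
        self._next_id += 1
        self._pending[msg_id] = (payload, self._clock())
        return msg_id

    def ack(self, msg_id: int) -> None:
        """The receiver confirmed processing, so the message can be forgotten."""
        self._pending.pop(msg_id, None)

    def overdue(self) -> List[Tuple[int, bytes]]:
        """Messages with no ack within the timeout: resend or report failure."""
        now = self._clock()
        return [(mid, payload)
                for mid, (payload, sent_at) in self._pending.items()
                if now - sent_at >= self._timeout]
```

A caller would invoke `overdue()` periodically and either resend those payloads to another node or surface an error to the player, which also means the receiving side has to tolerate seeing the same message id twice.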

Then there's the problem with being zealous about being "faster than TCP". This may or may not be desirable. ZMQ is not really faster than TCP anyway, that's a lie. What it does is send out all messages that are queued whenever the descriptor gets ready, with Nagle's algorithm disabled. This is nice because it has lower latency than vanilla TCP, but it causes many extra packets to be sent in some situations and a very unfavourable workload distribution in some other situations, especially at startup. The latter can be worked around, but that's not the point.
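
For comparison, disabling Nagle's algorithm on a plain TCP socket is a single standard socket option. The sketch below uses Python's `socket` module; the helper name is made up, and nothing here is specific to ZMQ.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm turned off."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY sends small writes immediately instead of coalescing them,
    # trading more packets on the wire for lower latency.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```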

I'm not saying that ZMQ is bad, but ... it's not like everything is just golden and just works magically. It's not like you plug in ZMQ and all your problems are solved.

#9 hplus0603   Moderators   -  Reputation: 7280

Posted 20 December 2011 - 11:08 AM

However, you don't want it to be too flexible. You don't want to give tasks relating to the same chunk to a different node every time. This would mean you must replicate all immutable state on every node, or with every task. Bang you're dead. Unless you use 1-to-1 sockets and do your scheduling manually, in which case you could as well use BSD sockets, there is no way to hint that certain messages should go to the same nodes as certain previous ones.


Given the primitive of a message queue, you can create a queue per listening area, and address messages when sending them. This means that each sender needs to know where each message needs to go -- but *someone* needs to know this, and it's probably better that your game knows about spatial partitioning than a general-purpose messaging system.
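
A minimal sketch of the "queue per area, addressed by the sender" idea follows, using in-process queues in place of a real messaging system. The `AreaRouter` class and the area and node names are hypothetical; the only point is that the game code, not the transport, decides which node owns which area.

```python
from collections import defaultdict, deque


class AreaRouter:
    """Keeps one message queue per area and one owning node per area."""

    def __init__(self):
        self.owner = {}                    # area id -> node name
        self.queues = defaultdict(deque)   # area id -> pending messages

    def assign(self, area, node):
        """Record which node is responsible for an area."""
        self.owner[area] = node

    def send(self, area, message):
        """Queue a message for an area and return the node that will get it."""
        if area not in self.owner:
            raise KeyError(f"no node owns area {area!r}")
        self.queues[area].append(message)
        return self.owner[area]

    def drain(self, node):
        """Hand a node every queued (area, message) pair for the areas it owns."""
        out = []
        for area, owner in self.owner.items():
            if owner == node:
                while self.queues[area]:
                    out.append((area, self.queues[area].popleft()))
        return out
```

Since every message for a given area lands in the same queue, the node that owns that area keeps seeing the same chunk and never has to fetch its state again. Rebalancing is then an explicit `assign` call rather than something the transport does behind your back.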

However, this is also why games often end up re-inventing the wheel. Games are often very special-purpose, especially in cases where small-order performance differences matter (1000 vs 100 players on a server matters for gameplay reasons, say). Until computers are fast enough to simulate everything a game will need at arbitrary scale, this will remain a tension in game design and implementation.
enum Bool { True, False, FileNotFound };

#10 AllEightUp   Moderators   -  Reputation: 4638

Posted 20 December 2011 - 09:46 PM


While I agree with the general comments, I think it splits into three areas:

1. Message routing. Zmq 'kinda' allows this, but it is not really set up or intended to do this. In the case of gridding the world, you need to do considerable work anyway to properly interconnect things as required, and it is generally unusual enough that no "out of the box" solution will work well with whatever spatial partitioning you want to use.

2. Transactional behaviors, i.e. send a message to a crashed target and the message is lost. Once again, Zmq doesn't give you any help here, and this is where I find the inflexibility of the design motivations highly annoying. But there is nothing directly stopping you from coding this in. The obvious way to do it is to use one of the bidirectional connection types, but don't do that, since they are set up as single-message-out, single-message-back state engines, which would waste enormous amounts of latency/bandwidth.

3. Lack of interaction with the network layer itself. This is the part I was more focused on, in that you simply don't know when something quits working without writing a separate heartbeat socket. It can be done, but if I had connect/disconnect feedback I wouldn't need to write something separate.
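
The separate heartbeat mentioned in item 3 mostly comes down to remembering when each peer was last heard from. Here is a minimal sketch of that bookkeeping in plain Python; the `HeartbeatMonitor` name, the peer labels, and the timeout are assumptions, and the actual transport for the heartbeat messages is left out.

```python
import time


class HeartbeatMonitor:
    """Tracks the last time each peer was heard from."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}  # peer name -> timestamp of its latest heartbeat

    def beat(self, peer):
        """Call whenever a heartbeat (or any message) arrives from a peer."""
        self.last_seen[peer] = self.clock()

    def silent_peers(self):
        """Peers that have been quiet for longer than the timeout."""
        now = self.clock()
        return sorted(peer for peer, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

The application would call `silent_peers()` from its main loop and treat each returned peer as disconnected, which is the feedback the post says the library does not provide on its own.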


So, given the breakdown, I'd say #1 is completely acceptable. I'm likely to want direct control over that, as the way I want to divide things up changes from game to game. The #2 item is pretty annoying and almost a game stopper for me, as I am all about soft failures instead of taking down an entire cluster due to "one" piece failing. (Unless it is constantly failing and preventing anything from working, of course. That's surely a hard failure.) #3 was my big show stopper, unfortunately. Without the connect/disconnect feedback I was unable to use Zmq in my cluster system. My entire system is 100% dynamic, such that I don't care where something is in the cluster of boxes. All I need to know is that it will be starting "somewhere"; I'll get a notification when it is coming up and when it is ready, at which point I connect. Of course, if any of those items time out, I need to know about it and either try again or inform the player of problems. Timeouts are hidden; Zmq keeps retrying forever once told where to look. Oops, fail...

In regards to #3, I basically wrote a Zmq-like layer as an experiment (just a quick test under boost::asio, nothing I'd care to share as it is horribly nasty and only emulated one socket type), but I applied a reactor pattern to the connect/disconnect/error side of things. You create the socket wrapper as in Zmq and then call a new API function to connect "signals" to the sockets for connect/disconnect/error/oob, whatever. (I was thinking in terms of the *nix signals interface for this, per socket.) Now you proceed to write everything in the normal Zmq-like single-threaded manner, and those callbacks get called only from the thread which in turn calls the normal Zmq send/receive/etc. functions. It is a reactor pattern with multiple event dispatch sources, but it retains all the things I really did like about the Zmq base model. The same solution can also work for "allowing" transactional items to still work within the strict patterns.
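
As a rough illustration of that idea (not the poster's boost::asio code, which was never shared), the sketch below queues connection events from a background source and runs the registered callbacks only when the owning thread polls. `SignalingSocket`, `connect_signal`, `post_event`, and `poll` are all hypothetical names.

```python
import queue
from collections import defaultdict


class SignalingSocket:
    """Socket-wrapper sketch: events are queued by any thread, dispatched by one."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event name -> callbacks
        self._events = queue.Queue()        # thread-safe hand-off to the owner thread

    def connect_signal(self, event, handler):
        """Register a callback for 'connect', 'disconnect', 'error', and so on."""
        self._handlers[event].append(handler)

    def post_event(self, event, detail=""):
        """Called by the (hypothetical) I/O thread when something happens."""
        self._events.put((event, detail))

    def poll(self):
        """Called by the owning thread, e.g. from inside send/receive.

        Runs the pending callbacks on the caller's thread and returns how
        many events were dispatched.
        """
        dispatched = 0
        while True:
            try:
                event, detail = self._events.get_nowait()
            except queue.Empty:
                return dispatched
            for handler in self._handlers[event]:
                handler(detail)
            dispatched += 1
```

Because callbacks only ever run inside `poll()`, application code stays single-threaded in the way the post describes, while still learning about disconnects and errors.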

It was an interesting experiment which I intend to revisit at some time. But hopefully the idea of how the change could work, without changing the basic paradigm, can give you some ideas.



