Pranav Sathyanarayanan

UDP Replication

14 posts in this topic

Hey Guys,

 

So new question. I have implemented a simple ZoneServer in C++ and I am curious as to how to go about handling my ONRPG player replication. So currently, what I do is this:

 

1) Create and bind a UDP socket to my desired port on the server

2) Wait for incoming data using recvfrom; when data arrives, a new thread is created and the data is passed into it.

3) The thread inspects the packet, and if it is a request for a position update, computes the position and starts a new "replicate" thread.

4) The replicate thread then replicates the new user's position to all clients.

 

For those familiar with Unity (which is what I am using for my client): I simply have a thread listening for packets and pushing them to a buffer, and in the main Update loop of my NetworkController I process up to ~4000 packets from the front of the queue at a time and Lerp my remote players to their desired positions.

 

The issue is that although there is no lag for the local player, there is a lot of lag for the remote players on each client once more than 3-4 people are connected to the server. Is there any way I can improve my server end?

Edited by PranavSathy

Wow hplus, this is invaluable. I am in the process of ridding myself of my threads, so my question now concerns that last function, send_packet_to_all_players: is that ONE packet that has information about all players in the zone?


Wow hplus, this is invaluable. I am in the process of ridding myself of my threads, so my question now concerns that last function, send_packet_to_all_players: is that ONE packet that has information about all players in the zone?


Send as few packets as possible. You can put multiple messages into a single packet. Each object's position update will often be a single message. Maybe it'll be combined with orientation and velocity data or maybe a single message will include information about a group of objects. It all depends on your game (there is not a single correct answer).
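As a sketch of combining several messages into one datagram, the following packs fixed-size position messages into a single outgoing buffer until an MTU budget is reached. The message layout, names, and the 1200-byte budget are illustrative assumptions, not part of any real protocol here:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// One position message per object. memcpy-style packing is used for
// illustration; a real protocol would pick an explicit byte order.
struct PositionMsg {
    uint32_t objectId;
    float x, y, z;
};

// Append as many messages as fit under the MTU budget to one packet
// buffer. Returns how many messages were packed.
size_t pack_messages(const std::vector<PositionMsg>& msgs,
                     std::vector<uint8_t>& packet,
                     size_t mtuBudget = 1200) {
    size_t count = 0;
    for (const PositionMsg& m : msgs) {
        if (packet.size() + sizeof(PositionMsg) > mtuBudget) break;
        const uint8_t* p = reinterpret_cast<const uint8_t*>(&m);
        packet.insert(packet.end(), p, p + sizeof(PositionMsg));
        ++count;
    }
    return count;
}
```

Anything that does not fit this tick would be carried over to the next packet rather than sent as a second datagram immediately.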

You do not generally need or want to send information about an entire zone to each player. If player A is standing 100 units away from player B and the in-game visibility is only 10 units, why would the players need to know anything about each other? This is generally referred to as "area of interest filtering." Figure out which players care about which objects and only send updates about those objects. This filtering can range from very simple radius checks to some very complex queries, depending on your needs.
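A minimal radius check of the kind described might look like this (the entity layout and the 2D squared-distance test are illustrative assumptions):

```cpp
#include <vector>

struct Entity { int id; float x, y; };

// Simple area-of-interest filter: return the ids of entities within
// `radius` of the observer. Squared distances avoid the sqrt call.
std::vector<int> entities_in_interest(const Entity& observer,
                                      const std::vector<Entity>& all,
                                      float radius) {
    std::vector<int> visible;
    float r2 = radius * radius;
    for (const Entity& e : all) {
        if (e.id == observer.id) continue;  // don't send a player to itself
        float dx = e.x - observer.x, dy = e.y - observer.y;
        if (dx * dx + dy * dy <= r2) visible.push_back(e.id);
    }
    return visible;
}
```

With many entities a spatial index (grid, quadtree) replaces the linear scan, but the contract is the same: compute the set of objects each client cares about, and send updates only for those.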

Right, thanks Sean, and hplus too!! I have implemented this new server and will conduct my tests tomorrow; hopefully my lag problem will be fixed. hplus, I will try to quantify the data for you if this does not work o.0. Just out of curiosity, why is multithreading looked down upon?


Right, thanks Sean, and hplus too!! I have implemented this new server and will conduct my tests tomorrow; hopefully my lag problem will be fixed. hplus, I will try to quantify the data for you if this does not work o.0. Just out of curiosity, why is multithreading looked down upon?


Multithreading is not the problem here; the usage you described is the problem. Spinning up new threads is something to be avoided because it is exceptionally slow and costly. In your case it is actually reasonable to use threads, but only once you determine they are needed, and you don't want to repeatedly "start" threads: they should all be created up front and never shut down, simply idling until there is work.

If/when you get to the point where it makes sense, there are many ways to go about things, but I tend to move straight to the OS specifics. There is very little point running your system multithreaded through the generic APIs when the WinIOCP, ePoll, or KEvent APIs do a significant portion of the thread communication for you and cut out a fair portion of the overhead involved. Of course, while epoll and kevents are fairly simple and give you great benefits, WinIOCP is a PITA to get correct. Either way, when you need the threaded solution, the OS specifics cut out a lot of intermediate bits, which lets you reduce latency issues and maintain high performance. But again, doing this is really only going to be valid for a pretty high amount of traffic per process; it is up to you to decide when to switch over.
threads

 

In addition to what hplus0603 already said, you should generally never spawn threads, except when your program starts up. And then, you should spawn a fixed amount of them (typically equal to the number of CPU cores, or one less). Then assign tasks to the threads via one or several queues (lockfree ideally, but queues with a lock work just fine too, if you manage tasks properly). Note that when I say "task" then that does not mean something like "add two numbers", but something like "process 1,000 items".
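A fixed pool of the sort described, with a plain mutex-guarded task queue, might be sketched like this (the class and member names are mine, not from any particular library; a lock-free queue would be a later optimization):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Fixed-size worker pool: threads are created once at startup, then idle
// on a condition variable until tasks arrive. No thread is ever spawned
// per task.
class ThreadPool {
public:
    explicit ThreadPool(size_t n) {
        for (size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }
    void post(std::function<void()> task) {
        { std::lock_guard<std::mutex> lk(m_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
                if (done_ && tasks_.empty()) return;  // drain, then exit
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};
```

Each posted task should be coarse ("process this batch of packets"), not fine-grained, so the queue traffic stays negligible next to the work itself.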

 

The reason for that is that spawning threads is a lengthy, expensive operation which you do not want to do while receiving (or rather, not receiving, but dropping) UDP packets, and spawning a thread per task is generally a bad design, which is neither efficient nor scales well. Many threads means many context switches, which means a lot of CPU wasted for nothing.

 

You definitely want receiving UDP traffic to happen on one thread that does nothing else, since if you don't receive datagrams "immediately", your receive buffer will quickly fill up and you will drop packets. So you definitely do not want to do anything like reading a file, processing complicated game logic, or spawning threads in the same thread that handles the network. You don't need more than one thread for that either, though. One thread is absolutely capable of doing that task (even with plain old blocking API calls and nothing special!).

 

ONE packet that has information about all players in the zone?

This depends. While that may be a very efficient solution (for some rare cases), it may not be the correct one. Every player in the zone (probably) does not see everything, but you would still be sending every player the complete information. That may be acceptable, or it may be exploitable and therefore something to forbid (depends on your game).

 

Also, not all information is equally important to each player. Depending on the amount of updates, you may have to send a considerable amount of data to every player. Bandwidth is not only limited (both at your end and at the other end!) but also costs money. You will therefore wish to reduce bandwidth by sending each player only

a) what they can actually see

b) what, in this subset, matters most

c) no more than a fixed so-and-so-much budget per second

 

It matters big time if someone who is 2 meters away makes a side step or changes clothes. This is immediately obvious. However, changing clothes may not be as important as pulling a gun.

It doesn't matter at all if someone 250 meters away makes a step or changes clothes. You likely won't notice at all.

 

Since the number of updates that you need to transmit scales at least quadratically with distance (according to the area of a disk for 2D/pseudo-3D, or if it's real 3D the volume of a sphere), you usually need to apply some "importance metric" that is somehow related to distance for each receiving user.

 

WinIOCP, ePoll or KEvent

This is an excellent tip for TCP, but less so for UDP. With TCP, you have many sockets in an "unknown" state, but you can only do one thing at a time, so you need some special magic that tells you which one is ready to read from (or which overlapped operation has just finished).

 

Using UDP, you have a single socket, no more. And that single socket either has something to read, or it doesn't. Instead of blocking on a function that tells you "it's now ready" and then calling recvfrom, you can just as well block on recvfrom, which will conveniently return when something comes in. Then push that packet onto a queue (making the processing someone else's problem!) and immediately call recvfrom again.
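That receive loop can be sketched with POSIX sockets roughly as follows. The Packet struct and the callback shape are my assumptions, and error handling is minimal; the handler would typically just push onto a queue for the game loop:

```cpp
#include <arpa/inet.h>
#include <cstdint>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

struct Packet {
    sockaddr_in from;            // sender address, for replies
    std::vector<uint8_t> data;   // datagram payload
};

// Dedicated receive loop: block on recvfrom, hand the datagram to the
// callback, and immediately go back to receiving. The handler returns
// false to stop the loop (useful for shutdown and for testing).
template <typename Handler>
void receive_loop(int sock, Handler&& on_packet) {
    uint8_t buf[2048];
    for (;;) {
        sockaddr_in from{};
        socklen_t fromLen = sizeof(from);
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0,
                             reinterpret_cast<sockaddr*>(&from), &fromLen);
        if (n < 0) return;       // socket closed or fatal error
        Packet p;
        p.from = from;
        p.data.assign(buf, buf + n);
        if (!on_packet(std::move(p)))  // keep the handler cheap: just enqueue
            return;
    }
}
```

The important property is that nothing slow ever runs between two recvfrom calls, so the kernel receive buffer is drained as fast as datagrams arrive.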

Edited by samoth
For a FPS game with smallish levels and smallish number of players (such as Quake) sending a single packet with information about all players for each network tick is totally fine, simple, and will perform well. You only need to generate the contents of this packet once per tick, too, which is a bonus.

When the number of players goes up (say, above 30) and the sizes of levels goes up (so not everybody can possibly snipe everybody) then you can start doing interest management, where "close" or "important" entities are sent every network tick, but "far" or "unimportant" entities are sent less often. These packets need to be generated differently for each player, because each player has a different viewpoint.
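One way to express the "close every tick, far less often" schedule is a per-entity send period derived from distance. The tier thresholds and periods below are made-up examples, and the id-based stagger is just one way to spread the load across ticks:

```cpp
// Tiered update scheduling: near entities replicate every network tick,
// far ones every Nth tick. Thresholds (10 and 50 units) are illustrative.
int ticks_between_updates(float distance) {
    if (distance < 10.0f) return 1;   // close: every tick
    if (distance < 50.0f) return 4;   // medium: every 4th tick
    return 10;                        // distant: every 10th tick
}

// Decide whether to include an entity in this tick's packet for a client.
bool should_send(int tick, int entityId, float distance) {
    int period = ticks_between_updates(distance);
    // Stagger by entity id so far entities don't all burst on the same tick.
    return (tick + entityId) % period == 0;
}
```

Because distance is relative to each viewer, this decision runs per client, which is exactly why these packets must be generated per player rather than once per tick.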

When it comes to threading, threads are great to use multiple CPU cores. Thus, the ideal program/system has exactly one thread per CPU core. To make sure that those threads always have work to do, you should be using some kind of notified, asynchronous, or non-blocking I/O, so that threads don't get stalled blocking on I/O. For things that don't have convenient asynchronous APIs, like file I/O on Linux, you can spin up an additional thread, which receives requests for I/O, performs the requests, and then responds back, basically implementing async I/O at user level. You'd use some kind of queue between the other threads posting requests, and responses getting queued.

Similarly, there are physical hardware limitations. Each hard disk can only read from one track on the spinning platter (or one sector of flash) at a time. Each network interface can only send one network packet at a time. Thus, having more threads waiting for each particular piece of hardware at the same time is inefficient. Over-threading a program is very likely to run into this problem, where the threads don't give you any performance, but end up costing in resources and complexity (and bugs!).

Now, this is how high-performance servers end up being structured. If you're currently testing with 4 players, chances are that you don't need to implement this structure. You could get away with a single thread for a long while! And, once you start adding threads, adding one thread per subsystem (collision detection, networking, disk I/O, interest management, ...) is generally easier to debug and optimize than trying to add one thread per user, where each thread can potentially "do all the things" and the number of threads is not bounded or even managed.

Wow, I did not even know how this works; learning something new every day. Sounds like a fun challenge!!! Thanks for all the help; I can't wait to see the performance difference once classes are over today. After this, my team and I will discuss some of the game mechanics we would like to see in the game, and then how we will use all of your advice to restructure the way we are thinking about our server. I understand now that the best approach is a manageable, bounded number of threads dedicated to specific tasks, which do not wait on the same hardware and which use async I/O to the best of their ability, along with a queue-based system for passing tasks to threads. I will post back here once I see how well it worked out. Thanks so much!!

 

Just out of curiosity, how would one go about not locking the queues as they are being read from and added to in different threads? Isn't that dangerous? From my admittedly rudimentary understanding of threads, mutexes are required for synchronized communication, but I should be going for async, and accessing the same memory from two places at the same time is dangerous, no?

Edited by PranavSathy
For a FPS game with smallish levels and smallish number of players (such as Quake) sending a single packet with information about all players for each network tick is totally fine, simple, and will perform well.

Well yes, from a pure performance point of view, it's OK for a Quake-style of game (not so for something much bigger, though).

 

But my point about knowledge remains. In a game where several people compete, it can be troublesome to provide information to people that they actually can't know. Such as those shoot-through-wall aimbots, or other things. Imagine someone implements a minimap where enemies show as little dots (and nobody using the genuine client has such a mini-map). Imagine knowing what weapon, how much armour, and how much health your opponents have (when you shouldn't!), and where they hide. Imagine knowing which door they'll come through before they know (because you can "see through" the door).

No player should ever know the whole world, normally. Not unless it doesn't matter anyway.

Edited by samoth

Quote
WinIOCP, ePoll or KEvent
This is an excellent tip for TCP, but less for UDP.  With TCP, you have many sockets in an "unknown" state, but you can only do one thing at a time, so you need some special magic that tells you which one is ready so you can read from (or which overlapped operation has just finished).
 
Using UDP, you have a single socket, no more. And that single socket either has something to read, or it doesn't. Instead of blocking on a function that tells you "it's now ready" and then calling recvfrom, you can just as well block on recvfrom, which will conveniently return when something came in. Then push that packet on a queue (making the processing someone else's problem!) and immediately call recvfrom  again.


Actually, it applies equally well to both TCP and UDP. From the OS side of things it doesn't matter whether it is TCP or UDP, multiple sockets or a single socket; these systems are just going to generate events which will wake a thread to pull the data. Basically, what you are describing with waiting on recvfrom and then pushing to a queue is exactly what the underlying OS async solutions would be doing for you. The benefit, even with UDP, is that you can throw more threads at the OS solution to wait on work without writing the queue portion yourself. Additionally, in WinIOCP at least, you bypass much of the normal user-space buffering of packet data; instead the data is written directly into your working buffers.

This is getting into fairly arcane and difficult coding areas, but the general point is that the OS-level interaction is effective in both cases. In fact, I tend to think that for UDP these systems are even more effective, since unlike TCP you will get more, smaller events, so in the long run the more efficient use of resources adds up to a major win.

how would one go about not locking the queues as they are being read from and added to in different threads


In the simplest case, just use a lock to protect the queue, and use a linked list or the like to queue your items.
Because the queue is only locked for a very short amount of time (to enqueue or dequeue a single item,) there is no real risk of contention being a problem.
If you want to get fancier, look into "lockless FIFOs," which can be implemented incredibly cheaply, as long as you have only a single reader and a single writer per queue (which is actually the common case.)

However, seeing as you're still in school, I *highly* recommend avoiding threads in this server, at least for now. You simply do not need them, until you can show that the game server processing you do really does need multiple cores to achieve performance goals.
And, if you absolutely HAVE to use threads (this might be a mandatory project or something,) I'd highly recommend just using a simple mutex or other light-weight lock (CRITICAL_SECTION in Windows for example) to protect adding to and removing from each queue.
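For reference, the single-reader/single-writer lock-free FIFO mentioned above can be sketched as a ring buffer. This is an illustrative sketch under the stated constraint (exactly one producer thread and one consumer thread), not a drop-in production queue:

```cpp
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer: exactly one thread may
// call push() and exactly one may call pop(). Capacity N must be a
// power of two so indexing can use a cheap mask.
template <typename T, size_t N>
class SpscQueue {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");
public:
    bool push(const T& v) {
        size_t head = head_.load(std::memory_order_relaxed);
        if (head - tail_.load(std::memory_order_acquire) == N)
            return false;                       // full
        buf_[head & (N - 1)] = v;
        head_.store(head + 1, std::memory_order_release);
        return true;
    }
    bool pop(T& out) {
        size_t tail = tail_.load(std::memory_order_relaxed);
        if (head_.load(std::memory_order_acquire) == tail)
            return false;                       // empty
        out = buf_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }
private:
    T buf_[N];
    std::atomic<size_t> head_{0}, tail_{0};
};
```

The acquire/release pairing on head_ and tail_ is what makes this safe without a mutex; with more than one reader or writer per end, this structure is no longer correct and a locked queue is the simpler choice.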


And, if you absolutely HAVE to use threads (this might be a mandatory project or something,) I'd highly recommend just using a simple mutex or other light-weight lock (CRITICAL_SECTION in Windows for example) to protect adding to and removing from each queue.

 

I completely agree; as I stated, don't go and thread things until you can justify it. But, as a note, as of Vista a critical section is no longer notably faster than a mutex. Windows moved the first-order checks into user space like all other OSes, which makes a mutex basically the same as a critical section these days.


Actually, it applies equally well to both tcp and udp. From the os side of things it doesn't matter if it is tcp or udp, multiple sockets or single sockets, these systems are just going to generate events which will wake a thread to pull the data.

Yes and no. The hardware generates interrupts of course, and the OS eventually wakes a thread, but not necessarily (and usually not) in a 1:1 correlation. However, with poll+receive you have two guaranteed user-kernel-user transitions instead of one.

For TCP that makes sense, since you somehow must multiplex between many sockets; there is not much of a choice if you wish to be responsive. For UDP, it's wasting 10k cycles per packet received for nothing, since there is only one socket and nothing to multiplex. You can just as well receive right away instead of doing another round trip. The same goes for IOCP, where you have two round trips: one for kicking off the operation, and one for checking completeness.

 

Throwing more threads at the problem doesn't help, by the way. Operating systems even try to do the opposite and coalesce interrupts. The network card DMAs several packets into memory, and sends a single interrupt. A single kernel thread does the checksumming and other stuff (like re-routing, fragment assembly), being entirely bound by memory bandwidth, not ALU. It eventually notifies whoever wants some.

A single user thread can easily receive data at the maximum rate any existing network device is capable delivering, using APIs from the early 1980s. Several threads will receive the same amount of data in smaller chunks, none faster, but with many more context switches.

 

 

Basically what you are describing with waiting on recvfrom and then pushing to a queue is exactly what the underlying OS async solutions would be doing for you. The benefit, even with UDP, is you can throw more threads to wait on work at the OS solution without writing the queue portion yourself.

Yes, this is more or less how Windows overlapped I/O or the GLibc userland aio implementation works, but not traditional Unix-style nonblocking I/O (or socket networking as such). Of course in reality there is no queue at all, only conceptually insofar as the worker thread reads into "some location" and then signals another thread via some means.

 

 

Additionally, in WinIOCP at least, you will bypass much of the normal user space buffering of packet data and instead the data is directly written to your working buffers.

Yes, albeit overlapped I/O is troublesome, and in some cases considerably slower than non-overlapped I/O. I have not benchmarked it for sockets since I deem that pointless, but e.g. for file I/O, overlapped is roughly half the speed on every system I've measured (for no apparent reason). The copy to the userland buffer does not seem to be a performance issue at all, surprising as it is (that's also true for Linux, try one of the complicated zero-copy APIs like tee/splice, and you'll see that while they look great on paper, in reality they're much more complicated and more troublesome, but none faster. Sometimes they're even slower than APIs that simply copy the buffer contents -- don't ask me why).

 

But even disregarding the performance issue (if it exists for overlapped sockets, it likely does not really matter), overlapped I/O is complicated and full of quirks. If it just worked as you expect, without exceptions and special cases, then it would be great, but it doesn't. Windows is a particular piss-head in that respect, but e.g. Linux is sometimes not much better.

Normally, when you do async I/O then your expectation is that you tell the OS to do something, and it doesn't block or stall or delay more than maybe a few dozen cycles, just to record the request. It may then take nanoseconds, milliseconds, or minutes for the request to complete (or fail) and then you are notified in some way. That's what happens in your dreams, at least.

 

Reality has it that Windows will sometimes just decide that it can serve the request "immediately", even though they have a very weird idea of what "immediately" means. I've had "immediately" take several milliseconds in extreme cases, which is a big "WTF?!!" when you expect that stuff happens asynchronously and thus your thread won't block. Also there is no way of preventing Windows from doing that, nor is there anything you can do (since it's already too late!) when you realize it happened.

Linux on the other hand, has some obscure undocumented limits that you will usually not run into, but when you do, submitting a command just blocks for an arbitrarily long time, bang you're dead. Since this isn't even documented, it is actually an even bigger "WTF?!!" than on the Windows side (although you can't do anything about it, at least Microsoft tells you right away about the quirks in their API).

 

In summary, I try to stay away from async APIs since they not only require more complicated program logic but also cause much more trouble than they're worth compared to just having one I/O thread of yours perform the work using a blocking API (with select/(e)poll/kqueue for TCP, and with nothing else for UDP).

Edited by samoth

 

For a FPS game with smallish levels and smallish number of players (such as Quake) sending a single packet with information about all players for each network tick is totally fine, simple, and will perform well.

Well yes, from a pure performance point of view, it's OK for a Quake-style of game (not so for something much bigger, though).

 

But my point about knowledge remains. In a game where several people compete, it can be troublesome to provide information to people that they actually can't know. Such as those shoot-through-wall aimbots, or other things. Imagine someone implements a minimap where enemies show as little dots (and nobody using the genuine client has such a mini-map). Imagine knowing what weapon, how much armour, and how much health your opponents have (when you shouldn't!), and where they hide. Imagine knowing which door they'll come through before they know (because you can "see through" the door).

No player should ever know the whole world, normally. Not unless it doesn't matter anyway.

 

So have a little flag that tells you whether or not the receiving client can see this player (to tell them if the player should be visible at their end) and don't send the new data about that player if they aren't visible. Problem solved.


 

For a FPS game with smallish levels and smallish number of players (such as Quake) sending a single packet with information about all players for each network tick is totally fine, simple, and will perform well.

Well yes, from a pure performance point of view, it's OK for a Quake-style of game (not so for something much bigger, though).

 

But my point about knowledge remains. In a game where several people compete, it can be troublesome to provide information to people that they actually can't know. Such as those shoot-through-wall aimbots, or other things. Imagine someone implements a minimap where enemies show as little dots (and nobody using the genuine client has such a mini-map). Imagine knowing what weapon, how much armour, and how much health your opponents have (when you shouldn't!), and where they hide. Imagine knowing which door they'll come through before they know (because you can "see through" the door).

No player should ever know the whole world, normally. Not unless it doesn't matter anyway.

 

 

 

 

 


 

So have a little flag that tells you whether or not the receiving client can see this player (to tell them if the player should be visible at their end) and don't send the new data about that player if they aren't visible. Problem solved.

 

 

Indeed, you can still keep it simple. UDK uses relevancy checks for each object when it is replicated to a client, so you could simply set an exclusion radius, and/or perform a line of sight test.


and don't send the new data about that player if they aren't visible. Problem solved.


... and also tell the player when you're going to stop sending updates about the entity, so there are no "dead" entities visible on the player's screen. And re-check each entity against each other entity each time they move (enough) to perhaps change relevancy. And re-send all relevant data about the entities when they are re-introduced by again becoming relevant, because the entity may have changed clothing or spell effects or whatever.

THEN "problem solved." But for games which don't have massive player counts, sending all the entities is typically easier and not less efficient. So do that if your goal is to get to a working game sooner, rather than later.

Wow, just a few days and so many amazing posts, haha. Anyway, I rewrote my server as per the several suggestions here, namely zero threading and a "network tick" type of functionality which sends comprehensive updates of all players to every client. This has proven to work far better than any previous iteration of the server I had made. We ran 5 clients today and it worked amazingly. Of course, over the internet there will be a lot more latency than on my school's intranet, but we will combat one problem at a time.

 

The next step, I suppose, is that we can finally move forward to implementing the asynchronous functionality mentioned here, as well as "lockless FIFOs", and figuring out how many threads we will spawn initially and what each one will handle. But as of right now, the team is delighted to see our avatars running around in virtual space, haha. Thank you so much!

