Metrics for multiplayer networking performance


Is there previous work for something like that? Does it even make sense?

I'm a complete beginner when it comes to network programming. From what I've read it sounds like people mostly try out different networking implementations (regarding protocols used, prediction and interpolation approaches, etc.) by hand and end up using what feels best.

I'm wondering whether or not there exist metrics that measure various aspects of an implementation specifically suited or tailored to games. I figure it would be helpful for automated testing, and maybe speed up the development process when you're trying out a bunch of different approaches.

And if there aren't, I also wonder whether this would be worth putting some work into, or if the consensus is "nah, just try things until you find something that works best; how networked gameplay feels has too many subjective/complex elements attached to be quantified by metrics," or something like that.

edit: I know this is a very generalized question. What a metric would look like probably depends a lot on what kind of quantities you're looking at. Am I trying to synchronize player positions as best as I can across multiple players? Server-Client or P2P? Etc.? I'm basically having a hard time googling for this stuff and wonder if people more experienced in the field have come across useful stuff. Open to anything.


In graphics, you can measure "frames per second" on a target system, and tune your graphics complexity to hit specific goals.

Similarly, a sound mixer might measure "percent of CPU used," although sound is so cheap and CPUs are so fast these days that it's unlikely to ever be more than a few percent of a single core, even with massive reverbs and other effects.

Networking is even less data than audio, so it's unlikely to ever take enough CPU power to show up on a profile.

Instead, you decide on a particular strategy, and your implementation either implements that strategy, or it doesn't.

Once it correctly implements the strategy (without unnecessary problems/delays/whatever), then that's as good as it gets.

It's still interesting to measure various properties (especially the time between various points along the input / event / processing / forwarding chain) to compare against what your theoretical model says they should be, to look for bugs and areas of improvement.

There are parameters you can tune. For example, how many simulation steps do you gather together into a single network packet? How much de-jitter buffer do you use on the receiving end of a connection? Keeping a firm hand on these, and understanding the gameplay impact of changing them, is important, but it's usually more black and white than most other parts of the game. Either your messages make it to the server on time, or they don't.
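For concreteness, here's a minimal C++ sketch of those two knobs; the constants, struct names, and send hook are all illustrative assumptions, not from any particular engine:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Tunables discussed above (illustrative values):
constexpr std::size_t STEPS_PER_PACKET = 3; // sim steps coalesced per send
constexpr std::size_t DEJITTER_STEPS   = 2; // receive-side slack, in steps

struct InputStep {
    uint32_t tick;
    uint8_t  buttons;
};

// Sender side: gather steps, flush one packet every STEPS_PER_PACKET ticks.
struct InputSender {
    std::vector<InputStep> pending;

    void onSimStep(const InputStep& step) {
        pending.push_back(step);
        if (pending.size() >= STEPS_PER_PACKET) {
            sendPacket(pending); // serialize + UDP send, not shown
            pending.clear();
        }
    }
    void sendPacket(const std::vector<InputStep>&) { /* ... */ }
};

// Receiver side: only release a step while DEJITTER_STEPS of slack remain,
// trading a fixed bit of added latency for smoothness under jitter.
struct DejitterBuffer {
    std::deque<InputStep> queue;

    void onPacket(const std::vector<InputStep>& steps) {
        queue.insert(queue.end(), steps.begin(), steps.end());
    }
    bool popStep(InputStep& out) {
        if (queue.size() <= DEJITTER_STEPS) return false; // starved: repeat or predict
        out = queue.front();
        queue.pop_front();
        return true;
    }
};
```

Raising STEPS_PER_PACKET cuts per-packet overhead but delays the newest input; raising DEJITTER_STEPS absorbs more jitter at the cost of latency. That's exactly the gameplay trade-off described above.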

enum Bool { True, False, FileNotFound };
I feel like that's a tad simplistic.

Granted, most people looking to this forum to get started aren't dealing with scales large enough to care, but things like bytes per message are still important metrics. You can saturate a link or (worse) incur huge bills by being sloppy with bandwidth, and knowing that your messages are at least reasonably efficient is a good thing.
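To make that measurable, a sketch of per-message-type byte accounting might look like the following; the type IDs and the recordSend hook are assumptions about your own message layer:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>

struct MsgStats { uint64_t count = 0; uint64_t bytes = 0; };
std::map<uint16_t, MsgStats> g_statsByType;

// Call from your send path with the serialized size on the wire.
void recordSend(uint16_t msgType, std::size_t wireBytes) {
    MsgStats& s = g_statsByType[msgType];
    s.count += 1;
    s.bytes += wireBytes;
}

// Dump once per second (or per match) to spot the heavy hitters.
void dumpStats() {
    for (const auto& entry : g_statsByType) {
        const MsgStats& s = entry.second;
        std::printf("msg %u: %llu sends, %llu bytes, %.1f avg bytes/send\n",
                    (unsigned)entry.first,
                    (unsigned long long)s.count,
                    (unsigned long long)s.bytes,
                    s.count ? double(s.bytes) / double(s.count) : 0.0);
    }
}
```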

I also disagree about networking rarely appearing in CPU metrics. Poor netcode is actually a huge potential source of stalls and resource contention. You won't necessarily see the CPU running hot, but you can bet your shoes that bad netcode will jump out in profiling data.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Not sure about trying out different networking strategies. Some are just more suited to certain types of games than others; I'd think the strategy is picked based on requirements and that's it. Switching between strategies would be a huge task, and they often come with different gameplay limitations, making it infeasible to switch.

You could always measure the total delay from player 1 seeing an action on his screen to player 2 seeing the same action on his (there's someone on YouTube performing these kinds of tests). But it's not an automated test in code; he uses a high-speed camera for that.
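If you did want to automate that in code, one approach is to stamp the action with a clock both machines agree on, and log when the remote client first renders it. The sketch below assumes a serverNowMs() synchronized via NTP-style offset estimation (a made-up helper); the residual clock error bounds the accuracy of the measurement:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>

// Placeholder: in a real build this would be the local clock corrected by
// an estimated offset to the server; here it just reads the local clock.
uint64_t serverNowMs() {
    using namespace std::chrono;
    return (uint64_t)duration_cast<milliseconds>(
        steady_clock::now().time_since_epoch()).count();
}

struct ActionEvent {
    uint32_t actionId;
    uint64_t shownOnSenderMs; // stamped when player 1's screen shows it
};

// On player 2's machine, when the action first appears in a rendered frame:
void onActionRendered(const ActionEvent& ev) {
    uint64_t shownOnReceiverMs = serverNowMs();
    std::printf("action %u screen-to-screen delay: %llu ms\n",
                ev.actionId,
                (unsigned long long)(shownOnReceiverMs - ev.shownOnSenderMs));
}
```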

@ApochPiQ: You are correct, of course, buggy network code can (like any buggy code) pop up on a profiler, and poor use of bandwidth can of course lead to an untimely death. Profiling your application's usage of all resources (CPU, RAM, disk, network, user attention, etc) is always useful and will always find surprising opportunities for improvement. Networking is not immune to that!

Metrics related to networking that application operators typically track over time, even after the networking has been shown to be correctly implemented, include things like the following (a small counter sketch follows below):

- use of bandwidth (packet count and byte count per user connection)

- latency (physical level ping times, and application-level round-trip times)

- tolerance to packet loss (how much divergence happens when packets are lost)

These are generally very application specific, and thus measuring them is important to the application, but really hard to compare between applications. Is the impact of the packet loss metric for Gears of War comparable to the packet loss for Eve Online? Probably not :-)
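As a rough illustration of tracking the first two (the struct and its update points are assumptions about your transport layer, not any standard API):

```cpp
#include <cstddef>
#include <cstdint>

struct ConnectionMetrics {
    uint64_t packetsSent = 0, packetsRecv = 0;
    uint64_t bytesSent = 0,   bytesRecv = 0;
    double   smoothedRttMs = 0.0;

    void onSend(std::size_t bytes) { ++packetsSent; bytesSent += bytes; }
    void onRecv(std::size_t bytes) { ++packetsRecv; bytesRecv += bytes; }

    // Application-level RTT: echo a sequence number back, measure the
    // turnaround, and smooth it EWMA-style (alpha = 1/8, as TCP does).
    void onRttSample(double rttMs) {
        smoothedRttMs = (smoothedRttMs == 0.0)
                      ? rttMs
                      : 0.875 * smoothedRttMs + 0.125 * rttMs;
    }
};
```

The third metric is more game-specific; one common proxy is logging the size of the correction a client has to apply when an authoritative update arrives.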

There are of course also the low-level metrics used to judge your provider's quality: packet loss, connectivity, actual throughput, queue depths, etc.

These are usually well served by existing operational metrics, and will often just be scraped (through SNMP or whatever) from your edge routers.

enum Bool { True, False, FileNotFound };

I'm a complete beginner when it comes to network programming. From what I've read it sounds like people mostly try out different networking implementations (regarding protocols used, prediction and interpolation approaches, etc.) by hand and end up using what feels best.


I'd say the opposite: in my experience, it's possible to have a very good idea right from the start of what sort of implementation you need, and to get it right the first time. You probably have an idea of how many players you have, how fast-paced the game is, and what proportion of the messages need to be reliable. I'd argue that's all you need to know.

People - on this forum, especially - get very hung up on things like different interpolation strategies and whether they need to rewind actions and the like, but many games don't need any of that at all, and those that do can usually just pick something arbitrarily.

(I appreciate this is ignoring the metrics part of the question, but I would say that you need to measure metrics that are relevant for your game, rather than attempting to measure networking implementations in isolation.)

A little late to the conversation, but a metric in general is just a data point; how important it is to you depends on the context in which you look at it. That being said, many of your generic networking-specific metrics from the server-side perspective can come from outside your application, from existing network tools such as SNMP, ntop, snort, and more. While the learning curve of these runs the gamut from small to zomg!, they exist specifically for the reasons I believe you asked about - though they're even more generic, since they don't care whether your application is a game or not. You can get packets per second, bandwidth utilization to/from specific clients, jitter, packet loss, and much more. I work at an ITSP, and these tools are essential for us: we can quickly identify if a new application was introduced on our network, or if an application or even a specific client is acting up or having issues.

There are older tools like rrdtool (with C and many other language bindings) to which you basically send key/value pairs at timed intervals for whatever you'd like to keep track of, and which also offer a way to visualize that data. Things like Grafana and InfluxDB have taken this concept to another level of scale, being used by highly distributed web services today, where network and database transaction metrics help them prioritize optimizations and user experience.

As for what metrics look like: they are generally just key/value pairs you export at timed intervals, and some other tool helps you visualize them in whatever manner is appropriate for you. What is interesting to you? If your application can export metrics to a message/event queue and let something else do the import into whatever tools you want, the tools you use can change independently of your application (a minimal sketch of this follows below).
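A minimal sketch of that export path, assuming a statsd-style collector (e.g. telegraf's statsd input feeding InfluxDB/Grafana) on the receiving end; the metric names are just examples:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Placeholder sender: in production this would be a fire-and-forget UDP
// datagram to the collector; here it just prints the line.
void sendToCollector(const std::string& line) {
    std::puts(line.c_str());
}

// Call e.g. once per second from your main loop.
void exportMetrics(double rttMs, uint64_t bytesSentThisInterval) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "net.rtt_ms:%g|g", rttMs);      // gauge
    sendToCollector(buf);
    std::snprintf(buf, sizeof(buf), "net.bytes_sent:%llu|c",        // counter
                  (unsigned long long)bytesSentThisInterval);
    sendToCollector(buf);
}
```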

Evillive2

