Exustio

Compressing data before sending


Hello, I am working on a graphical MUD and have noticed that I am currently using a lot of bandwidth. I do of course plan to reduce this by being more restrictive with what I send, but I have also considered compressing the data before sending it. My question is: how should I do this to make it work well? My first idea is to put everything I want to send into a buffer and, at a fixed interval (25 ms or so), compress and send everything that is in the buffer at that time. Is this a good way to do it?

There are a couple of warning flags in your post. It sounds like you're currently not buffering data... which could imply that you'd be hammering the bandwidth with per-packet overhead for TCP/UDP headers if you send lots of small bits of data individually. (Which protocol are you using, incidentally? And language/library?) That alone could be a big waste of your bandwidth. Do you not have a predefined message format that you use to send your data? If so, why not just compress the contents of each message individually? And if not, what are you using instead?

[Edited by - Kylotan on March 29, 2010 9:00:57 AM]

I am using TCP with winsock2.

Well, I have a predefined message format, so I can collect all the data needed for a certain operation into one buffer and send it when it is ready, and the other side will know how much data to receive. So if a client requests to move its character, the server gathers information about the map, units and objects and then calls send(), and the client will know how much data is incoming and which parts contain which information without first receiving information about it. There are of course exceptions where the amount of data varies, e.g. depending on how many characters are within range.


My point is that certain operations, like updating a character's HP, require very little data to be transferred. I am already merging these kinds of calls together before sending them, so that every 50 ms I send one update that includes new information about every character that has changed, if any change has occurred.
But I am considering using this method for all communication, so that I can compress the data more effectively using a library like zlib.

You might consider an event system for health updates, movement and whatnot.

As for sending data every 25ms, for a MUD, isn't that excessive?

Have you tried tuning that number down to 100ms? How's it feel there?

25ms seems far too frequent for a mostly-event based game like a MUD.

The compression libraries I have worked with usually compress when writing to the outgoing buffer:

buffer.WriteInt32(value)
buffer.WriteInt16(value)

etc.
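
A minimal sketch of that compress-on-write idea, in Python because its zlib module wraps the same library. The class name `CompressingWriter` and its methods are hypothetical and only mirror the calls above; this assumes one compressor per packet and little-endian fields, and is not any particular library's API.

```python
import struct
import zlib


class CompressingWriter:
    """Hypothetical outgoing buffer that compresses as values are written."""

    def __init__(self, level=1):
        # One compressor per packet; level 1 favours speed over ratio.
        self._compressor = zlib.compressobj(level)
        self._chunks = []

    def write_int32(self, value):
        self._write(struct.pack("<i", value))

    def write_int16(self, value):
        self._write(struct.pack("<h", value))

    def _write(self, raw):
        # compress() may return b"" while zlib buffers input internally.
        self._chunks.append(self._compressor.compress(raw))

    def finish(self):
        # flush() ends the stream and returns whatever is still buffered.
        self._chunks.append(self._compressor.flush())
        return b"".join(self._chunks)


writer = CompressingWriter()
writer.write_int32(123456)
writer.write_int16(-7)
packet = writer.finish()

# The receiver decompresses, then unpacks the fields in the same order.
value32, value16 = struct.unpack("<ih", zlib.decompress(packet))
```

For a payload this small the compressed packet is larger than the raw six bytes, so the pattern only pays off once a packet carries a reasonable amount of data.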

Are there any compression libraries out there capable of handling this type of streaming in a reasonable way? Because I'm fairly certain that the usual suspects (zlib, bzip2, lzma, etc) would force you to compress individual packets.

In theory it should be straightforward to modify an LZ77 coder to see the earlier packets, or an adaptive arithmetic coder to flush its output buffer at the end of a message, and still get excellent compression ratios.

Quote:
Original post by Exustio
Hello, I am working on a graphical MUD and have noticed that I am currently using a lot of bandwidth. I do of course plan to reduce this by being more restrictive with what I send, but I have also considered compressing the data before sending it.

My question is: how should I do this to make it work well?
My first idea is to put everything I want to send into a buffer and, at a fixed interval (25 ms or so), compress and send everything that is in the buffer at that time. Is this a good way to do it?



I have SendCompressedBuffer and ReceiveCompressedBuffer classes that use bzip2 to compress data. After working with some encryption libraries, I highly recommend bzip2, as its API/documentation/... are easy to work with and work as advertised.


Brian Wood
http://webEbenezer.net
(651) 251-9384

zlib supports streamed compression too.
I use TCP with Nagle disabled, buffer the messages myself, and compress them packet-wise with zlib (the fastest setting provides the best ratio for my use case). I also use a homebrewed library to determine which data has to be sent. It replaces the 'server-redirects-messages-to-clients' approach with a 'server-knows-what-a-client-knows-determines-changes-and-visibility-and-sends-only-important-data' approach.
This combination, or at least parts of it, may be interesting for you too:
http://syncsys.sourceforge.net/ (opensource of course)
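
A rough sketch of streamed, packet-wise zlib compression over a single TCP connection, written in Python for brevity (`Z_SYNC_FLUSH` is the same flag in the C API). The helper names and the sample payload are made up for illustration, and this is not how the library linked above does it.

```python
import zlib

# One compressor/decompressor pair per TCP connection, kept alive across packets.
sender = zlib.compressobj(zlib.Z_BEST_SPEED)
receiver = zlib.decompressobj()


def compress_packet(payload: bytes) -> bytes:
    # Z_SYNC_FLUSH emits all pending output on a byte boundary without
    # resetting the history window, so later packets can reference earlier ones.
    return sender.compress(payload) + sender.flush(zlib.Z_SYNC_FLUSH)


def decompress_packet(data: bytes) -> bytes:
    # Packets must be fed to the receiver in the order they were produced.
    return receiver.decompress(data)


update = b"unit=17;hp=93;x=120;y=45;state=walking;"
first = compress_packet(update)
second = compress_packet(update)  # same content, sent again later

first_decoded = decompress_packet(first)
second_decoded = decompress_packet(second)

# The repeat is smaller because zlib can point back into the first packet.
print(len(update), len(first), len(second))
```

Because the stream state is shared, this relies on reliable, ordered delivery, which TCP provides; it would not work unchanged over UDP.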

Quote:
Original post by catch
25ms seems far too frequent for a mostly-event based game like a MUD.
For a MUD? I would think 40 network updates per second is more than any mainstream game should attempt, regardless of the genre. Unless he can guarantee an Ethernet connection, that is way too much traffic.


Compression alone will not help a design that is fundamentally flawed.



The OP should figure out his actual bandwidth requirements, including overhead. A bad ratio of overhead to payload can kill any game, and the solution there is to send fewer packets. If the ratio is fine but the total bandwidth is an issue, then he should first consider a more efficient data structure, and then consider compression as a secondary possible improvement.

The OP should also evaluate the effect of latency in his engine compared against the number of updates. What happens if there is some network instability and those 25ms updates get delayed for 3-5 seconds? If it takes a while to catch up, that could seriously impact the game.

Those answers would help with the questions about what to do next.
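
To make the overhead question concrete, here is a back-of-the-envelope calculation. It assumes every send goes out as its own TCP segment with minimum-size IPv4 and TCP headers (20 bytes each) and ignores ACKs and link-layer framing; the 20-byte payload is a made-up figure, not a measurement from the OP's game.

```python
# Minimum header sizes without options: IPv4 = 20 bytes, TCP = 20 bytes.
IP_TCP_HEADER_BYTES = 20 + 20


def bandwidth(payload_bytes: int, interval_ms: int):
    """Return (bytes_per_second, overhead_fraction) for one send per interval."""
    sends_per_second = 1000 / interval_ms
    wire_bytes = payload_bytes + IP_TCP_HEADER_BYTES
    return wire_bytes * sends_per_second, IP_TCP_HEADER_BYTES / wire_bytes


# Hypothetical 20-byte HP update sent every 25 ms versus every 100 ms.
fast_rate, fast_overhead = bandwidth(20, 25)    # 2400.0 bytes/s, 2/3 overhead
slow_rate, slow_overhead = bandwidth(20, 100)   # 600.0 bytes/s, 2/3 overhead

# Batching four such updates into one 80-byte send every 100 ms.
batched_rate, batched_overhead = bandwidth(80, 100)  # 1200.0 bytes/s, 1/3 overhead
```

Under these assumptions, batching carries the same payload as the 25 ms case at half the bandwidth, without any compression.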

Given that we're talking about TCP here, my suspicion is just that the server is trying to generate far too much data for some reason. Packet overhead probably isn't that big of a deal - if it's being sent regularly, it'll be getting buffered, and if it's being sent infrequently, the bandwidth wouldn't be that high. Of course, the original poster's bandwidth measurement could be wrong, too. Either way, I think compression is the wrong approach at this point. Some real figures on message size and frequency would help.

General compression (zlib, lzma, etc) on packets should be a last resort. Usually, the results are quite disappointing, and you will be adding a lot of overhead to your networking (anywhere from 2x to 20x or more overhead per byte sent). There is nothing wrong with compressing specific things known to compress decently, such as verbose messages (e.g. a long quest description, pages of a book, etc), but you don't want to just mass-compress everything over the network.
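
A quick way to see the difference between the two cases is to compress one of each. The packet layout and the quest text below are invented for the example; the point is only the direction of the size change, so no exact figures are claimed.

```python
import struct
import zlib

# Hypothetical binary update: message id, unit id, x, y, hp (9 bytes).
small_packet = struct.pack("<BHHHH", 3, 1042, 120, 45, 93)

# Hypothetical verbose text of the kind that does compress well.
quest_text = (
    "Bring the sealed letter to the harbour master before nightfall. "
    "The harbour master will reward you once the letter is delivered. "
) * 8

small_compressed = zlib.compress(small_packet)
text_compressed = zlib.compress(quest_text.encode("utf-8"))

print(len(small_packet), "->", len(small_compressed))   # grows
print(len(quest_text), "->", len(text_compressed))      # shrinks
```

The small packet grows because the zlib header and checksum alone cost six bytes, while the repetitive text shrinks considerably.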

First, figure out what exactly is taking so much bandwidth. Add an aggregate counter to your server to keep track of how much data is sent per packet ID, then start working on the most expensive stuff. Oftentimes, excessive bandwidth can be found in:

1. Possible client-side caching. Don't send full game messages (e.g. "You got # gold") to the client. Instead, store this all on the client, and just send the unique ID of the game message and the parameters, and let the client take care of the rest.

2. Sending too much of the same thing. If you are sending the new position of entities every tick (and aren't using UDP with the intention of a low-latency game like a FPS), your bandwidth will definitely take a big hit.

3. Lack of delta updates. If a single stat changes, don't send the new value for all the stats. If a character's name or sprite changes, don't send every bit of data to recreate them. Send updates on what has changed, nothing more. Though don't get carried away with this one before profiling, as it can make maintenance much more difficult.

4. Too many bits. If you have tile-based maps < 255x255, don't send a 32-bit integer as the position. If you have a bunch of bools, don't send them each as a byte. Though again, don't go overboard with this, as it puts some expansion limitations on you. A good bit stream library can be really helpful here.
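
The aggregate counter suggested before the list could look roughly like this. `record_send` and `report` are hypothetical names, and the packet ids and sizes are made up; a real server would call `record_send` wherever it hands a message to the socket.

```python
from collections import Counter

bytes_by_packet_id = Counter()
sends_by_packet_id = Counter()


def record_send(packet_id: int, payload: bytes) -> None:
    """Call wherever the server hands a message to the socket."""
    bytes_by_packet_id[packet_id] += len(payload)
    sends_by_packet_id[packet_id] += 1


def report():
    """Packet ids ordered from most to least total bytes sent."""
    return [
        (packet_id, total, sends_by_packet_id[packet_id])
        for packet_id, total in bytes_by_packet_id.most_common()
    ]


# Hypothetical ids: 1 = map chunk, 2 = HP update.
record_send(1, bytes(900))
record_send(2, bytes(6))
record_send(2, bytes(6))
# report() -> [(1, 900, 1), (2, 12, 2)]
```

Sorting by total bytes rather than by count shows which message type to attack first, which is not always the one sent most often.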

Quake3 used Huffman encoding to compress their final packets. I don't know if it is a dynamic or static table though.

But yeah, in general, that sort of compression should be a last resort. If you send too much data, then the problem is elsewhere, and you should look at minimising the transmission. Compressing packets is more like an optimisation rather than a solution to a problem.

There are schemes such as delta-encoding that can work for you as well. But your best saving will be in managing the network tick and prioritising entity updates.
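
One simple form of delta encoding is a change mask followed by only the values that changed. The sketch below assumes four made-up stats that each fit in an unsigned 16-bit integer; the field list and function names are illustrative only.

```python
import struct

# Hypothetical character stats; the position in this tuple is the bit index.
FIELDS = ("hp", "mana", "x", "y")


def encode_delta(old: dict, new: dict) -> bytes:
    """One mask byte, then a 16-bit value for every field that changed."""
    mask = 0
    values = []
    for bit, name in enumerate(FIELDS):
        if old[name] != new[name]:
            mask |= 1 << bit
            values.append(new[name])
    return struct.pack(f"<B{len(values)}H", mask, *values)


def apply_delta(state: dict, data: bytes) -> dict:
    """Return a copy of state with the changed fields from data applied."""
    mask = data[0]
    changed = [name for bit, name in enumerate(FIELDS) if mask & (1 << bit)]
    values = struct.unpack_from(f"<{len(changed)}H", data, 1)
    return {**state, **dict(zip(changed, values))}


before = {"hp": 100, "mana": 40, "x": 12, "y": 7}
after = {"hp": 93, "mana": 40, "x": 12, "y": 7}

delta = encode_delta(before, after)   # 3 bytes instead of 8 for a full update
restored = apply_delta(before, delta)
```

This only works if the receiver holds the same previous state as the sender, which holds over a reliable, ordered transport such as TCP.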

Yeah, don't get me wrong, I really intend to decrease the amount of data that is transferred, but you don't need to be Einstein to figure that out. What I was wondering was rather whether it is smart to have the server respond at a fixed time interval rather than as soon as it gets a request. As some of you have already pointed out, the TCP headers take up a smaller portion of the total bandwidth if the data is bunched together. Should this be the rule for all communication or just some of it?

At the moment I transfer responses for certain requests instantly (I still bunch all data needed for a certain request together before sending), but updates that tell what other units are doing, and the like, are sent at a fixed time interval so that the data can be bunched together.

And about the time interval, I know that 25 ms is kind of low, but at the moment I am running the whole game at a higher speed so that the simplest kinds of bugs appear faster. I agree that 100 ms is a more realistic time interval.

Quote:
Original post by Exustio
What I was wondering was rather whether it is smart to have the server respond at a fixed time interval rather than as soon as it gets a request. As some of you have already pointed out, the TCP headers take up a smaller portion of the total bandwidth if the data is bunched together. Should this be the rule for all communication or just some of it?

All communication. The only situation where a packet should be sent immediately is for calculating latency. (Then again game loop latency is often more important. That is the time between client input and the next update tick on the server).

Queue up data and design it so that your network outgoing packets can be separated from your server update. For instance, you might update your game at 30 updates/sec but your packets get sent out 5 to 10 packets/sec depending on the player (throttling).
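
A minimal sketch of decoupling the send rate from the update rate, simulated with integer timestamps instead of a real clock and socket. The class and its parameters are assumptions for illustration; messages are simply concatenated here, whereas a real protocol would need framing such as a length prefix.

```python
class OutgoingQueue:
    """Collects messages each game tick, flushes on its own slower schedule."""

    def __init__(self, send_interval_ms: int):
        self.send_interval_ms = send_interval_ms
        self._pending = []
        self._last_send_ms = 0

    def enqueue(self, message: bytes) -> None:
        self._pending.append(message)

    def poll(self, now_ms: int):
        """Return one batched payload if it is time to send, else None."""
        if not self._pending or now_ms - self._last_send_ms < self.send_interval_ms:
            return None
        batch = b"".join(self._pending)
        self._pending.clear()
        self._last_send_ms = now_ms
        return batch


# Simulate one second: the game ticks 30 times, the network sends at most 10 times.
queue = OutgoingQueue(send_interval_ms=100)
sent = []
for tick in range(30):
    now_ms = (tick + 1) * 1000 // 30
    queue.enqueue(b"u")          # stand-in for one tick's update message
    batch = queue.poll(now_ms)
    if batch is not None:
        sent.append(batch)       # a real server would call send() here
```

Giving each player their own queue with a different `send_interval_ms` is one way to get the per-player throttling mentioned above.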

Quote:
Original post by Exustio
I know that 25 ms is kind of low

Wrong use of terminology. 25 ms is an interval here (25 ms between sends), so it corresponds to a high frequency (40 updates/sec), whereas 100 ms is a lower frequency (10 updates/sec).

Quote:
Original post by Exustio
At the moment I transfer responses for certain requests instantly (I still bunch all data needed for a certain request together before sending)

That's fine.
Quote:
, but updates that tell what other units are doing, and the like, are sent at a fixed time interval so that the data can be bunched together.

That's not. Send this stuff less often, and your problem will be solved. Also reduce the area of interest so that a player is only notified of changes to relevant units.
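
An area-of-interest filter can be as simple as a range check before a unit's update is queued for a player. The square range test, the radius and the unit table below are assumptions for illustration; a large map would more likely use a grid or other spatial index.

```python
def units_in_range(viewer, units, radius):
    """Return the ids of units within `radius` tiles of the viewer on both axes."""
    vx, vy = viewer
    return [
        unit_id
        for unit_id, (x, y) in units.items()
        if abs(x - vx) <= radius and abs(y - vy) <= radius
    ]


# Hypothetical unit positions keyed by unit id.
units = {1: (10, 10), 2: (14, 9), 3: (60, 60)}
visible = units_in_range(viewer=(12, 10), units=units, radius=8)  # [1, 2]
```

Only changes to units in `visible` would then be included in that player's update.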

Quote:
And about the time interval, I know that 25 ms is kind of low, but at the moment I am running the whole game at a higher speed so that the simplest kinds of bugs appear faster. I agree that 100 ms is a more realistic time interval.

If getting a bug to appear 75ms sooner is beneficial to you, then you're doing something wrong. What sort of bugs require this?

Quote:
Original post by Spodi
General compression (zlib, lzma, etc) on packets should be a last resort. Usually, the results are quite disappointing, and you will be adding a lot of overhead to your networking (anywhere from 2x to 20x or more overhead per byte sent). There is nothing wrong with compressing specific things known to compress decently, such as verbose messages (e.g. a long quest description, pages of a book, etc), but you don't want to just mass-compress everything over the network.




I find compressing everything works well in what I'm working on, but can imagine that it wouldn't work well if you have a lot of small ( < 1000 bytes) packets being processed.


Brian Wood

Quote:
if you have a lot of small ( < 1000 bytes) packets


A 1000 byte packet is *huge* for games.

If your entire upstream is 8 kB/sec, which should include voice chat, that would mean you could only send 8 packets/second.

Both Sony and Microsoft put the upstream "sweet spot" (99% coverage) at 8 kB/sec, btw.

Quote:
Original post by hplus0603
Quote:
if you have a lot of small ( < 1000 bytes) packets


A 1000 byte packet is *huge* for games.



I guess that explains why compression isn't as important in games as it is in other areas. If there's not enough data to work with, the results of compression aren't as rewarding.


Brian Wood

The way I like to view it is that games apply compression at a much higher (semantic) level. Entropy coding style compression (be it LZW, Huffman, or something else) can help some (say, 30%?) but the big gains (thousands of percent) come from the application layer.

After all, a game is nothing more than a really big, advanced query that runs 60 times per second :-)

"select pixel from .... where input=..." :-)
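
One concrete instance of application-level compression is the client-side message table suggested earlier in the thread: send an id and a parameter instead of the text. The table contents and the 5-byte layout below are made up for the example.

```python
import struct

# Hypothetical table shipped with the client; the server only sends ids.
CLIENT_MESSAGES = {
    1: "You got {} gold",
    2: "You lost {} health",
}


def encode_message(message_id: int, amount: int) -> bytes:
    # 1 byte for the id, 4 bytes for the parameter.
    return struct.pack("<BI", message_id, amount)


def render_message(data: bytes) -> str:
    message_id, amount = struct.unpack("<BI", data)
    return CLIENT_MESSAGES[message_id].format(amount)


wire = encode_message(1, 250)          # 5 bytes on the wire
text = render_message(wire)            # "You got 250 gold"
```

Here 5 bytes replace a 16-character string, and the saving grows with longer messages, without any entropy coder involved.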

Oh yeah, I forgot: if you're interested in some quick theory, you could read this article. The code's kind of verbose, so it's rather easy to understand too.

If you have an example of what you're sending we could give more precise strategies.

