packet parsing

Started by
50 comments, last by hplus0603 16 years, 2 months ago
Quote:Original post by chessmaster42
The stream method works but it is too easy to get malformed data that way.


In fact, the opposite is true. The methods detailed above write some simple methods to write out basic types and then everything is built on top of that. It's very hard to get that wrong because the structure is done essentially once. With a naive system like yours, you need to write a new packet definition for every new piece of info you need to send, and need to worry about how to represent lists, binary data, pointers, etc.

Calling the stuff in this thread "the stream method" is a disservice, by the way. Nobody's talking about explicitly typing out a long list of members to send to a stream and then explicitly typing a similar list to read them back in. Instead the idea is to register properties and that ordering is used automatically for reads and writes, regardless of where the data is going to or coming from (eg. binary file, network, XML). But even if they were just writing out each member one by one to a stream, that's not really any more error-prone than having to inject each member into a packet struct.
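To make the "register once, reuse for reads and writes" idea concrete, here is a minimal sketch. The names `PlayerState`, `ByteWriter` and `ByteReader` are hypothetical and do not come from any poster's code, and the bytes are written in host byte order purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical archive that appends fixed-width fields to a byte buffer.
struct ByteWriter {
    std::vector<std::uint8_t> bytes;

    template <class T>
    void visit(T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "plain data only");
        const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
        bytes.insert(bytes.end(), p, p + sizeof(T));  // host byte order (assumption)
    }
};

// Hypothetical archive that reads the same fields back, with a bounds check.
struct ByteReader {
    explicit ByteReader(const std::vector<std::uint8_t>& source) : bytes(source) {}

    const std::vector<std::uint8_t>& bytes;
    std::size_t pos = 0;

    template <class T>
    void visit(T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "plain data only");
        assert(pos <= bytes.size());
        if (bytes.size() - pos < sizeof(T)) throw std::runtime_error("packet truncated");
        std::memcpy(&value, bytes.data() + pos, sizeof(T));
        pos += sizeof(T);
    }
};

struct PlayerState {
    std::uint32_t id = 0;
    float x = 0.0f;
    float y = 0.0f;
    std::uint16_t health = 0;

    // The field order is stated once and used for both writing and reading.
    template <class Archive>
    void visit(Archive& archive) {
        archive.visit(id);
        archive.visit(x);
        archive.visit(y);
        archive.visit(health);
    }
};
```

Calling `state.visit(writer)` on one side and `state.visit(reader)` on the other keeps the two directions in step, because there is only one list of members to maintain.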
Quote:- Security. Serialization allows you to add implicit checks to serialization itself. If you read raw structs there's nothing preventing the client from sending array length at 0xffffffff, resulting in a big boom on server. Of course, you can add explicit tests for such conditions, but those add to code bloat, since they need to be hard-coded for every field.


That doesn't make any sense at all. If the packet is a struct then the client cannot change any array sizes or anything of that nature. Each variable is static in size. Am I just missing something here perhaps?

Quote:
In fact, the opposite is true. The methods detailed above write some simple methods to write out basic types and then everything is built on top of that. It's very hard to get that wrong because the structure is done essentially once. With a naive system like yours, you need to write a new packet definition for every new piece of info you need to send, and need to worry about how to represent lists, binary data, pointers, etc.


I've said this once and I'll say it again, I did NOT include all of my code. Many of my packets use pointers so that I can easily expand if needed. I just didn't post it because it wasn't needed.

Quote:
Calling the stuff in this thread "the stream method" is a disservice, by the way. Nobody's talking about explicitly typing out a long list of members to send to a stream and then explicitly typing a similar list to read them back in. Instead the idea is to register properties and that ordering is used automatically for reads and writes, regardless of where the data is going to or coming from (eg. binary file, network, XML). But even if they were just writing out each member one by one to a stream, that's not really any more error-prone than having to inject each member into a packet struct.


Thank you. You're the first person to make a distinction between the confusing jumble of definitions of "the stream method". And what you say is true and makes sense.

It's not enough just to program, you have to put your heart and soul into every byte of source in order to show your worth.
Quote:Original post by chessmaster42

That doesn't make any sense at all. If the packet is a struct then the client cannot change any array sizes or anything of that nature. Each variable is static in size. Am I just missing something here perhaps?


The client can change anything it wants. One doesn't even need a client to do that; just run Wireshark and spoof some packets.

Also, how will you send a variable-sized list of objects?

Let's say you have two cluster nodes which need to register mutual interest. Each of them sends a list of objects. This list might be empty, have 10 members or have 10,000. The theoretical overlap is 2^32 objects. Even if that many cannot be sent (16 GB at 4 bytes per ID), the upper limit is arbitrary. Would you always send a hard-coded packet n megabytes in size, just so you're safe?

Or, delta state. You need to send all the changes that have occurred on an object. The motivation for this is minimization of network traffic. So you send a list of tuples (#1, 17)(#4, "NewName")(#58, 2.43)(#78, 0). This takes a dozen or so bytes to send.

If you send this as a fixed structure, you'll always need to send the entire state, thereby defeating the main reason for delta states, which can save 80-95% of network bandwidth. Or, you'll need to specify the length of the data you're sending. And presto, you have your std::list in your packet.

And how do you send a string? Let's say my packet contains this:
struct NamePacket {
  char name[32];
};


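What happens if the client sends 32 non-zero bytes? The name then has no terminator, so any code that treats it as a C string reads past the end of the array, and you have a buffer overrun.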
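One way to avoid depending on a terminator is to send a length and validate it before touching the data. The sketch below assumes a one-byte length prefix and a 31-character limit; both are choices made for this example, not something specified in the thread, and `readName` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

constexpr std::size_t kMaxNameLength = 31;  // assumed application limit

// Reads [length:u8][length characters] starting at pos and advances pos.
// Every claim made by the sender is checked against what actually arrived.
std::string readName(const std::vector<std::uint8_t>& packet, std::size_t& pos) {
    assert(pos <= packet.size());
    if (pos == packet.size()) throw std::runtime_error("missing name length");

    const std::size_t length = packet[pos++];
    if (length > kMaxNameLength) throw std::runtime_error("name too long");
    if (packet.size() - pos < length) throw std::runtime_error("name truncated");

    std::string name(reinterpret_cast<const char*>(packet.data()) + pos, length);
    pos += length;
    return name;
}
```

A packet that claims 200 characters, or claims 5 and delivers 1, is rejected before any copy takes place.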

Worse yet!

Using this is horrible for security. Look at the Eternal Lands post-mortem. Because they used this approach, they exposed users' passwords in buffers that weren't cleared beforehand. That should not happen with explicit serialization. This is a real-life example, not some contrived scheme.
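The general mechanism behind that kind of leak can be sketched as follows. This is an illustration only and is not based on the Eternal Lands code; `LoginPacket`, `rawBytes` and `serializedBytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct LoginPacket {
    char name[32];
};

// Raw approach: all 32 bytes go out, including whatever follows the terminator.
std::vector<std::uint8_t> rawBytes(const LoginPacket& packet) {
    const auto* bytes = reinterpret_cast<const std::uint8_t*>(&packet);
    return std::vector<std::uint8_t>(bytes, bytes + sizeof packet);
}

// Explicit approach: a length byte plus only the characters before the terminator.
std::vector<std::uint8_t> serializedBytes(const LoginPacket& packet) {
    const void* end = std::memchr(packet.name, '\0', sizeof packet.name);
    const std::size_t length =
        end ? static_cast<std::size_t>(static_cast<const char*>(end) - packet.name)
            : sizeof packet.name;
    assert(length <= sizeof packet.name);

    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(length));
    out.insert(out.end(), packet.name, packet.name + length);
    return out;
}
```

If the buffer previously held a longer string and is then overwritten with "Bob", `rawBytes` still contains the tail of the old string after the terminator, while `serializedBytes` is four bytes long and contains nothing else.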

Yes, sending &struct works. But it should only be applied in rare circumstances in trusted and controlled environments for small-scale projects.

Everywhere else the difficulties, both technical and managerial, make it impractical, especially since the only benefits (such as performance) are too small, if they exist at all.

BTW: In a mostly C-based project, using statically allocated structures that are shared directly across the network by sending raw data is great.

There used to be a library that used this approach for a shared-memory C++ allocator, and there was a project that used the same approach for distributed shared memory.

This type of approach has definite use, and several practical applications. But for any reasonably complex application, especially one where peers are not trusted or not controlled, I would be hard to convince that sending raw memory has any benefits.
Structs just aren't scalable. Also, if server-side verification is set up properly, security should never be a problem.

Okay, here's the easiest way to figure this out. Pretend you have 100 entities and you want to serialize only the delta information. How would you do it, chessmaster42?

Ideally one iterates and uses 1 bit bools to tell if the next state data has changed.
// write the number of entities
foreach (entity in entities) {
    // write entity ID or some identifier
    packet.WriteBool(position.HasChanged());
    if (position.HasChanged()) {
        // write position vector
    }
    // continue with all data
}
If nothing changes for an entity, the worst case without further optimization is one bit for each separately tracked piece of state.
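A compilable version of that loop might look like the sketch below. The `BitWriter` class, the `Entity` layout and the field widths are assumptions made for this example; they are not the poster's actual packet class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bit-level writer; bits are packed least-significant first.
class BitWriter {
public:
    void writeBool(bool flag) { writeBits(flag ? 1u : 0u, 1); }

    void writeBits(std::uint32_t value, int count) {
        assert(count >= 0 && count <= 32);
        for (int i = 0; i < count; ++i) {
            if (bitCount_ % 8 == 0) bytes_.push_back(0);  // start a new byte
            if ((value >> i) & 1u)
                bytes_.back() |= static_cast<std::uint8_t>(1u << (bitCount_ % 8));
            ++bitCount_;
        }
    }

    std::size_t bitCount() const { return bitCount_; }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t bitCount_ = 0;
};

struct Entity {
    std::uint16_t id;
    std::int16_t x, y;
    std::uint8_t health;
    bool positionChanged;
    bool healthChanged;
};

// Layout: [count:16] then per entity [id:16][posChanged:1][x:16 y:16]?[hpChanged:1][hp:8]?
void writeDelta(BitWriter& out, const std::vector<Entity>& entities) {
    assert(entities.size() <= 0xFFFF);
    out.writeBits(static_cast<std::uint32_t>(entities.size()), 16);
    for (const Entity& e : entities) {
        out.writeBits(e.id, 16);
        out.writeBool(e.positionChanged);
        if (e.positionChanged) {
            out.writeBits(static_cast<std::uint16_t>(e.x), 16);
            out.writeBits(static_cast<std::uint16_t>(e.y), 16);
        }
        out.writeBool(e.healthChanged);
        if (e.healthChanged) out.writeBits(e.health, 8);
    }
}
```

With this layout an unchanged entity costs 18 bits (its ID plus two flags), and the reader mirrors the same sequence of reads to decode it.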

Very trivial to write and format for the client and server. (Another way, which I'm still wrapping my brain around, is Hplus's method.) If you have a very solid binary packet, it's easy to serialize anything. Take a list: packet.WriteList<string>(list, 0, list.length); easy stuff that speeds up programming time. Need to send a lobby server list? Just iterate and serialize the data by data types in your own format.

Creating a struct for everything is just code bloat and doesn't do anything that serializing the data types manually won't do.
Quote:Original post by chessmaster42
I've said this once and I'll say it again, I did NOT include all of my code. Many of my packets use pointers so that I can easily expand if needed. I just didn't post it because it wasn't needed.


So post your code. You can't claim your method is at all better if you're not explaining how you'd address the same problems we're talking about, ie. sending potentially complex data structures to the network and potentially other destinations in a safe manner.
Quote:Original post by Kylotan
Quote:Original post by chessmaster42
I've said this once and I'll say it again, I did NOT include all of my code. Many of my packets use pointers so that I can easily expand if needed. I just didn't post it because it wasn't needed.


So post your code. You can't claim your method is at all better if you're not explaining how you'd address the same problems we're talking about, ie. sending potentially complex data structures to the network and potentially other destinations in a safe manner.


Dude, first of all I'm not trying to pick a fight here. I was merely presenting a simple alternative for the guy who was just getting started in this (the guy who started the post). I tried to explain this earlier. Second I'm NOT claiming my method is better. It's merely an alternative. And no, my method cannot natively send complex data structures but that isn't what it's designed for. It's designed for fast, secure, and efficient netcode for client-server communications primarily for multiplayer games.

I'm sure your method works great for inter-server communications but I don't think that is the intent of the original poster. Hence my posts here. I'm trying to be helpful and present alternatives rather than just going with the flow.

As far as posting the entirety of my code, that's an unfortunate no-can-do as it's part of a commercial project. And I know that makes me look like a fraud when I talk about the code that no one can see. But if you guys want to believe that, that's your own prerogative. You don't have to believe me.
It's not enough just to program, you have to put your heart and soul into every byte of source in order to show your worth.
The simple fact of the matter is that sending structs, the way you do, will not work or has security and efficiency problems for anything complex, where "complex" includes things like lists or arrays of things, strings, etc. You can work around these issues with a lot of point solutions. For example, if you have an array of "object IDs" where the array can be of size 0 .. 1000, always sending 1000 object IDs (999 of which are empty) is an obvious non-starter.
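For the object ID case, a count-prefixed list is one common workaround. The sketch below assumes a 16-bit count, 32-bit IDs and little-endian byte order; `writeIds` and `readIds` are hypothetical helpers, and the cap of 1000 is taken from the example in the paragraph above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::size_t kMaxObjectIds = 1000;

// Layout: [count:u16][id:u32] * count, all little-endian.
std::vector<std::uint8_t> writeIds(const std::vector<std::uint32_t>& ids) {
    assert(ids.size() <= kMaxObjectIds);
    std::vector<std::uint8_t> out;
    out.reserve(2 + ids.size() * 4);
    out.push_back(static_cast<std::uint8_t>(ids.size() & 0xFF));
    out.push_back(static_cast<std::uint8_t>(ids.size() >> 8));
    for (std::uint32_t id : ids)
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<std::uint8_t>(id >> shift));
    return out;
}

// Rejects a count that exceeds the cap or disagrees with the bytes received.
std::vector<std::uint32_t> readIds(const std::vector<std::uint8_t>& packet) {
    if (packet.size() < 2) throw std::runtime_error("missing count");
    const std::size_t count = packet[0] | (static_cast<std::size_t>(packet[1]) << 8);
    if (count > kMaxObjectIds) throw std::runtime_error("too many ids");
    if (packet.size() - 2 != count * 4) throw std::runtime_error("size mismatch");

    std::vector<std::uint32_t> ids;
    ids.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t id = 0;
        for (int b = 0; b < 4; ++b)
            id |= static_cast<std::uint32_t>(packet[2 + i * 4 + b]) << (8 * b);
        ids.push_back(id);
    }
    return ids;
}
```

An empty list costs 2 bytes instead of 4000, and a forged count is caught before any memory is reserved for it.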

Once those issues are worked around, you end up with something that takes in-game data, and formats it into some hunk of bytes that you send -- which is exactly what the others (including me) on this thread have been discussing. Posting just the code that says "send(socket, &struct, sizeof(struct), 0)" is dangerous and a disservice within this context, in my opinion (but you are entitled to your opinion).
enum Bool { True, False, FileNotFound };
Quote:Original post by chessmaster42
I was merely presenting a simple alternative for the guy who was just getting started in this (the guy who started the post). I tried to explain this earlier. Second I'm NOT claiming my method is better. It's merely an alternative. And no, my method cannot natively send complex data structures but that isn't what it's designed for.

Maybe I don't understand why you use this method at all. It seems inferior to other methods. Also I don't think it will help the OP much as you didn't really show any code. I can understand if it's for a commercial project, but showing examples of what you mean instead of just saying, "it's an alternative and is awesome" is a lot less vague.


Quote:Original post by chessmaster42
It's designed for fast, secure, and efficient netcode for client-server communications primarily for multiplayer games.

If you're using structs then where does the bit packing come in that you say you are using? I don't understand what you mean by this. I mean when I think of bit packing I think of things like this where data types are serialized in a format.

Quote:Original post by chessmaster42
I'm sure your method works great for inter-server communications but I don't think that is the intent of the original poster. Hence my posts here. I'm trying to be helpful and present alternatives rather than just going with the flow.
There is no "flow". Some people like using structs and others use "binary packets" with contiguous data formats. I don't think there is a difference between server-server communication and client-server if things are set up correctly. I design larger scale multiplayer games, and the cluster server packets are not designed any different than the server-client packets. Maybe you designed something wrong or are over complicating things?
Quote:
Maybe I don't understand why you use this method at all. It seems inferior to other methods. Also I don't think it will help the OP much as you didn't really show any code. I can understand if it's for a commercial project, but showing examples of what you mean instead of just saying, "it's an alternative and is awesome" is a lot less vague.


What you say is true. The method is most certainly NOT superior to a highly optimized bit-packing serialization setup. I tried to say this earlier. It is merely a simple alternative for a beginner, not for a high-end server network sending multi-gigabyte amounts of data. The complete setup that we use in our project is much more complicated. What I posted was just the basics so the OP could get started. I didn't expect to get this rash of hateful responses. I have always found Gamedev to be extremely helpful but this rather shakes my confidence.

Quote:
If you're using structs then where does the bit packing come in that you say you are using? I don't understand what you mean by this. I mean when I think of bit packing I think of things like this where data types are serialized in a format.


The bit packing that we use is NOT data type specific. I apologize for not explaining this better earlier. I was tired when I was posting most of this stuff. It works very similarly to your .tar or .rar archive packing methods. It takes the packet data as a whole and compresses it. This saves time and headache versus trying to pack and unpack each variable at a time. If someone has experience with both ways of doing things and finds one to be faster / more efficient please say so. We just found that packing the packet as a whole works better.

Quote:
There is no "flow". Some people like using structs and others use "binary packets" with contiguous data formats. I don't think there is a difference between server-server communication and client-server if things are set up correctly. I design larger scale multiplayer games, and the cluster server packets are not designed any different than the server-client packets. Maybe you designed something wrong or are over complicating things?


First, the underlying packet system that we use is the same for client-server as well as server-server communications. The difference is the quantity and complexity of the data being sent. Also, another difference I see is that client-server communications (for a game anyway) are going to be sending minimal packets to optimize the use of bandwidth. Plus, in my opinion, clients will not (perhaps should not) send as much data to the server as the servers send amongst themselves. Unless I'm missing something here, aren't the servers going to be sending large complex packets to each other to save CPU time where the clients will be sending quick small packets to save bandwidth?
It's not enough just to program, you have to put your heart and soul into every byte of source in order to show your worth.
Quote:Original post by chessmaster42

The bit packing that we use is NOT data type specific. I apologize for not explaining this better earlier. I was tired when I was posting most of this stuff. It works very similarly to your .tar or .rar archive packing methods. It takes the packet data as a whole and compresses it. This saves time and headache versus trying to pack and unpack each variable at a time. If someone has experience with both ways of doing things and finds one to be faster / more efficient please say so. We just found that packing the packet as a whole works better.


That would be entropy encoding. It usually sits between serialization and the socket, ahead of any encryption.

General network communication will look something like this:
  Application
       v
  Serialization
       v
  Packet construction
       v
  Entropy encoding
       v
  Encryption
       v
  Checksum
       v
  Socket


Entropy encoding is generally by far the slowest part here since it involves lots of branching and bit manipulation.

There are several possibilities for implementing the above. Entropy encoding, encryption and checksum can be merged into a single-pass algorithm, which uses client-specific information as the encryption key and uses entropy coding as the encryption algorithm.

Packet construction is merely concerned with breaking down the serialized data into proper packets, possibly splitting them, merging them or otherwise fitting them into packets.

Or, you can layer it, and then use a conventional library such as zlib or LZO for compression, a standard cipher such as AES for encryption, and some CRC for the checksum.
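As an illustration of one data-agnostic layer, here is a sketch of the checksum stage only, using the standard CRC-32 polynomial in its reflected form (0xEDB88320). Compression and encryption are left out, and the function names and the trailing little-endian placement of the checksum are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Slow but short.
std::uint32_t computeCrc32(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Appends the checksum of the payload as four little-endian bytes.
std::vector<std::uint8_t> appendChecksum(std::vector<std::uint8_t> payload) {
    const std::uint32_t crc = computeCrc32(payload.data(), payload.size());
    for (int shift = 0; shift < 32; shift += 8)
        payload.push_back(static_cast<std::uint8_t>(crc >> shift));
    return payload;
}

// Returns false if the packet is too short or the checksum does not match.
bool verifyAndStrip(const std::vector<std::uint8_t>& packet,
                    std::vector<std::uint8_t>& payload) {
    if (packet.size() < 4) return false;
    const std::size_t n = packet.size() - 4;
    std::uint32_t stored = 0;
    for (int b = 0; b < 4; ++b)
        stored |= static_cast<std::uint32_t>(packet[n + b]) << (8 * b);
    if (computeCrc32(packet.data(), n) != stored) return false;
    payload.assign(packet.begin(), packet.begin() + static_cast<std::ptrdiff_t>(n));
    return true;
}
```

Note that nothing here knows what the bytes mean. The same functions work whether the payload came from a struct, a stream or a compressed blob.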

All of these, however, are data agnostic. They operate on raw bytes only, whereas serialization is more concerned with providing a meta structure for those bytes, some sort of semantic information.

This in turn improves code by turning abstract bytes into semantic constructs that are understood, and verified by compiler.

This allows one to use richer data structures. For example:
// In line with above visitor serialization
#include <stdexcept>

template < int Min, int Max >
class RangedInt {
public:
  void set(int newValue) {
    if (inRange(newValue)) {
      value = newValue;
    } else {
      throw std::out_of_range("Value out of range");
    }
  }
  template < class Archive >
  void visit( Archive & archive, const char * name )
  {
    // the archive throws an exception if the value read is not in Min-Max range
    archive.visit(value, name, Min, Max);
  }
private:
  // helper implied by set(): true if v lies in [Min, Max]
  static bool inRange(int v) { return v >= Min && v <= Max; }
  int value = Min;
};


This approach allows you to use very rich data types that contain information relevant to application layer as well. Same goes for various properties, lists and other containers.

It also completely de-couples what our Archive is. It can be a stream, but it can also be an SQL query, or a design tool that builds a property editor, or a file, or a debug std::cout dump....
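To show the decoupling, here is a sketch with two interchangeable archives driving the same `visit` call. `LoadArchive` and `DumpArchive` are hypothetical and exist only for this example; the only thing taken from the post is the `archive.visit(value, name, Min, Max)` call shape.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

template <int Min, int Max>
class RangedInt {
    static_assert(Min <= Max, "empty range");
public:
    int get() const { return value; }

    template <class Archive>
    void visit(Archive& archive, const char* name) {
        archive.visit(value, name, Min, Max);
    }
private:
    int value = Min;
};

// Hypothetical archive: loads already-decoded integers and enforces the range.
struct LoadArchive {
    explicit LoadArchive(std::vector<int> input) : values(std::move(input)) {}

    std::vector<int> values;
    std::size_t next = 0;

    void visit(int& value, const char* name, int min, int max) {
        assert(next <= values.size());
        if (next == values.size()) throw std::runtime_error(std::string(name) + ": no data");
        const int incoming = values[next++];
        if (incoming < min || incoming > max)
            throw std::out_of_range(std::string(name) + ": out of range");
        value = incoming;  // only assigned after validation
    }
};

// Hypothetical archive: writes a human-readable dump instead of reading.
struct DumpArchive {
    std::ostringstream out;

    void visit(int& value, const char* name, int min, int max) {
        out << name << '=' << value << " [" << min << ".." << max << "]\n";
    }
};
```

`RangedInt` never learns which archive it is talking to, so adding a network reader, a file writer or an editor binding means writing another archive type, not touching the data types.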

This topic is closed to new replies.