# Packet byte format


I am curious as to how effective my current method of formatting the bytes in packets for multiplayer games is. I have made basic multiplayer games in the past (multiple users in an environment who can move from a(x,y) to b(x,y) with A* pathfinding on a grid).

I have always formatted my information like this:
- The first 2 bytes say how many of the following bytes belong to the command (unsigned short).
- The next 2 bytes hold the op-code for this particular command.
- The rest of the command's bytes are laid out in whatever way that op-code expects, and are extracted and cast appropriately.

The networking code can then separate the commands in the input buffer in order, because it knows how many bytes each one takes up.
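A minimal sketch of the framing described above, not the poster's actual code. It assumes little-endian fields and that the length counts everything after itself (op-code plus payload); `write_command` and `read_command` are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 4u /* 2-byte length + 2-byte op-code */

/* Read/write 16-bit values in an explicit (little-endian) byte order. */
static uint16_t get_u16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static void put_u16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v & 0xFF); p[1] = (uint8_t)(v >> 8); }

/* Append one command to out. Returns bytes written, or 0 if it does not fit.
   Assumption: the length field counts the op-code plus the payload. */
size_t write_command(uint8_t *out, size_t cap, uint16_t opcode,
                     const void *payload, uint16_t payload_len)
{
    assert(out != NULL);
    size_t total = HEADER_SIZE + payload_len;
    if (total > cap || payload_len > 0xFFFF - 2) return 0;
    put_u16(out, (uint16_t)(payload_len + 2));
    put_u16(out + 2, opcode);
    if (payload_len) memcpy(out + HEADER_SIZE, payload, payload_len);
    return total;
}

/* Try to take one complete command from the front of buf.
   Returns bytes consumed, 0 if more data must arrive first,
   or (size_t)-1 if the length field is malformed. */
size_t read_command(const uint8_t *buf, size_t len, uint16_t *opcode,
                    const uint8_t **payload, uint16_t *payload_len)
{
    if (len < 2) return 0;
    uint16_t following = get_u16(buf);
    if (following < 2) return (size_t)-1;   /* no room for an op-code */
    if (len < 2u + following) return 0;     /* partial command: wait for more */
    *opcode = get_u16(buf + 2);
    *payload = buf + HEADER_SIZE;
    *payload_len = (uint16_t)(following - 2);
    return 2u + following;
}
```

Calling `read_command` in a loop on the receive buffer, and keeping any unconsumed tail for the next `recv`, is one way to cope with TCP delivering half a command.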

Multiple commands can be put into one packet. Provided the packet is transmitted cleanly, I had no issues with this technique (I was using TCP just to test it out). Of course, an operation doesn't have to behave like a command. For example, if I were sending the player's location in an FPS-style game, it would have 2 bytes for the length of the "command", then 2 bytes for the op-code, followed by 4 bytes describing the player's location at that time. Within the same tick, the client would put a command at the start of the packet giving the timestamp of that particular tick. Data could be collected over a period of 100 ms to interpolate the movement without predicting the future (causing a small lag, but it works for the Source engine so it must be good enough for me), or it could predict one tick into the future after the 100 ms history of locations.

I was hard coding all the op-codes in a switch statement to keep it robust, but I feel I need to do something more adaptable if I intend on making a proper networked indie game.

I understand I could use something like RakNet to do all this (and more) for me, but I want to learn everything from the low level up, to keep it as efficient as possible and to gain a much greater depth of understanding. Replicating objects and values between devices sounds extremely bloatworthy and hackable, especially if I am unaware of the specifics underneath the library.

Are there any good articles or white papers regarding what I am trying to do? I thought up my method myself because I was unable to find a proper example of how to do it correctly. I will also have to learn how to synchronise threads and communicate between them properly.

/wall of text

##### Share on other sites
Quote:
 I have always formatted my information as such:
 First 2 bytes describe how many bytes following are part of the command (unsigned short)
 Second 2 bytes describe the op-code for this particular command
 Rest of the command's bytes are formatted in the way understood by the particular op-code to perform the command, and are extracted and typecasted appropriately

What is the purpose of the first two bytes? What is the purpose of the second two? The last part is fine.

##### Share on other sites
Quote:
Original post by smasherprog
Quote:
 I have always formatted my information as such:
 First 2 bytes describe how many bytes following are part of the command (unsigned short)
 Second 2 bytes describe the op-code for this particular command
 Rest of the command's bytes are formatted in the way understood by the particular op-code to perform the command, and are extracted and typecasted appropriately

What is the purpose of the first two bytes? what is the purpose of the second two? The last part is fine.

He just told you the purpose, you even quoted it.

smasherprog, the formatting you describe is what I have most often used as well, and what I see used in many cases.

Alternatively, you can lead with an op-code and derive the size from a struct associated with that op-code. For op-codes whose size you already know, you then aren't sending a useless 2 bytes. For op-codes that need it, such as chat messages or anything with variable-length strings/data, the 'data' part of the packet can itself lead with a size, so you keep the flexibility of sending arbitrary-length messages.

I think many people learn networking by basically sending a struct and then receiving the message into the same struct at the other end. It takes the least code, but it's also the least flexible/portable.
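As a rough sketch of the op-code-first layout, assuming a lookup table rather than per-op-code structs: the op-codes, their sizes and the `message_size` helper below are all hypothetical, and fields are assumed little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op-codes, for illustration only. */
enum { OP_MOVE = 1, OP_RELOAD = 2, OP_CHAT = 3, OP_COUNT };

#define SIZE_VARIABLE (-1) /* payload starts with its own 2-byte length */
#define SIZE_UNKNOWN  (-2)

/* Fixed payload size per op-code; no size bytes go on the wire for these. */
static const int payload_size[OP_COUNT] = {
    [0]         = SIZE_UNKNOWN,
    [OP_MOVE]   = 4,
    [OP_RELOAD] = 0,
    [OP_CHAT]   = SIZE_VARIABLE,
};

/* Returns the total size of the message at the front of buf (op-code included),
   0 if more bytes are needed, or -1 if the op-code is unknown. */
long message_size(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2) return 0;
    unsigned op = (unsigned)(buf[0] | (buf[1] << 8));
    if (op >= OP_COUNT || payload_size[op] == SIZE_UNKNOWN) return -1;
    long total = 2;
    if (payload_size[op] == SIZE_VARIABLE) {
        if (len < 4) return 0;                      /* length prefix not here yet */
        total += 2 + (buf[2] | (buf[3] << 8));
    } else {
        total += payload_size[op];
    }
    return (size_t)total <= len ? total : 0;
}
```

Note the `-1` case: once an unknown op-code turns up, the receiver has no way of knowing where the next message starts.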

##### Share on other sites
I was hoping that he could think it out himself instead of just being an answer machine. That is why I asked about those two questions. Since you already suggested it; for my networking, I use your suggestion: lead with the op-code, then pass the rest of the packet to a function that knows how to deal with the rest of the data.

To me, this leads to simpler code: just need a leading op-code (fixed size of 2 bytes is more than adequate), then the remaining size is known to the specific function.

##### Share on other sites
Quote:
 Original post by smasherprog
 I was hoping that he could think it out himself instead of just being an answer machine. That is why I asked about those two questions. Since you already suggested it; for my networking, I use your suggestion: lead with the op-code, then pass the rest of the packet to a function that knows how to deal with the rest of the data.

The advantage of having the size first is that it allows processing of unknown commands (IFF and similar): a receiver can skip a command it doesn't recognise. Aside from trivial error checking, it also allows a robust transition between versions. Another benefit is that it works over TCP with no modification (it handles buffering implicitly).

But perhaps more importantly: diagnostics and debugging. With explicit payload sizes, the inspection tool can be dumb yet still separate the content, perhaps even log it. With a fixed per-message header, this tool can classify and dump messages from the raw stream without knowing the contents, and without performing possibly complex session/protocol/version correlation, such as when diagnosing client upgrade issues or when multi-version clients are allowed in the first place.
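To illustrate how "dumb" such an inspection tool can be, here is a sketch that walks a raw capture and logs every frame without understanding any payload. It assumes the length-first layout from the original post (2-byte length counting op-code plus payload, then a 2-byte op-code, both little-endian); `dump_stream` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Walk a raw capture of [u16 length][u16 op-code][payload] frames and log each one.
   Returns the number of complete frames; *leftover receives the count of trailing
   bytes that do not (yet) form a whole frame. Unknown op-codes are skipped over
   like any other, because the length alone says where the next frame starts. */
size_t dump_stream(const uint8_t *buf, size_t len, FILE *log, size_t *leftover)
{
    assert(buf != NULL && leftover != NULL);
    size_t pos = 0, frames = 0;
    while (len - pos >= 4) {
        size_t following = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
        unsigned op = (unsigned)(buf[pos + 2] | (buf[pos + 3] << 8));
        if (following < 2 || len - pos < 2 + following)
            break; /* malformed or partial frame: stop here */
        if (log)
            fprintf(log, "offset %zu: op-code %u, %zu payload bytes\n",
                    pos, op, following - 2);
        pos += 2 + following;
        frames++;
    }
    *leftover = len - pos;
    return frames;
}
```

Passing `stdout` as `log` prints one line per frame; passing `NULL` just counts them.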

Quote:
 To me, this leads to simpler code: just need a leading op-code (fixed size of 2 bytes is more than adequate), then the remaining size is known to the specific function.

The downsides are:
- there can only be one op-code/message per UDP packet, and with TCP a single broken message permanently breaks the stream
- if the protocol changes in any way, or an old-version client connects, it will not be able to interpret unknown or changed messages, or even detect them. Without fancy serialization it could also lead to annoying issues, such as a change in data order meaning that what used to be Y is now a string length

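Considering that the protocol headers alone cost at least 40 bytes per TCP/IPv4 segment (42 for a UDP datagram once the 14-byte Ethernet header is counted), the extra size bytes are a fairly good investment, since they buy forward and backward compatibility and robustness.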

Usually, when a connection is established, a version parameter would be sent as well. While this could be used to synchronize peers to the proper version, it doesn't alleviate the other problems.

And for anyone who wants to be fancy, there's always the possibility of a variable-length encoding of the payload size (see UTF-8 for an example), or perhaps mangling bits of the op-code in there as well, or perhaps doing some kind of dynamic Huffman scheme to generate op-codes on the fly (might help with replay attacks).
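One common way to do a variable-length size is sketched below. Note that this is the 7-bits-per-byte "varint" scheme (as used by LEB128), not UTF-8's exact bit layout, which was only mentioned as an analogy; the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value using 7 data bits per byte; a set top bit means "more bytes follow".
   Returns bytes written (1 to 5 for a 32-bit value). out must hold at least 5 bytes. */
size_t varint_encode(uint32_t value, uint8_t *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)(value | 0x80); /* low 7 bits plus continuation flag */
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Decode one value. Returns bytes consumed, or 0 if the input is truncated
   or runs past 5 bytes without terminating. */
size_t varint_decode(const uint8_t *in, size_t len, uint32_t *value)
{
    uint32_t result = 0;
    for (size_t i = 0; i < len && i < 5; i++) {
        result |= (uint32_t)(in[i] & 0x7F) << (7 * i);
        if (!(in[i] & 0x80)) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this scheme, payload sizes below 128 cost a single byte, and larger ones grow by a byte per extra 7 bits.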

But considering all the noise that is going over the wires these days in the form of escaped JSON encoded via MIME over HTTP, the two bytes won't really show up anywhere.

##### Share on other sites
Wow, I didn't realise I was doing it so similarly to how others do. I only recently discovered the struct-passing method (I haven't used C++ in forever). When I first programmed a networked game, I had the client in Game Maker and the server in a PHP command-line daemon. Because the Game Maker sockets extension pulled data into a string, I had to avoid null bytes and special ASCII characters in the packets by having no byte value less than 32! But that was years ago and I don't need Game Maker anymore :P

I am confident that my technique will suit my needs from now on; I was expecting to be miles off. I think in actual fact I have been sending the command length first - I typed up my thread without my old code for reference and I wasn't sure how I had done it. To tell the truth though, I didn't spot that the two ways (code first or length first) were any different from each other.

Thanks for the help. Now I need to delve into Winsock properly - I should make a sockets wrapper first, though, so I can develop the server for Linux and Windows with the same socket functions.

##### Share on other sites
I am actually doing the same as you; the difference is that I have 2 bytes for the size and 1 byte for the command.
I figured that if I ever reach command 255, I can always expand into another byte.

Makes it easy for me to send strings of varying size unlike a struct, where I probably have to send a larger string buffer even if I only used a small amount of it or do some strange ugly fix.

##### Share on other sites
Now I need to worry about synchronising and controlling a stable tick rate to synchronise the nodes with. And things such as late-joining and application hangs.

Not to mention the multithreaded requirements of a proper networking system. Luckily though, I only need something that will work over a LAN first (TCP should be ok) then I can take a broad look at my design and take what I have learned to design a new system that will be functional over the internet. I know my first attempt will be flawed and I am expecting that.

Here are the contents of a notepad document that I have been putting my thoughts into:

Server-Client synchronisation.

- Tick commands will describe what tick the commands following them occur at, eg:
[tick 1449][player location update][player reload][player location update][tick 1450][player location update][player location update][player shoot]
This prevents a tick value needing to be sent with every command. The networking code can split
the packets up into tick information. If an output packet being prepared becomes too big for the MTU, the
previous tick still stands as the tick for commands at the start of the next packet. What implications
will this have? Provided that this only happens when the connection is unsatisfactory, it could be allowed
to have an effect on gameplay - ie, dropped commands (the tick will already be dealt with on the receiving
end and the information is now out of date). Therefore, certain pieces of information would not need to be
sent at that point - ie, values which are instantly out of date if they are updated on the intended tick for
the packet. Should packets always have the true tick at the start, followed by commands - even a command
dictating the previous tick followed by out of date information from that tick? When the packet is prepared
for sending - anything from the previous tick which is altered in the current tick should be stripped out
and not even sent across the network.

- The simulation (at least the networked component) runs at a specific tick rate.

- Interpolated values (such as moving object locations and angles) are collected over a short period
of time (such as 1/10 of a second, or a number of ticks) before the object is interpolated along the
given points. This creates a minimum lag but all movement/turning should appear smooth.

- All devices (server and client) count the tick by themselves but would ideally be on the same tick
at any given point in real time, though expected lag will mean this isn't fully possible. I am not
sure if the tick will dictate any other things within the game such as a reference point for other
timing - but it is what dictates the network and game logic. The PhyreEngine may not have been
designed with real-time interactive multiplayer in mind.

- The server is authoritative regarding the correct tick that clients should be on.

- A client can drift a few ticks ahead or behind the server (values can be set in configuration).

- If a client's tick drifts too far away from the server's, the client will be "bungee roped" forwards
or backwards into synchronisation - this requires some investigation.

- A late joining client will have its tick adjusted over a short period of time after connecting
when enough information is discovered (lag compensation) to decide what the correct tick should be.
Clients could count from 0 and use an offset, but this is not any different from resyncing them.

- Ticking may be calculated on the networking thread or a separate tick calculating thread, this
must be calculated as accurately as possible. Due to scheduler and multithreaded application
quirks and low level issues, only some tick rates will be possible to maintain - this requires some
degree of investigation because the method must be stable and accurate for as long as possible.
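The tick-marker idea from the first note can be sketched as follows. This is only one reading of it: commands are shown already parsed, the op-codes and struct names are invented, and the "previous tick still stands" rule is modelled by keeping the current tick in state that outlives a single packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op-codes, for illustration only. */
enum { OP_TICK = 0, OP_LOCATION = 1, OP_RELOAD = 2, OP_SHOOT = 3 };

typedef struct { uint16_t opcode; uint32_t value; } Command;      /* value = tick number for OP_TICK */
typedef struct { uint32_t tick; uint16_t opcode; } TaggedCommand;
typedef struct { uint32_t current_tick; } TickState;              /* survives across packets */

/* Tag each non-tick command with the most recent tick marker seen so far.
   Returns the number of tagged commands written to out (at most cap). */
size_t assign_ticks(TickState *st, const Command *in, size_t n,
                    TaggedCommand *out, size_t cap)
{
    assert(st != NULL);
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].opcode == OP_TICK) {      /* marker: applies to what follows */
            st->current_tick = in[i].value;
            continue;
        }
        if (written == cap) break;
        out[written].tick = st->current_tick;
        out[written].opcode = in[i].opcode;
        written++;
    }
    return written;
}
```

A packet that starts without a marker inherits the tick left over from the previous packet, which is exactly the case the notes ask about when a tick's commands overflow the MTU.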

##### Share on other sites
Quote:
 Original post by Bozebo
 Thanks for the help, now I need to delve into winsock properly - I should make a sockets wrapper first though so I can develop the server for linux and windows platforms with the same socket functions.

While I agree with you about wanting to develop your own server infrastructure instead of using something like RakNet, I think you would be better off using an existing socket wrapper like boost::asio. It handles a lot for you, and gives you cross-platform code, though be warned that you will probably have to do a lot of research before you find examples of asio that do exactly what you want.

##### Share on other sites
Quote:
 Original post by Bozebo
 but I want to learn everything from the low level to keep it as efficient as possible and so that I have a much greater depth of understanding.

As he said. He doesn't want to use an external lib.

There is nothing wrong with going low level.

##### Share on other sites
I have always used this type of format:

    typedef struct {
        U16 u16MsgType;
        U16 u16Params[2];
        U16 u16DataLength;
    } tPacket;

This way, in 64 bits, I have a complete packet header, and it will often be all I need for many packet types. The Params will often store all the data I need to pass for that packet but, if I need more, I append the data at the end and record its size in u16DataLength.

Be aware that if you're passing data in a structure, you need to obey alignment rules. So keep your 32-bit variables on 4-byte boundaries and your 16-bit variables on 2-byte boundaries. If you do this:

    typedef struct {
        unsigned char Type;
        unsigned int Data;
    } tData;

then there could be 3 bytes of "padding" between Type and Data, since Data must be on a 4-byte boundary on many platforms. Instead, you should have Data before Type.
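The padding is easy to see for yourself with `offsetof` and `sizeof`. The sketch below uses two renamed copies of the struct above (the names are mine); the exact numbers depend on the platform and compiler, so nothing here should be read as guaranteed output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct { unsigned char Type; unsigned int Data; } tDataTypeFirst;
typedef struct { unsigned int Data; unsigned char Type; } tDataDataFirst;

/* The first member of a struct always sits at offset 0. */
static_assert(offsetof(tDataDataFirst, Data) == 0, "first member is at offset 0");

/* Bytes of padding the compiler inserted between Type and Data. */
size_t padding_before_data(void)
{
    return offsetof(tDataTypeFirst, Data) - sizeof(unsigned char);
}

/* Print the layout of both orderings on the current platform. */
void print_layout(void)
{
    printf("Type first: Data at offset %zu, sizeof %zu, padding before Data %zu\n",
           offsetof(tDataTypeFirst, Data), sizeof(tDataTypeFirst),
           padding_before_data());
    printf("Data first: Type at offset %zu, sizeof %zu\n",
           offsetof(tDataDataFirst, Type), sizeof(tDataDataFirst));
}
```

On many platforms both structs still come out the same total size, because the compiler pads the end of the Data-first version so that arrays of it stay aligned. Reordering removes the gap between members, not necessarily the trailing bytes.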

##### Share on other sites

Quote:
 Be aware, if you're passing data in a structure, you obey alignment rules. So, have your 32-bit variables on 4 byte boundaries, 16-bit variable on 2 byte boundaries.

You can also just set packing rules (for example with `#pragma pack`) to get rid of that. The overall effect on performance isn't that big of a deal.

I prefer binary writers and readers for handling packets. They allow for the concept of packet construction in a very clean way. I recommend reading the article for an introduction.
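Since the linked article did not survive, here is a minimal sketch of what a binary writer and reader pair might look like. It is my own illustration, not the article's code: the type and function names are invented, values are written byte by byte in little-endian order so struct padding and host endianness never reach the wire, and strings are assumed to be sent as a 16-bit length followed by the bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t *data; size_t cap, pos; bool ok; } Writer;
typedef struct { const uint8_t *data; size_t len, pos; bool ok; } Reader;

/* Append n raw bytes; on overflow the writer is marked bad and stays bad. */
static void write_bytes(Writer *w, const void *src, size_t n)
{
    assert(w != NULL);
    if (!w->ok || w->cap - w->pos < n) { w->ok = false; return; }
    memcpy(w->data + w->pos, src, n);
    w->pos += n;
}
void write_u8(Writer *w, uint8_t v)   { write_bytes(w, &v, 1); }
void write_u16(Writer *w, uint16_t v) { uint8_t b[2] = {(uint8_t)v, (uint8_t)(v >> 8)}; write_bytes(w, b, 2); }
void write_u32(Writer *w, uint32_t v) { write_u16(w, (uint16_t)v); write_u16(w, (uint16_t)(v >> 16)); }

/* Strings go out as a 16-bit length followed by the bytes, with no terminator. */
void write_str(Writer *w, const char *s)
{
    size_t n = strlen(s);
    if (n > 0xFFFF) { w->ok = false; return; }
    write_u16(w, (uint16_t)n);
    write_bytes(w, s, n);
}

/* Consume n bytes; returns NULL and marks the reader bad if they are not there. */
static const uint8_t *read_bytes(Reader *r, size_t n)
{
    if (!r->ok || r->len - r->pos < n) { r->ok = false; return NULL; }
    const uint8_t *p = r->data + r->pos;
    r->pos += n;
    return p;
}
uint8_t  read_u8(Reader *r)  { const uint8_t *p = read_bytes(r, 1); return p ? p[0] : 0; }
uint16_t read_u16(Reader *r) { const uint8_t *p = read_bytes(r, 2); return p ? (uint16_t)(p[0] | (p[1] << 8)) : 0; }
uint32_t read_u32(Reader *r) { uint32_t lo = read_u16(r); return lo | ((uint32_t)read_u16(r) << 16); }
```

The sticky `ok` flag means a handler can read a whole message and check for truncation once at the end, instead of after every field.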

Quote:
 Now I need to worry about synchronising and controlling a stable tick rate to synchronise the nodes with.

Much easier than it sounds. The server sends a ping packet and records a timestamp. The client gets it and immediately, in the receive callback, sends a pong packet. When the server gets the pong, it takes another timestamp and subtracts the sent timestamp to get the two-way latency. When you send an update to the client, send along this latency once in a while. Perform pings every few seconds to get an updated latency. (You can perform statistical analysis on the latencies over time to grab the average if you want.) When the client gets an update packet, it can take the last known latency and simply divide it by two to get a rough one-way latency. That means that when a client gets an update packet stamped with server time t, it knows the server's clock is now at roughly t + latency / 2.

Here's an example. The server sends a ping at 200 ms and receives the client's pong at 273 ms. The latency is 273 - 200 = 73 ms. The server updates at 500 ms and sends along the 73 ms value. The client receives this packet while it thinks the server is at, let's say, 480 ms. It can calculate the current time on the server as 500 + 73 / 2 and get about 537 ms. Let's say you were using extrapolation in your server code. The client could snap the entities to the data in the packet, which reflects the state of the entities 73 / 2 ≈ 37 ms ago. To correspond directly to the server's expected positions, you'd do something like position + velocity * 37 ms to extrapolate the entities. (Naive extrapolation. It works well for objects with inertia, like spaceships.)
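The arithmetic in that example fits in two small functions. This is only a sketch: the names are invented, times are in milliseconds, and the one-way latency is assumed to be half the measured round trip.

```c
#include <assert.h>

/* Estimated server clock "now", given the server timestamp carried by an update
   and the last measured two-way latency (both in milliseconds). */
double estimate_server_time_ms(double update_timestamp_ms, double round_trip_ms)
{
    assert(round_trip_ms >= 0.0);
    return update_timestamp_ms + round_trip_ms / 2.0;
}

/* Naive linear extrapolation: advance a position by its velocity over the
   estimated one-way latency. Assumes the velocity stayed constant meanwhile. */
double extrapolate(double position, double velocity_per_ms, double round_trip_ms)
{
    return position + velocity_per_ms * (round_trip_ms / 2.0);
}
```

With the numbers from the post, `estimate_server_time_ms(500.0, 73.0)` gives 536.5, which the post rounds to 537 ms.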

If you read up on the Source networking article, you'll see that not everyone uses extrapolation, though, since it can cause snapping effects. (You're assuming a unit keeps to a constant course for the few ms until the next update.)

Oh, and I can't stress this enough: draw ghost objects showing each object's server-time state. Update the ghost exactly when you get an update, and extrapolate it to match the expected current location on the server as closely as possible. Performing simple interpolation is usually enough to correctly match the expected position. The problem occurs, as you might imagine, when the object is changing position a lot. Also, I've tested this, and performing client-side collision detection and response makes a massive visual difference compared with just letting the server deal with it. (Other than the player walking into walls and snapping back.) Ideally you should be extrapolating any interactions with the world you can.

Also I imagine for testing you're forced to add fake latency into the problem? I mean over LAN the latency is around 1 ms. How are you accomplishing that?

##### Share on other sites

Quote:
 Perform pings every few seconds to get an updated latency.

You don't need to do that. As long as there is data flowing between client and server, you can piggyback the ping inside those packets. Specifically, you can measure transmission ping, separate from server processing speed, by using a four-way timing calculation (see the other thread on this topic going right now).

##### Share on other sites

Quote:
 Quote:
  Original post by Sirisian
  Perform pings every few seconds to get an updated latency.

 You don't need to do that. As long as there is data flowing between client and server, you can piggyback the ping inside those packets. Specifically, you can measure transmission ping, separate from server processing speed, by using a four-way timing calculation (see the other thread on this topic going right now).

Yeah that's what I kind of figured. Also which thread?

##### Share on other sites

Quote:
 Yeah that's what I kind of figured. Also which thread?

Grr. Forums Search seems bustigated right now, and the best that Google comes up with is http://www.gamedev.net/topic/576527-time-synchronization-between-client-and-server-method/
That's not the thread I'm thinking of -- I posted just a few days ago. It has the formula for calculating RTT using two client and two server measurements:
1) Client sends client timestamp C1
2) Server timestamps incoming message C1 at S1
3) Server sends outgoing message at S2, so it includes C1, S1 and S2
4) Client receives that message at C2.

Now, the round-trip time on the wire is approximately (C2 - C1) - (S2 - S1); halve it for the one-way latency.
Clock offset is approximately ((S1 + S2) - (C1 + C2)) / 2
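The four timestamps translate directly into code. In the sketch below (the struct and function names are mine), the full quantity (C2 - C1) - (S2 - S1) is called the round trip and half of it the one-way latency, which assumes the path is roughly symmetric; the clock offset formula is taken as given above.

```c
#include <assert.h>

/* One timing exchange, all values in milliseconds.
   c1, c2 are read from the client clock; s1, s2 from the server clock. */
typedef struct { double c1, s1, s2, c2; } TimeSample;

/* Time spent on the wire in both directions, with server processing time removed. */
double round_trip_ms(TimeSample t)
{
    assert(t.c2 >= t.c1 && t.s2 >= t.s1);
    return (t.c2 - t.c1) - (t.s2 - t.s1);
}

/* Estimated one-way latency, assuming both directions take equally long. */
double one_way_ms(TimeSample t) { return round_trip_ms(t) / 2.0; }

/* Estimated (server clock - client clock). */
double clock_offset_ms(TimeSample t)
{
    return ((t.s1 + t.s2) - (t.c1 + t.c2)) / 2.0;
}
```

As a check with made-up numbers: a server clock 1000 ms ahead, 30 ms each way and 5 ms of server processing gives C1 = 100, S1 = 1130, S2 = 1135, C2 = 165, from which the functions recover a 60 ms round trip and a 1000 ms offset.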
