Few questions I had after reading the FAQ

6 comments, last by FurgleFingle 12 years, 4 months ago
Just finished reading the FAQ, and though it did a decent job of explaining some aspects, there are a few questions I had coming out of it.

In regards to Sending/Receiving packets
In the tutorial you send the structure you wish to send, and when you receive it you're supposed to receive into the union of packet structures. The problem I had when reading that is that a union is the size of its largest member (if I read correctly online; I did some searching to figure this out).

Imagine the following:

union {
    PacketTypeA a;
    PacketTypeB b;
} PacketType;



PacketTypeA is 100 bytes, PacketTypeB is 120 bytes.
The user sends PacketTypeA and then sends PacketTypeA again. The server doesn't get a chance to receive the data between packets reaching it and by the time it gets there it tries to read them into union PacketType.

My current understanding is that the buffer has 200 bytes in it (two 100-byte packets), but when you read sizeof PacketType you will be reading 120 bytes (sizeof PacketTypeB) from the buffer. That means that while the first PacketTypeA may be correct, the second PacketTypeA will be wrong (it will be missing 20 bytes).

Would it be better to zero-fill the union, fill in the appropriate structure inside it, and then send the whole union instead?
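To make the size question concrete, here is a minimal sketch with stand-in packet types. The 100- and 120-byte layouts and the OverreadForA helper are placeholders made up for illustration, not anything from the FAQ. It shows that sizeof on the union is at least the size of the largest member, so a read of that size swallows the start of the next packet.

```c
#include <stddef.h>

/* Placeholder packet layouts, chosen only to match the sizes in the example. */
typedef struct { unsigned char data[100]; } PacketTypeA;
typedef struct { unsigned char data[120]; } PacketTypeB;

union PacketType {
    PacketTypeA a;
    PacketTypeB b;
};

/* Bytes of the *next* packet that get consumed when a PacketTypeA is read
   with a read of sizeof(union PacketType) bytes. */
size_t OverreadForA(void)
{
    return sizeof(union PacketType) - sizeof(PacketTypeA);
}
```

With these sizes OverreadForA() comes out to at least 20, which matches the 20 missing bytes described above.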

In regards to data alignment


Tried to also find some information on this subject and hit a wall. I can't find out whether the alignment happens at compile time or at run time (assuming #pragma pack isn't given in the code). If it happens at compile time then the structure should be fine to send and receive as-is. If it happens at run time then I'm not sure how to handle this without pragma packing.

Does data alignment still become a problem when sending to the same OS? It would mostly be a Windows-based application, primarily for desktops, if that means anything. It probably doesn't (just like with endianness, my god how I hate endianness >.> )

consider the code ...

union {
...


My current understanding is that the buffer has 200 bytes in it (two 100-byte packets), but when you read sizeof PacketType you will be reading 120 bytes (sizeof PacketTypeB) from the buffer. That means that while the first PacketTypeA may be correct, the second PacketTypeA will be wrong (it will be missing 20 bytes).

Would it be better to zero-fill the union, fill in the appropriate structure inside it, and then send the whole union instead?


That is an implementation detail of how you serialize and deserialize data. Making them a union is just one of many convenient ways to simplify converting a buffer of data into a usable data type. You can pack and unpack your data however you choose.

The typical packet is of the form:
[total size][type][additional header info for packet type][packet payload data]

This format is near-universal for network communications.
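As a rough sketch of that layout, the function below writes a size field, a type field, and the payload into a byte buffer. The field widths (two bytes each), the high-byte-first order, and the name WriteFrame are all assumptions made for illustration. The extra per-type header is left out to keep it short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HEADER_SIZE 4u /* assumed: 2-byte total size + 2-byte type */

/* Writes [total size][type][payload] into out. Returns the number of bytes
   written, or 0 if the frame does not fit in cap or in a 16-bit size. */
size_t WriteFrame(uint8_t *out, size_t cap, uint16_t type,
                  const void *payload, uint16_t payloadLen)
{
    size_t total = FRAME_HEADER_SIZE + (size_t)payloadLen;
    if (total > cap || total > 0xFFFFu) {
        return 0;
    }
    out[0] = (uint8_t)(total >> 8);   /* total size, high byte first */
    out[1] = (uint8_t)(total & 0xFF);
    out[2] = (uint8_t)(type >> 8);    /* message type, high byte first */
    out[3] = (uint8_t)(type & 0xFF);
    if (payloadLen > 0) {
        memcpy(out + FRAME_HEADER_SIZE, payload, payloadLen);
    }
    return total;
}
```

Because the size comes first, the receiver can tell how many bytes belong to the message before it knows anything else about it.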



Tried to also find some information on this subject and hit a wall. I can't find out whether the alignment happens at compile time or at run time (assuming #pragma pack isn't given in the code). If it happens at compile time then the structure should be fine to send and receive as-is. If it happens at run time then I'm not sure how to handle this without pragma packing.

Does data alignment still become a problem when sending to the same OS? It would mostly be a Windows-based application, primarily for desktops, if that means anything. It probably doesn't (just like with endianness, my god how I hate endianness >.> )
[/quote]

Alignment is naturally handled by the compiler. That is one of the joys of a high level language: you generally don't need to worry about alignment.

Alignment is not something you generally need to deal with, but it is a serialization concern. Whenever you move data as a collection of bytes rather than as a data type, you can lose proper alignment. You generally need to create a new object (which the program will automatically align correctly) and then push your raw data into that object.

For example, you cannot blindly assume that four bytes are properly aligned for a 32-bit integer. You can instead add the first byte, shift left 8, add the next byte, shift left 8, add the next, shift 8, and add the final byte. Other types, such as a raw byte string, generally do not require proper alignment.
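The shift-and-add approach can be sketched like this. It assumes the sender wrote the most significant byte first (network byte order); the function name is made up for the example.

```c
#include <stdint.h>

/* Rebuilds a 32-bit value from four bytes, most significant byte first.
   It works at any address because only single bytes are ever read. */
uint32_t ReadU32BigEndian(const uint8_t *p)
{
    uint32_t value = p[0];
    value = (value << 8) | p[1];
    value = (value << 8) | p[2];
    value = (value << 8) | p[3];
    return value;
}
```

A side effect of reading byte by byte is that the result does not depend on the byte order of the machine doing the reading.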
Awesome, that really helps me out :D. Since the alignment is done by the compiler, I can stop worrying about losing alignment across different systems, since I will be compiling everything with the same compiler.

There are two additional concerns I have, and I am wondering if my thought process about how to handle them is all wrong. I've heard of cases where data gets truncated (whether because of a full buffer or something along those lines), and that can lead to fractured data. My current thinking is that if it does happen, the server should most likely just read to the end of the receive buffer to clear it out and begin anew (I figured that if a packet ends up truncated, there isn't really a way to know where the fractured packet stops and the new packet begins, so essentially the whole buffer is ruined until it's cleared again). Is this an "ok" way to handle the problem, or is there a more efficient solution?

The last concern is that the size of the standard types (int, float, etc.) varies from platform to platform (32 bit to 64 bit). Can that be "fixed", so to speak, by using types such as int32_t?

Awesome, that really helps me out :D. Since the alignment is done by the compiler, I can stop worrying about losing alignment across different systems, since I will be compiling everything with the same compiler.

You will lose alignment across the wire. The PC has the (bad?) habit of allowing you to use improperly aligned ints and floats by silently doing a bunch of extra work behind your back.

For integers you can add and shift the bytes back in as described above.

Other types, such as any SIMD data for MMX/XMM processing, will need to be deserialized into a properly aligned object.

Often this is done through a memcpy from the unaligned buffer to the properly aligned object. This will work well for any plain data.
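A minimal sketch of the memcpy approach, assuming both ends use the same float format and byte order (the function name is made up for the example):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a float out of a byte buffer at an arbitrary offset.
   memcpy has no alignment requirement on its source, and the local
   variable it copies into is aligned by the compiler. */
float ReadFloatUnaligned(const uint8_t *buffer, size_t offset)
{
    float value;
    memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

Casting buffer + offset to a float pointer and dereferencing it is the thing to avoid; the copy sidesteps that.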

You may need to use a more complex system for packing your data. For example, you will need to do something to create C++ objects that have vtables or nontrivial constructors. You may decide to pack collections of flags into bits, pack objects that only require 3 or 4 bits to reconstruct so they take half the space over the wire, etc. I've seen classes that in memory are several kilobytes get reduced down to under 100 bytes by packing them down to their actual valid range of bits.

The details of doing that are entirely up to you.
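As one illustration of packing values down to their valid range of bits, the sketch below squeezes four small fields into a single byte. The PlayerFlags struct, its fields, and their ranges are invented for the example.

```c
#include <stdint.h>

/* Hypothetical player state: four unsigned ints in memory. */
typedef struct {
    unsigned weaponSlot; /* valid range 0..15, needs 4 bits */
    unsigned team;       /* valid range 0..3, needs 2 bits */
    unsigned crouching;  /* 0 or 1 */
    unsigned firing;     /* 0 or 1 */
} PlayerFlags;

/* Packs all four fields into one byte for the wire. */
uint8_t PackPlayerFlags(const PlayerFlags *f)
{
    return (uint8_t)(((f->weaponSlot & 0x0Fu) << 4) |
                     ((f->team       & 0x03u) << 2) |
                     ((f->crouching  & 0x01u) << 1) |
                      (f->firing     & 0x01u));
}

/* Reverses PackPlayerFlags. */
PlayerFlags UnpackPlayerFlags(uint8_t b)
{
    PlayerFlags f;
    f.weaponSlot = (b >> 4) & 0x0Fu;
    f.team       = (b >> 2) & 0x03u;
    f.crouching  = (b >> 1) & 0x01u;
    f.firing     =  b       & 0x01u;
    return f;
}
```

Values outside the stated ranges are silently masked off here; a real packer would probably want to check them first.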


I've heard of cases where data gets truncated (whether because of a full buffer or something along those lines), and that can lead to fractured data. My current thinking is that if it does happen, the server should most likely just read to the end of the receive buffer to clear it out and begin anew (I figured that if a packet ends up truncated, there isn't really a way to know where the fractured packet stops and the new packet begins, so essentially the whole buffer is ruined until it's cleared again). Is this an "ok" way to handle the problem, or is there a more efficient solution?
[/quote]
TCP is a stream-based protocol. It does not split your data into messages for you; you are responsible for ensuring that you have all of it before you use it.

If you do a basic read, it can simply return whatever data has arrived so far. By default it won't block until all of the data you asked for becomes available. You are responsible for checking the return value to see how much data it gave you.

That is why it is important to include the size of your message right up front. You need to accumulate the data until that many bytes are present, then pull them off and process them. It may be that you have less than one full message. It may be that you have multiple messages. It is your job to buffer them and sort that out.
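The accumulate-then-extract idea can be sketched as follows. It assumes a 2-byte size prefix, high byte first, that counts the whole message including the prefix itself. The RecvBuffer type, the function names, and the 4096-byte capacity are made up for the example. The bytes passed to RecvBufferAppend would be whatever a single read returned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RECV_CAPACITY 4096u

typedef struct {
    uint8_t data[RECV_CAPACITY];
    size_t  used; /* number of valid bytes in data */
} RecvBuffer;

/* Appends bytes that a read returned. Returns 0 on success, -1 if full. */
int RecvBufferAppend(RecvBuffer *rb, const void *bytes, size_t len)
{
    if (len > RECV_CAPACITY - rb->used) {
        return -1;
    }
    memcpy(rb->data + rb->used, bytes, len);
    rb->used += len;
    return 0;
}

/* If a complete message is buffered, copies it to out, removes it from the
   buffer, and returns its length. Returns 0 if more data is needed, or -1
   if the size prefix is invalid or out is too small. */
int RecvBufferNextMessage(RecvBuffer *rb, uint8_t *out, size_t outCap)
{
    size_t total;
    if (rb->used < 2) {
        return 0; /* size prefix not complete yet */
    }
    total = ((size_t)rb->data[0] << 8) | rb->data[1];
    if (total < 2 || total > RECV_CAPACITY || total > outCap) {
        return -1;
    }
    if (rb->used < total) {
        return 0; /* message still arriving */
    }
    memcpy(out, rb->data, total);
    /* Slide any following bytes (the start of the next message) to the front. */
    memmove(rb->data, rb->data + total, rb->used - total);
    rb->used -= total;
    return (int)total;
}
```

After each read you would append what arrived and then call RecvBufferNextMessage in a loop until it returns 0. That covers both the "less than one message" and the "several messages at once" cases without throwing any data away.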


The last concern is that the size of the standard types (int, float, etc.) varies from platform to platform (32 bit to 64 bit). Can that be "fixed", so to speak, by using types such as int32_t?
[/quote]

On your Windows compiler the size of an int is 32 bits. It does not matter if you compile as 32-bit or as 64-bit. Very few systems modified the size of an int for their 64-bit compilers; an int is still 32 bits.

On your Windows compiler for C++, the size of a float is also 32 bits.
Let me throw in a simple example of a packet header and the data you can attach to it. No checking is done here, and I may have made a mistake or two, but it's one example.



#include <stddef.h>  // offsetof
#include <stdint.h>  // uint8_t, uint16_t
#include <stdlib.h>  // malloc, free
#include <string.h>  // memcpy, memset

// using uint16_t for type and size to keep size down
typedef struct
{
    uint16_t MsgType;      // type of message, typically defined by an enum, or #defines
    uint16_t ExtraData[2]; // Some extra data locations, for small messages that don't have a payload
    uint16_t PayloadSize;  // Length of the following data
    uint8_t  Payload[1];   // defined as 1 byte array length so we can access the data here in struct

} tPacket;

// Bytes in the header, i.e. everything before Payload. offsetof is used rather
// than sizeof(tPacket)-1 because the compiler may pad the end of the struct,
// which would make sizeof(tPacket)-1 one byte too large.
#define PACKET_HEADER_SIZE offsetof(tPacket, Payload)

tPacket *CreatePacket(uint16_t MsgType, const uint16_t *ExtraData, uint16_t PayloadSize, const void *Payload)
{
    // sizeof(tPacket) covers the header; the placeholder byte and any padding are just slack
    tPacket *Packet = malloc(sizeof(tPacket) + PayloadSize);
    if (!Packet) {
        return NULL;
    }
    Packet->MsgType = MsgType;

    if (ExtraData) {
        memcpy(Packet->ExtraData, ExtraData, sizeof(Packet->ExtraData));
    }
    else {
        // don't send uninitialized memory when there is no extra data
        memset(Packet->ExtraData, 0, sizeof(Packet->ExtraData));
    }
    Packet->PayloadSize = PayloadSize;

    if (PayloadSize) {
        memcpy(Packet->Payload, Payload, PayloadSize);
    }

    return Packet;
}

#define PLAYER_DATA_TYPE 1

typedef struct
{
    int Health;
    int Armor;
    int PositionX;
    int PositionY;

} tPlayerData;

// Assume we have tPlayerData PlayerData already filled out
// Create a PlayerData packet and send
tPacket *PlayerDataPacket;
int DataSize;

// no ExtraData here
PlayerDataPacket = CreatePacket(PLAYER_DATA_TYPE, NULL, sizeof(PlayerData), &PlayerData);

DataSize = (int)(PACKET_HEADER_SIZE + sizeof(PlayerData));
send(ServerSocket, (const char *)PlayerDataPacket, DataSize, 0);
free(PlayerDataPacket);

// When receiving, this is how I'd read the data out.
// receive() is a placeholder, not a real socket call: it stands for a helper that
// keeps calling recv() until exactly the requested number of bytes has arrived.
tPacket TempPacket;
tPacket *FullPacket;
int PacketSize = (int)PACKET_HEADER_SIZE;

receive(ServerSocket, &TempPacket, PacketSize);

// check for payload; if there, allocate large size for data and get the rest
if (TempPacket.PayloadSize > 0) {
    FullPacket = malloc(sizeof(tPacket) + TempPacket.PayloadSize);
    memcpy(FullPacket, &TempPacket, PacketSize);

    // receive the payload
    receive(ServerSocket, FullPacket->Payload, TempPacket.PayloadSize);
    PacketSize += TempPacket.PayloadSize;
}
else {
    FullPacket = malloc(sizeof(tPacket));
    memcpy(FullPacket, &TempPacket, PacketSize);
}

// Handle the packet type however you want; switch, function pointer, whatever


Thanks everyone. Frob definitely gave me a lot of answers to the questions I had, and BeerNutts (I really feel weird typing this name lol), it's awesome having code I can look at, so thanks to you both.

TCP is a stream-based protocol. It does not split your data into messages for you; you are responsible for ensuring that you have all of it before you use it.

If you do a basic read, it can simply return whatever data has arrived so far. By default it won't block until all of the data you asked for becomes available. You are responsible for checking the return value to see how much data it gave you.

That is why it is important to include the size of your message right up front. You need to accumulate the data until that many bytes are present, then pull them off and process them. It may be that you have less than one full message. It may be that you have multiple messages. It is your job to buffer them and sort that out.

[/quote]

I have examples of this here
http://webEbenezer.n...eceiveBuffer.hh and here
http://webEbenezer.n...rCompressed.hh.

This file
http://webEbenezer.n...c/Formatting.hh deals with the shifting of bytes for big and little endian systems.

The files mentioned and others are available in an archive on this page --
http://webEbenezer.n...ntegration.html .
As with endianness and alignment, don't assume the sizes of bool and size_t are the same across platforms either. They're not. :)
Always make sure you are dealing with values that are size specific, for example int16, int32, float32 etc.
You can also use float16, which is a special 16 bit version of a floating point number. It gives you less range than the 32 bit version but it uses 2 fewer bytes and can be useful for small range deltas.
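A small sketch of what size-specific fields might look like in C, using the fixed-width typedefs from stdint.h. The PlayerStateWire struct and its fields are invented for the example, and the float check is an old-style compile-time assertion rather than anything from the posts above.

```c
#include <stdint.h>

/* Layout built only from fixed-width types, so each field has the same
   size on every platform that provides these typedefs. */
typedef struct {
    int32_t health;
    int32_t armor;
    int16_t positionX;
    int16_t positionY;
    uint8_t isAlive;   /* 0 or 1; used instead of bool, whose size can vary */
} PlayerStateWire;

/* Compile-time check: the array size becomes -1, and the build fails,
   if float is not 4 bytes on this compiler. */
typedef char FloatIs32Bits[(sizeof(float) == 4) ? 1 : -1];
```

Fixed-width fields pin down the size of each value, but the compiler can still insert padding between or after them, so the fields would still be written to the wire one at a time rather than by sending the struct as a block.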
