Byte to String/Char Conversion via Network


Hi, 

I'm currently developing a C# (Unity) based client application communicating with a C based server. I have a problem translating the data sent between the client and the server. For example, my client wants to send the character 'E' to the server, so I convert it to bytes and use this code:


byte[] outStream = System.Text.Encoding.ASCII.GetBytes(message);
serverStream.Write(outStream, 0, outStream.Length);

Then, on the server side, I receive it with this code:


strncpy_s(recvbuf, messageObject.messageBuffer, result);

Both recvbuf and messageObject.messageBuffer are arrays of char. However, when I print the received char (at the first index of the array) using a simple std::cout, it always prints "=" instead of whatever character I send from the client. Any idea where the problem is?

Thanks in advance


For any such networking problems, use Wireshark or similar to capture the network packets. What was actually sent over the network?

9 minutes ago, jt.tarigan said:

I'm currently developing a C# (Unity) based client application communicating with a C based server.

When developing a new protocol, I would recommend sticking to one language until you are completely sure the protocol part works (the protocol, not the entire program; for an existing protocol, say HTTP, test against known-good client/server programs). It is also a great example of code where pretty much every line should have automated tests, because small mistakes are easy to make and networking has tonnes of edge cases and problems to solve.

One of the big ones is that TCP is not a message based protocol: Write/send and Read/recv may not transfer as many bytes as you expected. A call may send only part of the data, and one send() might get split across multiple recv() calls, even in really annoying ways, like splitting a 16-bit number in half or splitting ASCII delimiters like `\r\n` and `\r\n\r\n` (HTTP, for example). Always check the return values; never assume a call actually sent all of `outStream` or that `recvbuf` got all of it.
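
For illustration, here is a minimal sketch of what "always check the return values" can look like on the C/C++ side with plain POSIX sockets (send_all and recv_exact are just names I made up; Winsock differs slightly in types and error reporting):

#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>

// Keep calling send() until every byte has gone out, instead of assuming one
// call sent the whole buffer.
bool send_all(int sock, const char* data, std::size_t length)
{
    std::size_t sent = 0;
    while (sent < length)
    {
        ssize_t n = send(sock, data + sent, length - sent, 0);
        if (n <= 0)
            return false;                    // error or connection closed
        sent += static_cast<std::size_t>(n); // may be less than requested
    }
    return true;
}

// Read exactly `length` bytes, looping because one send() on the other side
// may arrive split across several recv() calls.
bool recv_exact(int sock, char* data, std::size_t length)
{
    std::size_t got = 0;
    while (got < length)
    {
        ssize_t n = recv(sock, data + got, length - got, 0);
        if (n <= 0)
            return false;                    // 0 = peer closed, <0 = error
        got += static_cast<std::size_t>(n);
    }
    return true;
}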

8 minutes ago, jt.tarigan said:

However, when I print the received char (at the first index of the array) using a simple std::cout, it always prints "=" instead of whatever character I send from the client.

Either you're not sending the right thing, or you messed up your C buffer management and '=' happens to be lying around in memory.

11 minutes ago, jt.tarigan said:

C based server.

13 minutes ago, jt.tarigan said:

strncpy_s(recvbuf, messageObject.messageBuffer, result);

Are you using C or C++? `strncpy_s` takes 4 parameters, and you don't appear to be checking the return value, so I'm not sure what that is doing. There is very, very little benefit to manual string handling in C++. Also, I'm pretty sure `strncpy_s` is the wrong choice here: when I am doing manual buffer management it is nearly always `memcpy` or `memmove`; after all, to use `send` and `recv` I had to know the lengths anyway.
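
As a rough sketch of what I mean (assuming `result` is the byte count that recv() returned, that both buffers are at least that large, and that copy_message is just an illustrative name):

#include <cstring>   // std::memcpy
#include <iostream>

void copy_message(char* recvbuf, const char* messageBuffer, int result)
{
    if (result <= 0)
        return;                                   // nothing received, or an error
    std::memcpy(recvbuf, messageBuffer, result);  // copy exactly the bytes that arrived
    // The data is not null-terminated, so print it with an explicit length:
    std::cout.write(recvbuf, result);
    std::cout << '\n';
}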

5 hours ago, SyncViews said:

One of the big ones is that TCP is not a message based protocol

This isn't true at all. TCP in fact ensures that the data you want to send is sent in the order it was placed on the socket. If the TCP buffer overflows, the data is split and sent in multiple packets anyway. It is true that you have to wait until all of your data has been received, because such packets may contain only part of your data. Especially when sending a long message and a short message, the short message may arrive at the end of the long message in the same receive call on the server side.

However, you should take a look at the rest of the stream you receive. The '=' character might mean someone is applying Base64 encoding to your data when it is sent from the client side ('=' is the Base64 padding character).

 

8 minutes ago, Shaarigan said:
6 hours ago, SyncViews said:

One of the big ones is that TCP is not a message based protocol

This isn't true at all. TCP in fact ensures that the data you want to send is sent in the order it was placed on the socket.

But it doesn't have any concept of individual message framing; it is only a byte stream. If you want discrete messages, you must manage that yourself, e.g. with delimiters (such as \r\n\r\n in HTTP) or by knowing the exact byte length (e.g. HTTP Content-Length for the body/payload).

Just because you call send/Write with ASCII.GetBytes("Hello World") does not mean a single recv/Read call will get exactly that. Maybe it does; maybe you get "Hello WorldWhat is your name?"; maybe you get just "Hello"; maybe you get "=Hello World" if an '=' happened to be sent just before and you didn't finish reading the previous message properly.
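
To make that concrete, a sketch of delimiter-based framing on the receiving side (the function name and the '\n' delimiter are only illustrative; a length prefix works just as well):

#include <cstddef>
#include <string>
#include <vector>

// Append whatever recv() just returned to a growing buffer, then peel off every
// complete '\n'-terminated message. Anything after the last delimiter stays in
// the buffer until more data arrives.
std::vector<std::string> extract_messages(std::string& buffer,
                                          const char* data, std::size_t length)
{
    buffer.append(data, length);
    std::vector<std::string> messages;
    std::size_t pos;
    while ((pos = buffer.find('\n')) != std::string::npos)
    {
        messages.push_back(buffer.substr(0, pos));
        buffer.erase(0, pos + 1);
    }
    return messages;
}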

True, that's why I wrote

2 minutes ago, SyncViews said:

TCP in fact ensures that the data you want to send is sent in the order

and not "message". Sockets don't know anything about messages, regardless of whether you use TCP or UDP; this is different from the HTTP protocol. However, if the OP wants to make their own protocol, then they should do so.

It's a rather complex topic, and it quickly goes down the rabbit hole of datagrams versus packets and which level of the OSI model you're working at.

In general, TCP is stream based. It's like a file where you read data, except that at any point only part of the data may have arrived, and you must account for that.

In general, UDP is datagram based. It's like blocks of data, and datagrams can be fragmented based on network rules. UDP datagrams are different from IP packets, which are also different from Ethernet packets. Like above, you must account for cases where only part of the data may have arrived.

Regardless of the TCP mechanisms, I don't see anything obviously incorrect about the C# code in the original post.

Are you sending an integer first which contains the length of the string?  C# strings won't produce a null terminator with Encoding.GetBytes, so you'll need some other way to recognize how much data to read on the receiving end.

Usually what I do is use a BinaryWriter on the C# side, then implement an equivalent class on the C++ side which understands how to read the data. BinaryWriter.Write(string) writes a VLQ-style length integer first, followed by the contents of the string, allowing the reader to know how much data to read. VLQ basically means that it uses as few bytes as it can to represent the length: short strings use a one-byte length, past a certain threshold it uses two bytes, and so on. You'd need to check the specific VLQ encoding format that C# uses when writing an equivalent reader on the C++ side. Or you could just always use a 32-bit integer with a specific endianness.
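
If you go the BinaryWriter route, the C++ reader might start with something like this, sketched under the assumption that the prefix is BinaryWriter's 7-bit encoding (the low 7 bits of each byte carry data, and a set high bit means another byte follows); read_7bit_length is just an illustrative name:

#include <cstddef>
#include <cstdint>

// Decode the 7-bit-encoded length prefix; returns the number of prefix bytes
// consumed (the string's encoded bytes, UTF-8 by default, start right after),
// or 0 if the prefix is incomplete or malformed.
std::size_t read_7bit_length(const std::uint8_t* data, std::size_t available,
                             std::uint32_t& length)
{
    length = 0;
    int shift = 0;
    for (std::size_t i = 0; i < available && shift <= 28; ++i)
    {
        std::uint8_t b = data[i];
        length |= static_cast<std::uint32_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            return i + 1;   // high bit clear: this was the last prefix byte
        shift += 7;
    }
    return 0;
}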

https://referencesource.microsoft.com/#mscorlib/system/io/binarywriter.cs,166b0572d9c907b3

https://referencesource.microsoft.com/#mscorlib/system/io/binaryreader.cs,2331740401e9cb96

https://referencesource.microsoft.com/#mscorlib/system/io/binaryreader.cs,f30b8b6e8ca06e0f

