TommiL

Terminating UTF-8 strings


TommiL    122
Hi, does anyone know whether UTF-8 strings can be terminated somehow in a byte stream, or should the string length be stated explicitly? Regards, Tommi

Erik Rufelt    5901
Just add a zero at the end, that's how strings are usually terminated. Unless I'm misunderstanding something? =)
Like if you want to write "Hello", then add a zero at the end, like "Hello\0".
UTF-8 is backwards compatible with ASCII: every ASCII string is also a valid UTF-8 string.
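
A minimal C# sketch of the "add a zero at the end" idea. The class and method names (`TerminatedWriter`, `EncodeTerminated`) are made up for illustration and are not part of .NET. It relies on a property of UTF-8: every byte of a multi-byte sequence has its high bit set, so a zero byte can only come from the character U+0000 and is safe as a terminator as long as the text itself contains no U+0000.

```csharp
using System;
using System.Text;

public static class TerminatedWriter
{
    // Hypothetical helper: encodes text as UTF-8 and appends one zero byte.
    // Assumes the text does not itself contain the character U+0000.
    public static byte[] EncodeTerminated(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);

        // A freshly allocated array is zero-filled, so the extra last
        // byte already acts as the terminator.
        byte[] result = new byte[payload.Length + 1];
        Array.Copy(payload, result, payload.Length);
        return result;
    }
}
```

For example, `TerminatedWriter.EncodeTerminated("Hello")` returns the six bytes 48 65 6C 6C 6F 00 (hex).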

TommiL    122
Thank you for the fast answer. This is the first thing I tried, but at least the .NET UTF-8 decoder keeps decoding past the zero, and you can see the '\0\0\0\0' characters in the result for some reason. Does anyone know how to decode this correctly with .NET?

TommiL    122
For now I have solved it as follows. Without the explicit check for '\0' in Decode, the string always contains maxLength characters despite the termination byte:

 

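// Requires: using System; using System.Text;

// Writes str into a fixed-size field of maxLength bytes starting at index.
public static int Encode(ref String str, byte[] bytes, int index, int maxLength)
{
    // maxLength counts bytes, and one character can take several bytes in UTF-8,
    // so drop trailing characters until the encoded form fits the field.
    while (Encoding.UTF8.GetByteCount(str) > maxLength)
    {
        str = str.Substring(0, str.Length - 1);
    }
    // Zero the whole field first so that any unused bytes act as the terminator.
    Array.Clear(bytes, index, maxLength);
    Encoding.UTF8.GetBytes(str, 0, str.Length, bytes, index);
    return index + maxLength;
}

// Reads a string back from a fixed-size field of maxLength bytes starting at index.
public static int Decode(ref String str, byte[] bytes, int index, int maxLength)
{
    // GetString decodes all maxLength bytes; zero padding comes back as '\0' characters.
    str = Encoding.UTF8.GetString(bytes, index, maxLength);
    int terminationIndex = str.IndexOf('\0');
    if (terminationIndex > -1)
    {
        // Keep only the text before the first terminator.
        str = str.Substring(0, terminationIndex);
    }
    return index + maxLength;
}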


hplus0603    11347
In this particular case, the decode function will decode all the bytes that you give it, zeros included. Thus, you have to precede the bytes of the string (when you write them out) with a length. Alternatively, you can scan for a terminating zero yourself, and tell the decoder to use only as many bytes as come before the terminator.
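
Both options can be sketched roughly like this in C#. The class and method names are hypothetical, and the 2-byte little-endian length prefix is only an assumed wire layout, not something the thread specifies. `Array.IndexOf` and `Encoding.UTF8.GetString(byte[], int, int)` are the standard .NET calls.

```csharp
using System;
using System.Text;

public static class StringFieldReader
{
    // Option 1: scan for the terminating zero, then decode only the bytes before it.
    public static string DecodeTerminated(byte[] bytes, int index, int maxLength)
    {
        int end = Array.IndexOf(bytes, (byte)0, index, maxLength);
        // No terminator found: the string fills the whole field.
        int byteCount = (end < 0) ? maxLength : end - index;
        return Encoding.UTF8.GetString(bytes, index, byteCount);
    }

    // Option 2: length prefix. Assumed layout: 2-byte little-endian byte count,
    // followed by the UTF-8 payload. Returns the index just past the payload.
    public static int EncodePrefixed(string text, byte[] bytes, int index)
    {
        int byteCount = Encoding.UTF8.GetByteCount(text);
        if (byteCount > ushort.MaxValue)
        {
            throw new ArgumentException("String too long for a 2-byte length prefix.");
        }
        bytes[index] = (byte)(byteCount & 0xFF);
        bytes[index + 1] = (byte)(byteCount >> 8);
        Encoding.UTF8.GetBytes(text, 0, text.Length, bytes, index + 2);
        return index + 2 + byteCount;
    }

    // Reads a length-prefixed string and advances index past it.
    public static string DecodePrefixed(byte[] bytes, ref int index)
    {
        int byteCount = bytes[index] | (bytes[index + 1] << 8);
        string text = Encoding.UTF8.GetString(bytes, index + 2, byteCount);
        index += 2 + byteCount;
        return text;
    }
}
```

The length prefix counts bytes rather than characters, since a single character can occupy several bytes in UTF-8. It also allows strings that contain U+0000, which the terminator approach cannot represent.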
