JohnnyCode

byte[] array containing 0 byte to string

Hi,
I have a byte[] array that contains a \0 byte. How do I convert the whole array to a string? What options do I have? Thanks.

It depends on what type of string you're talking about. If it's a C-string then you can't, because in a C-string \0 is the string terminator. If you know the length of the byte array then you could build a string - of sorts - from it, but once again, if it's a C-string then it's a futile operation, as you won't be able to do anything interesting or useful with it; as soon as any function that operates on a C-string hits that \0, it'll quite happily stop doing its thing.

[quote]Hi,
I have a byte[] array that contains a \0 byte. How do I convert the whole array to a string? What options do I have? Thanks.[/quote]


You could convert it to *two* strings. Ah, hah hah...

But seriously, \0 means a string has ended, so each zero gives you one more string, at least as long as we're talking about C-style strings. Pascal strings can contain zeros, and so can most string classes found in standard libraries.
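
To make that last point concrete, here is a minimal sketch in C# (the language the thread later turns out to be about). The class name `EmbeddedZeroDemo` and the helper `CStyleView` are made up for illustration; the helper only mimics what a C-style consumer would see and is not part of any library.

```csharp
using System;

static class EmbeddedZeroDemo
{
    // Hypothetical helper: returns the part of s before the first '\0',
    // which is what a C-style consumer would see. Returns s unchanged
    // if it contains no '\0'.
    public static string CStyleView(string s)
    {
        int zero = s.IndexOf('\0');
        return zero < 0 ? s : s.Substring(0, zero);
    }

    static void Main()
    {
        string s = "abc\0def";
        Console.WriteLine(s.Length);             // 7: the '\0' is counted
        Console.WriteLine(CStyleView(s).Length); // 3: stops at the '\0'
    }
}
```

The string object itself holds all seven characters; only code that treats \0 as a terminator sees the shorter one.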

Thanks guys,
The byte[] array represents a UTF-8 string: it contains UTF-8 text along with the content of a .jpg file (it is an HTTP request with multipart form data). Some glyphs are encoded in two bytes (culture-specific glyphs). A UTF-8 encoder outputs a correct Unicode string for me, but it is cut off where the file content starts.

Solved! Actually the string contains all bytes, including the 0 bytes. Just the visual representation gets cut; if I inspect the length of the string, it returns a higher number than the displayed content.

[quote]Solved! Actually the string contains all bytes, including the 0 bytes.[/quote]


This is a completely different problem.

Now - which language is this in?

[quote]Just the visual representation gets cut[/quote]
Nope. It creates an invalid string. The zeros in those bytes are not optional. This is not about zero-terminated strings.

[quote]if I inspect the length of the string, it returns a higher number than the displayed content.[/quote]
Of course, you end up with a corrupted chunk of memory.


The bytes here are irrelevant. UTF-encoded strings are stored as bytes, but there is no one-to-one relation between bytes and characters.


Everything else depends on the language, OS, standard library, run-time settings, environment, and more.
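
A rough sketch of both points, assuming C# and its `System.Text.Encoding.UTF8` with default settings (invalid input is replaced rather than rejected). The class name `Utf8RoundTripDemo`, the helper `SurvivesRoundTrip`, and the sample bytes are made up for illustration; the binary sample only imitates the start of a JPEG file.

```csharp
using System;
using System.Linq;
using System.Text;

static class Utf8RoundTripDemo
{
    // Hypothetical helper: true if decoding the bytes as UTF-8 and
    // encoding the result again reproduces the original bytes exactly.
    public static bool SurvivesRoundTrip(byte[] data)
    {
        string decoded = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(decoded).SequenceEqual(data);
    }

    static void Main()
    {
        // U+00E9 takes two bytes in UTF-8 but is one char in the string,
        // so the byte count and the character count differ.
        byte[] text = Encoding.UTF8.GetBytes("\u00E9\0x");
        Console.WriteLine(text.Length);                          // 4 bytes
        Console.WriteLine(Encoding.UTF8.GetString(text).Length); // 3 chars
        Console.WriteLine(SurvivesRoundTrip(text));              // True

        // Made-up binary bytes: 0xFF never appears in valid UTF-8, so
        // the decoder substitutes replacement characters for it.
        byte[] binary = { 0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10 };
        Console.WriteLine(SurvivesRoundTrip(binary));            // False
    }
}
```

Valid UTF-8 text survives the round trip, zero byte included. Arbitrary binary data generally does not, which suggests keeping the file part of such a request as bytes and decoding only the text part.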

It is C#. This is how I retrieve the string:

// Requires: using System.Text;
// Decode the bytes read from the socket as UTF-8.
UTF8Encoding utf = new UTF8Encoding();
string ansimessage = utf.GetString(message, 0, numofbytesred);

The variable message is the byte[] array I read from a TCP socket. Right after this point, ansimessage.Length reports a number higher than the length of the text displayed in the debugger. The string operations I perform afterwards operate on the whole content, not only on the displayed part, for example string.IndexOf() and so on.
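
A small self-contained sketch of the behaviour described above. The buffer is a made-up stand-in for the socket data, and the names `DecodeWithZeroDemo` and `Decode` are hypothetical; only the `UTF8Encoding.GetString(byte[], int, int)` call is taken from the post.

```csharp
using System;
using System.Text;

static class DecodeWithZeroDemo
{
    // Decodes the first count bytes of buffer as UTF-8, using the same
    // call as in the post above.
    public static string Decode(byte[] buffer, int count)
    {
        UTF8Encoding utf = new UTF8Encoding();
        return utf.GetString(buffer, 0, count);
    }

    static void Main()
    {
        // Stand-in for the socket buffer: "head", a zero byte, "tail".
        byte[] message = { 0x68, 0x65, 0x61, 0x64, 0x00, 0x74, 0x61, 0x69, 0x6C };
        string decoded = Decode(message, message.Length);

        // The zero byte becomes an ordinary '\0' char inside the string.
        Console.WriteLine(decoded.Length);                                    // 9
        // Searching works past the '\0'.
        Console.WriteLine(decoded.IndexOf("tail", StringComparison.Ordinal)); // 5
    }
}
```

The search uses `StringComparison.Ordinal` so that it compares raw char values; culture-sensitive searches may treat some characters as ignorable, which can be surprising around \0.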
