Revelation60

Loading Unicode text files

This topic is 4167 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Hi, I am having a problem reading a .txt file that contains Unicode characters. Somehow they are read incorrectly. The file looks like this:

-1.0, -0.7, -0.5, -0.3, -0.1 0.2 Czech Nová hra Nastavení Nápověda

... and there is more text after that. It is a language file that has some information at the top of the file about where to place it on the screen.

If I read it as a text file, the Unicode characters are read incorrectly. When I read the file in binary, the

fscanf (pFile, "%f, %f, %f, %f, %f\n", &MenuPos[0], &MenuPos[1], &MenuPos[2], &MenuPos[3], &MenuPos[4]);

for the first line fails. I actually don't know a way to read them in binary, because then I can't check for a separator or a line end.

What can I do?

You can read in binary data using fread().

If this is C++, there are wide-character equivalents of much of the standard functionality, e.g. wcout, wchar_t, and wstring.

But I can't detect where a number ends, because it doesn't stop reading at a blank character. Example:

20 30

can't be read with fread, for it doesn't stop at a blank character. Furthermore, I don't know the string lengths, so I can't give a range.

Quote:
Original post by Revelation60
But I can't detect where a number ends, because it doesn't stop reading at a blank character. Example:

20 30

can't be read with fread, for it doesn't stop at a blank character. Furthermore, I don't know the string lengths, so I can't give a range.


First, are you using C or C++? Second, if you're using C, you just need to fread() one byte at a time, or read a larger chunk and buffer it yourself. Alternatively, switch to C++ and use the wide-character iostream facilities.
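
A minimal sketch of the byte-at-a-time approach in C. The helper names is_separator and read_token are made up for illustration. It assumes the file was opened in binary mode ("rb") and that fields are separated by ASCII whitespace, which holds for ASCII and UTF-8 text but not for UTF-16:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: nonzero for the ASCII whitespace bytes that
 * separate fields. Bytes >= 0x80 (parts of UTF-8 multi-byte sequences)
 * never match, so they stay inside the token. */
static int is_separator(unsigned char b)
{
    return b == ' ' || b == '\t' || b == '\r' || b == '\n';
}

/* Hypothetical helper: reads the next whitespace-delimited token from fp,
 * one byte per fread() call, into buf (capacity cap, including the
 * terminating '\0'). Returns the token length in bytes, or 0 at end of
 * file. A token longer than cap - 1 bytes is split across calls. */
size_t read_token(FILE *fp, char *buf, size_t cap)
{
    unsigned char byte;
    size_t len = 0;

    assert(fp != NULL && buf != NULL && cap >= 2);

    /* Skip separators until the first byte of a token. */
    while (fread(&byte, 1, 1, fp) == 1) {
        if (!is_separator(byte)) {
            buf[len++] = (char)byte;
            break;
        }
    }

    /* Collect bytes until a separator, end of file, or a full buffer. */
    while (len > 0 && len + 1 < cap && fread(&byte, 1, 1, fp) == 1) {
        if (is_separator(byte))
            break;
        buf[len++] = (char)byte;
    }

    buf[len] = '\0';
    return len;
}
```

With the input 20 30, the first call yields "20" and the second yields "30". A numeric token can then be converted with strtol() or strtod().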

Quote:
Original post by Revelation60
If I read it byte by byte, I think I get wrong results, because Unicode can use more than one byte per character.

Yes, you read byte by byte and then parse the results yourself.
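
To illustrate what "parse the results yourself" can look like, here is a rough sketch that assembles one code point from consecutive bytes. It assumes the file is UTF-8, which the thread does not confirm, and read_utf8_char is a hypothetical name. The sketch does not reject overlong encodings or surrogate values, so it is not a full validator:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decodes one UTF-8 code point from fp.
 * Returns 1 on success, 0 at end of file, -1 on a malformed sequence. */
int read_utf8_char(FILE *fp, uint32_t *out)
{
    uint32_t cp;
    int extra; /* number of continuation bytes still expected */
    int c;

    assert(fp != NULL && out != NULL);

    c = fgetc(fp);
    if (c == EOF)
        return 0;

    /* The high bits of the lead byte give the sequence length. */
    if (c < 0x80) {
        cp = (uint32_t)c;
        extra = 0;
    } else if ((c & 0xE0) == 0xC0) {
        cp = (uint32_t)(c & 0x1F);
        extra = 1;
    } else if ((c & 0xF0) == 0xE0) {
        cp = (uint32_t)(c & 0x0F);
        extra = 2;
    } else if ((c & 0xF8) == 0xF0) {
        cp = (uint32_t)(c & 0x07);
        extra = 3;
    } else {
        return -1; /* stray continuation byte or invalid lead byte */
    }

    /* Each continuation byte has the form 10xxxxxx and adds 6 bits. */
    while (extra-- > 0) {
        c = fgetc(fp);
        if (c == EOF || (c & 0xC0) != 0x80)
            return -1;
        cp = (cp << 6) | (uint32_t)(c & 0x3F);
    }

    *out = cp;
    return 1;
}
```

For example, the two bytes 0xC3 0xA1 that encode the "á" in "Nová" decode to the single code point U+00E1.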

It would probably help if you knew which encoding your file uses. Read these:

http://en.wikipedia.org/wiki/Unicode
http://en.wikipedia.org/wiki/UTF-8

Once you understand roughly what you are trying to do, I'm sure you can find a library that does it for you (though writing your own Unicode-reading library can be entertaining too).
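
One common way to guess the encoding is to look for a byte order mark (BOM) at the start of the file. The sketch below is only a heuristic: the Encoding enum and detect_bom are hypothetical names, the stream is assumed to be seekable and opened in binary mode, and a file without a BOM (or a UTF-32 file) is simply reported as unknown:

```c
#include <assert.h>
#include <stdio.h>

typedef enum {
    ENC_UNKNOWN,  /* no BOM found; could be ASCII, UTF-8 without BOM, ... */
    ENC_UTF8,     /* EF BB BF */
    ENC_UTF16_LE, /* FF FE */
    ENC_UTF16_BE  /* FE FF */
} Encoding;

/* Hypothetical helper: inspects the first bytes of fp for a BOM and
 * leaves the read position just after it (or at offset 0 if none). */
Encoding detect_bom(FILE *fp)
{
    unsigned char b[3];
    Encoding enc = ENC_UNKNOWN;
    long skip = 0;
    size_t n;

    assert(fp != NULL);

    rewind(fp);
    n = fread(b, 1, sizeof b, fp);

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) {
        enc = ENC_UTF8;
        skip = 3;
    } else if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE) {
        enc = ENC_UTF16_LE;
        skip = 2;
    } else if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF) {
        enc = ENC_UTF16_BE;
        skip = 2;
    }

    /* Reposition so the caller starts reading at the first real byte. */
    fseek(fp, skip, SEEK_SET);
    return enc;
}
```

Knowing the result tells you whether separators such as the space and the line end are single bytes (UTF-8) or two-byte units (UTF-16), which decides how the rest of the file has to be parsed.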

Quote:
Original post by Revelation60
When I read the file in binary, the

fscanf (pFile, "%f, %f, %f, %f, %f\n", &MenuPos[0], &MenuPos[1], &MenuPos[2], &MenuPos[3], &MenuPos[4]);

for the first line fails. I actually don't know a way to read them in binary, because then I can't check for a separator or a line end.

What can I do?

Use fwscanf.
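
A hedged sketch of what that could look like for the first line of the file. read_menu_positions is a hypothetical wrapper, and the format string is the one from the question without the trailing "\n". On typical implementations fwscanf converts the file's bytes to wide characters using the current locale's encoding, so this assumes the file's encoding matches the locale and that the program has called setlocale(LC_ALL, "") before the first read:

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads the five comma-separated floats of the
 * first line through the wide-character API. The stream must not have
 * been used with byte-oriented functions (fscanf, fgets, ...) before.
 * Returns 1 if all five fields were converted, 0 otherwise. */
int read_menu_positions(FILE *fp, float pos[5])
{
    int fields;

    assert(fp != NULL && pos != NULL);

    fields = fwscanf(fp, L"%f, %f, %f, %f, %f",
                     &pos[0], &pos[1], &pos[2], &pos[3], &pos[4]);
    return fields == 5;
}
```

The text lines that follow the numbers can be read from the same stream with fgetws(), which fills a wchar_t buffer one line at a time. If the file's encoding does not match the locale, the byte-level approach suggested earlier in the thread is the safer route.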
