shalrath

Unicode text files


I know this is a fairly open-ended question, but I'm looking for a little more direction than the MSVC help files are able to lend. I'm using MSVC 7.0 (2002) and I simply wish to open a Unicode text document and read its contents into a wchar_t string. Does anyone know a quick, simple way to do this? I can expand on it later; I'm just trying to read it in at the moment. Thanks.

What kind of Unicode encoding? UTF-8 or UTF-16 (or oddball UTF-32)? UTF-16 shouldn't be an issue; you can pretty much do a binary copy into memory. UTF-8 is more complex, especially if you follow the security recommendations.

Guest Anonymous Poster
http://www.invenietis.net/Native/UTF8Encoder.htm

There is also a link to a decoder on that page, both a fast version and a slower, secure version.

There must be a simpler way... This is great, but I'm not that great a coder; it must be possible to do this with standard C++ functions... Can anyone help? I'm currently doing this:


FILE *textf = NULL;
textf = _wfopen(L"mytext.txt", L"r");

wchar_t *mystring = new wchar_t[6];

fgetws(mystring, 5, textf);



However, what I am given back will not display correctly... Can anyone see anything wrong here, or is it likely to be in my text display code?

Guest Anonymous Poster
Sorry, but there's really not a simpler way. If you want to go from UTF-8 (on disk) to UTF-16 (Windows wchar_t) or UTF-32 (Linux wchar_t), you've got to do the decoding yourself.
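
For reference, a minimal sketch of the kind of hand-written decoding described here. It is not the code from the linked page. The function name utf8_to_utf16 is made up for illustration, and it assumes a C++11 compiler for std::u16string (MSVC 7.0 predates that, so there the same logic would have to write into wchar_t or unsigned short). It rejects overlong forms, surrogates, and out-of-range values, which is roughly what the "secure" style of decoder is about.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Throws std::runtime_error on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells us how long the sequence is.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, surrogate code points, and values above U+10FFFF.
        static const char32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
        if ((len > 1 && cp < min_for_len[len]) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            throw std::runtime_error("invalid UTF-8 code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the resulting code units can be copied one for one into a wchar_t buffer.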

Guest Anonymous Poster
Yes, if you write in UTF-16, you should be able to read UTF-16 using your method, with the possible caveat of having to do endianness conversions if your files are coming from a machine with a different byte order.

Also, Windows uses a special indicator at the beginning of the file, the byte order mark (BOM), to indicate it is a Unicode text file: see here
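
A rough sketch of checking for that indicator, assuming the first few bytes of the file have already been read in binary mode. The enum TextEncoding and the function detect_bom are names invented for this example, and it needs a C++11 compiler. The byte values themselves are the standard byte order marks: EF BB BF for UTF-8, FF FE for UTF-16 little endian, FE FF for UTF-16 big endian.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names, for illustration only.
enum class TextEncoding { Unknown, Utf8, Utf16LE, Utf16BE };

// Inspect the start of a file's raw bytes for a byte order mark.
// bom_length receives the number of bytes to skip before the text starts.
// Note: a UTF-32 little-endian BOM (FF FE 00 00) also begins with FF FE;
// this sketch does not try to tell the two apart.
TextEncoding detect_bom(const std::string& bytes, std::size_t& bom_length) {
    auto b = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };

    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF) {
        bom_length = 3;
        return TextEncoding::Utf8;
    }
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE) {
        bom_length = 2;
        return TextEncoding::Utf16LE;
    }
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF) {
        bom_length = 2;
        return TextEncoding::Utf16BE;
    }
    bom_length = 0;  // no BOM found; the encoding has to be known some other way
    return TextEncoding::Unknown;
}
```

A file without a BOM is reported as Unknown, so the caller still has to decide on a default encoding.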

Even after saving my text file as UTF-16 (big or little endian), I only get junk back from the fgetws call... Microsoft uses a resource to load a Unicode text file in their Direct3D text example, but this seems like overkill. I should be able to do it with simple file I/O, right?
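
One likely cause, though the thread never confirms it: the file is opened with L"r", which is text mode, and in text mode the Microsoft CRT wide-character stream functions treat the file as multibyte text and convert it, rather than copying UTF-16 code units through unchanged. A sketch of the plain file I/O approach follows. It assumes a C++11 compiler, and the function names read_all_bytes and utf16_bytes_to_string are invented for this example. It opens the file in binary mode, reads raw bytes, and assembles code units according to the BOM.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file as raw bytes. The "rb" mode matters: it disables
// any text-mode translation of the data.
std::vector<unsigned char> read_all_bytes(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) throw std::runtime_error("cannot open file");
    std::vector<unsigned char> data;
    unsigned char buf[4096];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0)
        data.insert(data.end(), buf, buf + n);
    std::fclose(f);
    return data;
}

// Assemble UTF-16 code units from bytes, honouring a leading BOM.
// Assumption: without a BOM the data is little endian (the usual case on Windows).
std::u16string utf16_bytes_to_string(const std::vector<unsigned char>& d) {
    bool big_endian = false;
    std::size_t i = 0;
    if (d.size() >= 2) {
        if (d[0] == 0xFF && d[1] == 0xFE) { i = 2; }
        else if (d[0] == 0xFE && d[1] == 0xFF) { big_endian = true; i = 2; }
    }
    std::u16string out;
    for (; i + 1 < d.size(); i += 2) {
        unsigned hi = big_endian ? d[i] : d[i + 1];
        unsigned lo = big_endian ? d[i + 1] : d[i];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;  // a trailing odd byte, if any, is ignored
}
```

Usage would look like utf16_bytes_to_string(read_all_bytes("mytext.txt")). On MSVC 7.0, which has no char16_t, the same loop could fill a wchar_t buffer instead, since wchar_t is 16 bits wide on Windows.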
