TXT File conversion UNICODE - ANSI

6 comments, last by Codorke 18 years, 7 months ago
Hi, how can I convert a Unicode txt file to an ANSI txt file? I'm using a library that converts a PDF file to a Unicode txt file, and there are no options to change the encoding. How can I read this Unicode txt file or convert it to an ANSI txt file? (C++)
Assuming you have a UCS-2 Unicode file, and you're sure the text is just ASCII characters and the like, you can feed its contents to a function like wcstombs() (available in MSVC or non-Cygwin MinGW builds). Or, if you're feeling more adventurous, get a std::ctype<wchar_t> facet from a std::locale and use the narrow() member function of that ctype:

use_facet<ctype<wchar_t> >(loc)
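
Both suggestions above can be sketched roughly as follows. This is only a minimal illustration: the function names narrow_with_wcstombs and narrow_with_ctype are made up for this example, and it assumes the file contents have already been read into a std::wstring and contain only characters the target locale can represent.

```cpp
#include <cstddef>
#include <cstdlib>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: narrow a wide string with wcstombs().
// Returns an empty string if some character cannot be represented
// in the current C locale.
std::string narrow_with_wcstombs(const std::wstring& wide)
{
    // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return std::string();
    return std::string(buffer.data(), written);
}

// Hypothetical helper: narrow a wide string with the ctype<wchar_t>
// facet of a locale. Characters with no narrow equivalent in that
// locale are replaced by `fallback`.
std::string narrow_with_ctype(const std::wstring& wide,
                              const std::locale& loc = std::locale::classic(),
                              char fallback = '?')
{
    std::string result(wide.size(), '\0');
    if (wide.empty())
        return result;
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t> >(loc);
    facet.narrow(wide.data(), wide.data() + wide.size(), fallback, &result[0]);
    return result;
}
```

With the default "C" locale, only plain ASCII is guaranteed to survive either call; anything else depends on the locale in use.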
Lol!

Just open it in Notepad, and save it as "ansi" in the encoding type option :)

I had this same problem (I needed unicode from ansi), and notepad saved my life.
Why would you want to get away from Unicode? There are plenty of Unicode string functions. If you plan on localization, then Unicode is the way to go. In fact, on Windows your text is converted to Unicode most of the time anyway. Except in the title bar, go figure.
I want to be able to extract the text from a PDF file (which already works through the library I'm using, pdflib) and show the text in my OpenGL 3D book.

But the library only writes the text to a Unicode txt file. So I need to be able to open this Unicode txt file, read in the data, and then show it in my 3D book, all at run time of the program (so I can't use Notepad to translate it).

Edited: To show the data in my 3D book, the data has to be ANSI ASCII.
If you know what type of Unicode encoding is being used, as in the number of bytes per character, then you can read each character into a byte array of that fixed size and then copy just the first byte into a char variable (this assumes the low-order byte comes first, i.e. little-endian data). You would then use those chars for your display in OpenGL :-)

This works only for English characters, since they only need 1 byte. In Unicode, these characters have the same values as in ASCII and, depending on the Unicode encoding, the remaining bytes are just padding which can be ignored.

But if you use any characters with accents, as in most European languages, you will need to read two bytes for those particular characters. That's important if you have a word like 'résumé'.
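
A rough sketch of that byte-by-byte approach, assuming the file is 2-byte UCS-2 and has been read in binary mode into a std::string. The function name ucs2_bytes_to_narrow is made up for this example, and little-endian order is assumed unless a big-endian byte-order mark is present. In UCS-2 the second byte is zero for every code point up to U+00FF, so the sketch checks that byte instead of dropping it blindly and substitutes a fallback character when it is not zero.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert raw UCS-2 bytes to a single-byte string.
// Code points up to U+00FF are copied as one byte (their Latin-1 value);
// everything else becomes `fallback`.
std::string ucs2_bytes_to_narrow(const std::string& bytes, char fallback = '?')
{
    std::size_t pos = 0;
    bool little_endian = true;  // assumption: typical for Windows tools

    // Skip a byte-order mark if present and use it to pick the byte order.
    if (bytes.size() >= 2) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[0]);
        const unsigned char b1 = static_cast<unsigned char>(bytes[1]);
        if (b0 == 0xFF && b1 == 0xFE) {
            pos = 2;
        } else if (b0 == 0xFE && b1 == 0xFF) {
            little_endian = false;
            pos = 2;
        }
    }

    std::string result;
    result.reserve((bytes.size() - pos) / 2);

    // Walk the data two bytes at a time; a trailing odd byte is ignored.
    for (; pos + 1 < bytes.size(); pos += 2) {
        const unsigned char first = static_cast<unsigned char>(bytes[pos]);
        const unsigned char second = static_cast<unsigned char>(bytes[pos + 1]);
        const unsigned char low = little_endian ? first : second;
        const unsigned char high = little_endian ? second : first;
        result.push_back(high == 0 ? static_cast<char>(low) : fallback);
    }
    return result;
}
```

Note that Windows "ANSI" code pages such as Windows-1252 differ from Latin-1 in the 0x80 to 0x9F range, so this is only an approximation outside plain ASCII and the common accented letters.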
I found in the MSDN, under the .NET Framework Class Library, an ASCIIEncoding class [C++]. Has anyone worked with this already?

I'd like some suggestions on whether to use it or not ...
Quote: Original post by superdeveloper
Lol!

Just open it in Notepad, and save it as "ansi" in the encoding type option :)

I had this same problem (I needed unicode from ansi), and notepad saved my life.


Can this also be done from inside my application? Something like opening the file and giving some option to re-save it as ANSI encoded?

thanks

This topic is closed to new replies.