CoffeeMug

C++ locale questions


I can't seem to figure this one out. Internally, my program represents all strings as wchar_t-based objects. At some points I need to read files that may have different encodings (from a simple ASCII file to any other code page one can think of). Assuming I know the encoding of a file, how do I read it using istream_iterator into my wchar_t strings? I tried to use wifstream, but I keep bumping into problems. How do I tell wifstream which encoding the file is in? In particular, how do I do it for a simple ASCII file and for some other arbitrary encoding (i.e. what's the name of the simple ASCII locale)? Thanks.

Quote:
Original post by CoffeeMug
Assuming I know the encoding of a file, how do I read it using istream_iterator into my wchar_t strings?


It won't make a difference to the istream_iterator code; you just need to give it a different character type, in this case wchar_t. The conversion between the external (file) and internal (in-memory) representations is performed by one of the facets contained in a locale, namely std::codecvt or std::codecvt_byname. Assuming you really want to use istream_iterator to fill a std::basic_string and not istreambuf_iterator, then it's still:


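#include <fstream>
#include <iterator>
#include <string>

int main() {
    // Wide input stream: the codecvt facet of the stream's locale
    // converts the bytes in the file to wchar_t.
    std::wifstream ifs("foo");

    // .....

    // istream_iterator reads through operator>>, so whitespace is skipped;
    // istreambuf_iterator<wchar_t> would keep it.
    std::wstring s((std::istream_iterator<wchar_t, wchar_t>(ifs)),
                   std::istream_iterator<wchar_t, wchar_t>());
}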


Use the stream's "imbue" method to make it use a different locale instance.
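As a rough sketch of the imbue step, something like the following could be used. The helper name read_wide_file is made up for illustration, and it uses istreambuf_iterator so that whitespace is kept. The locale is imbued before the file is opened, so no I/O happens under the old locale.

```cpp
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper (not from the original post): reads a whole file into
// a wide string, letting the codecvt facet of `loc` convert bytes to wchar_t.
std::wstring read_wide_file(const char* path, const std::locale& loc) {
    std::wifstream ifs;
    ifs.imbue(loc);   // set the locale first, before any I/O takes place
    ifs.open(path);

    // istreambuf_iterator reads unformatted characters and keeps whitespace,
    // unlike istream_iterator, which goes through operator>>.
    // If the file could not be opened, the result is an empty string.
    return std::wstring(std::istreambuf_iterator<wchar_t>(ifs),
                        std::istreambuf_iterator<wchar_t>());
}
```

For a plain ASCII file, read_wide_file("foo", std::locale::classic()) should be enough. For other encodings you would pass a named locale; the accepted names are platform-specific, and the std::locale constructor throws std::runtime_error if the name is not supported.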

Quote:
Original post by CoffeeMug
I tried to use wifstream, but I keep bumping into problems. How do I tell wifstream which encoding the file is in?


In most cases you just assign a named locale object (constructed with the right name string) to the stream. In some other cases you'll want to explicitly build a locale object that carries a std::codecvt_byname facet, set up for a different conversion between external and internal representations. In rare cases you might need to write your own codecvt facet by deriving from std::codecvt.

You'll need to check which character encoding schemes your compiler and standard library support (for std::codecvt_byname).
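A minimal sketch of the codecvt_byname case might look like this. The helper with_named_codecvt and the typedef named_codecvt are names invented here; which locale names are accepted depends on the implementation, and the facet constructor typically throws std::runtime_error for a name it does not know.

```cpp
#include <cwchar>
#include <locale>

// Conversion facet between wchar_t (internal) and char (external).
typedef std::codecvt_byname<wchar_t, char, std::mbstate_t> named_codecvt;

// Hypothetical helper: returns a copy of `base` whose codecvt facet is
// replaced by the one belonging to the locale called `name`.
// The returned locale takes ownership of the facet allocated with new.
std::locale with_named_codecvt(const std::locale& base, const char* name) {
    return std::locale(base, new named_codecvt(name));
}
```

The resulting locale can then be passed to the stream's imbue method like any other locale. Only the code conversion changes; the other facets (number formatting, ctype and so on) still come from base.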

Quote:
Original post by CoffeeMug
In particular, how do I do it for simple ASCII file and some other arbitrary encoding (i.e. what's the name of the simple ASCII locale?)


You can create locales in different ways:

classic locale - the "C" locale (classic U.S. English, ASCII), obtained by calling std::locale::classic().

global locale - always available; a default-constructed locale is a copy of the global locale. It is typically equal to the classic locale, but that is not always the case, and it can be changed to something else.

native locale - specified as std::locale foo(""); this is the one set up by the user's OS environment, and it may not be equal to the global locale.

named locale - one set up for a specific location (plus some other attributes); giving the string "C" yields the classic U.S. English ASCII locale as well.

combined locale - a combination of any of the above.
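The kinds of locale listed above could be constructed roughly as follows. The function names are made up for illustration; the fallback to the classic locale is an assumption about what a program might want when the environment names a locale the system does not provide.

```cpp
#include <locale>
#include <stdexcept>

// Native locale, taken from the user's environment. std::locale("") throws
// std::runtime_error if the environment names an unavailable locale, so this
// sketch falls back to the classic locale in that case.
std::locale native_or_classic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Combined locale: a copy of `base`, except that the ctype category
// (character classification and code conversion) is taken from `other`.
std::locale combine_ctype(const std::locale& base, const std::locale& other) {
    return std::locale(base, other, std::locale::ctype);
}

// The remaining kinds need no helper:
//   std::locale::classic()   classic locale
//   std::locale()            copy of the current global locale
//   std::locale("C")         named locale, here the classic one by name
```

Named locales other than "C" (and usually "POSIX") are platform-specific, so any other name should be treated as something to check against your system.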

Luckily enough, Bjarne Stroustrup has put Appendix D (Locales) of the book "The C++ Programming Language, Special Edition" online. It is not as detailed as the book "Standard C++ IOStreams and Locales: Advanced Programmer's Guide and Reference", of course.
