# C++ locale questions


I can't seem to figure this one out. Internally, my program represents all strings as wchar_t based objects. At some points I need to read files that may have different encodings (from a simple ASCII file to any other code page one can think of). Assuming I know the encoding of a file, how do I read it using istream_iterator into my wchar_t strings? I tried to use wifstream, but I keep bumping into problems. How do I tell wifstream which encoding the file is in? In particular, how do I do it for a simple ASCII file and for some other arbitrary encoding (i.e. what's the name of the simple ASCII locale)? Thanks.

Quote:
 Original post by CoffeeMug
 Assuming I know the encoding of a file, how do I read it using istream_iterator into my wchar_t strings?

It won't make a difference to the istream_iterator code itself; you just need to give it a different character type, in this case wchar_t. The code conversion between the external (file) and internal (in-memory) representations is performed by one of the facets contained in a locale, std::codecvt or its named variant std::codecvt_byname. Assuming you really want to use istream_iterator to fill a std::basic_string, and not istreambuf_iterator, then it's still:

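    #include <fstream>
    #include <iterator>
    #include <string>

    int main() {
        std::wifstream ifs("foo");
        // .....
        // istream_iterator reads with operator>>, which skips whitespace by
        // default; clear skipws so spaces and newlines reach the string too.
        ifs.unsetf(std::ios_base::skipws);

        // The extra parentheses stop this being parsed as a function declaration.
        std::wstring s((std::istream_iterator<wchar_t, wchar_t>(ifs)),
                       std::istream_iterator<wchar_t, wchar_t>());
    }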

Use the stream's "imbue" member function to make it use a different locale instance, ideally before the first read from the stream.
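As a minimal sketch of that call, the helper below imbues a locale and then reads the whole file into a std::wstring. The name read_wide_file is made up for illustration, and the sketch uses istreambuf_iterator rather than istream_iterator so that whitespace is kept without touching the stream flags.

```cpp
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper (not from the original post): reads a whole file into a
// std::wstring, converting bytes to wchar_t with the codecvt facet of `loc`.
// Returns an empty string if the file cannot be opened.
std::wstring read_wide_file(const char* path, const std::locale& loc) {
    std::wifstream in;
    in.imbue(loc);  // install the locale first ...
    in.open(path);  // ... then open, so every byte is converted with it

    // istreambuf_iterator reads raw characters and does not skip whitespace.
    return std::wstring((std::istreambuf_iterator<wchar_t>(in)),
                        std::istreambuf_iterator<wchar_t>());
}
```

For a plain ASCII file, passing std::locale::classic() as the second argument should be enough; for anything else, pass a locale whose code-conversion facet matches the file.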

Quote:
 Original post by CoffeeMug
 I tried to use wifstream, but I keep bumping into problems. How do I tell wifstream which encoding the file is in?

In most cases you just imbue the stream with a named locale object (constructed with the right name string). In some other cases you'll want to explicitly build a locale object that contains a std::codecvt_byname facet, so that only the conversion between external and internal representations is set up differently. In rare cases you might need to write your own codecvt facet by deriving from std::codecvt.
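To illustrate the last and rarest option, here is a sketch of a hand-written facet. It assumes the file is ISO-8859-1 (Latin-1), chosen only because every byte then maps directly to the character with the same numeric value; the names Latin1Codecvt and read_latin1_file are invented for this example.

```cpp
#include <cstddef>
#include <cwchar>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Illustrative facet: one external byte <-> one wchar_t with the same value.
class Latin1Codecvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    // External -> internal: widen each byte to a wchar_t.
    result do_in(state_type&, const extern_type* from, const extern_type* from_end,
                 const extern_type*& from_next, intern_type* to,
                 intern_type* to_end, intern_type*& to_next) const override {
        while (from != from_end && to != to_end)
            *to++ = static_cast<unsigned char>(*from++);
        from_next = from;
        to_next = to;
        return from == from_end ? ok : partial;
    }

    // Internal -> external: narrow each wchar_t, failing above 0xFF.
    result do_out(state_type&, const intern_type* from, const intern_type* from_end,
                  const intern_type*& from_next, extern_type* to,
                  extern_type* to_end, extern_type*& to_next) const override {
        result r = ok;
        while (from != from_end) {
            if (to == to_end) { r = partial; break; }
            if (static_cast<unsigned long>(*from) > 0xFFul) { r = error; break; }
            *to++ = static_cast<char>(static_cast<unsigned char>(*from++));
        }
        from_next = from;
        to_next = to;
        return r;
    }

    int do_encoding() const noexcept override { return 1; }          // fixed width: 1 byte
    bool do_always_noconv() const noexcept override { return false; } // conversion is needed
    int do_max_length() const noexcept override { return 1; }

    // Number of external bytes that would produce at most `max` characters.
    int do_length(state_type&, const extern_type* from, const extern_type* from_end,
                  std::size_t max) const override {
        const std::size_t avail = static_cast<std::size_t>(from_end - from);
        return static_cast<int>(avail < max ? avail : max);
    }
};

// Hypothetical helper: reads a Latin-1 file into a std::wstring.
std::wstring read_latin1_file(const char* path) {
    std::wifstream in;
    // The locale takes ownership of the facet and deletes it when unused.
    in.imbue(std::locale(std::locale::classic(), new Latin1Codecvt));
    in.open(path, std::ios_base::in | std::ios_base::binary);
    return std::wstring((std::istreambuf_iterator<wchar_t>(in)),
                        std::istreambuf_iterator<wchar_t>());
}
```

The facet replaces only the code conversion; every other facet in the resulting locale still comes from the classic locale.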

You'll need to check which character encoding schemes your compiler's standard library supports for std::codecvt_byname, since the accepted names are implementation-specific.
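One way to check is simply to try a name and see whether the facet can be constructed. The sketch below assumes the behaviour of common implementations, which throw std::runtime_error for a name they do not recognise; the helper name make_codecvt_locale is invented here.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>

typedef std::codecvt_byname<wchar_t, char, std::mbstate_t> NamedCodecvt;

// Hypothetical helper: tries to build a locale that equals the classic locale
// except for a code-conversion facet looked up by name.
// Returns false (leaving `result` untouched) if the name is not recognised.
// Assumption: the implementation reports an unknown name by throwing
// std::runtime_error, as the common ones do.
bool make_codecvt_locale(const char* name, std::locale& result) {
    try {
        result = std::locale(std::locale::classic(), new NamedCodecvt(name));
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Names such as "en_US.UTF-8" are typical on POSIX-style systems, but that is only an example; apart from "C", treat every name as platform-specific and be ready for the call to fail.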

Quote:
 Original post by CoffeeMug
 In particular, how do I do it for simple ASCII file and some other arbitrary encoding (i.e. what's the name of the simple ASCII locale?)

You can create locales in different ways:

classic locale - the classic "C" locale (U.S. English, ASCII), obtained by calling std::locale::classic().

global locale - always available. A default-constructed locale is a copy of the global locale, which typically equals the classic locale, but that is not always the case and it can be changed to something else.

native locale - specified as std::locale foo(""), this is the one set up by the user's OS environment. It may not be equal to the global locale.

named locale - one set up for a specific location (plus some other attributes), selected by name. Giving the string "C" yields the classic U.S. English ASCII locale as well.

combined locale - a combination of any of the above.
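The five kinds listed above can be sketched in code as follows. The helper names native_or_classic and with_ctype_from are made up for this example, and the fallback to the classic locale is just one possible policy.

```cpp
#include <locale>
#include <stdexcept>

// Classic, global and named locales need no helper:
//   std::locale::classic()   the classic "C" locale
//   std::locale()            a copy of the current global locale
//   std::locale("C")         a locale looked up by name

// Native locale: taken from the user's environment. Constructing it can throw
// std::runtime_error if the environment names a locale that is not installed,
// so this hypothetical helper falls back to the classic locale.
std::locale native_or_classic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Combined locale: everything from `base`, except the ctype category
// (character classification and code conversion), which comes from `other`.
std::locale with_ctype_from(const std::locale& base, const std::locale& other) {
    return std::locale(base, other, std::locale::ctype);
}
```

Since the codecvt facets belong to the ctype category, with_ctype_from(std::locale::classic(), native_or_classic()) gives a locale that converts characters the way the user's environment asks but formats numbers and dates the classic way.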

Luckily, Bjarne Stroustrup has put Appendix D (Locales) of the book "The C++ Programming Language Special Edition" online. It is not as detailed as the book "Standard C++ IOStreams and Locales: Advanced Programmer's Guide and Reference", of course.
