std::string vs std::wstring

Started by
11 comments, last by benryves 15 years, 3 months ago
I want to use the wstring version of the C++ stdlib, but I have a couple of worries. First, will the c_str() function work 'correctly', i.e., will it return a pointer to a zero-terminated array of wchar_t? And will this work with the wide-character Win32 API functions?

I know that Win32 hides whether you're using the wide-character or ASCII strings by default, but I wanted to know if something like the following is correct:

std::wstring wstr = "hello, blah,blah..."; // for example
...
CreateDirectory(wstr.c_str(),NULL);

Yes and yes. std::wstring will give you a const wchar_t* from c_str(), and that is fine to pass to the wide-character Win32 functions.
But don't assume that "---" is a wchar string. In fact, it's a regular char string! L"---" creates a wide-character (wchar_t) string literal. I'm not sure exactly what you should search for to find more information, though; searching for just "C++ Unicode" returns a lot of irrelevant pages.
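To make the two answers above concrete, here is a minimal, portable sketch (no Win32 calls, so it can be compiled anywhere). The helper names wideCLength and demoWideLiteral are made up for illustration. It shows that c_str() on a std::wstring yields a zero-terminated const wchar_t*, and that the literal needs the L prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: measures the string with a C wide-string function,
// which only works because c_str() is zero-terminated.
inline std::size_t wideCLength(const std::wstring& text)
{
    const wchar_t* raw = text.c_str();   // the pointer type is const wchar_t*
    return std::wcslen(raw);
}

inline void demoWideLiteral()
{
    // Note the L prefix: without it the literal is a plain char string
    // and this initialization does not compile.
    std::wstring wstr = L"hello, blah,blah...";

    const wchar_t* raw = wstr.c_str();
    assert(raw[wstr.size()] == L'\0');        // terminator sits right after the text
    assert(wideCLength(wstr) == wstr.size()); // C-style length agrees with size()
}
```

On Windows, the same raw pointer is what you would hand to a wide-character function such as CreateDirectoryW.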
If you really want Unicode independence, you may want something like

#ifdef UNICODE
typedef std::wstring tstring;
#else
typedef std::string tstring;
#endif

tstring str = TEXT("hello, blah,blah...");
CreateDirectory(str.c_str(), NULL);

This is similar to what Win32 does for all its functions (i.e., each name is defined as the A or W variant depending on whether UNICODE is defined). TEXT is a Windows macro that prefixes the string literal with L if UNICODE is defined.

Edit:
or typedef std::basic_string<TCHAR> tstring;
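As a rough illustration of how this switch behaves, here is a portable sketch that imitates the mechanism without the Windows headers. DEMO_UNICODE, demo_tchar and DEMO_TEXT are hypothetical stand-ins for UNICODE, TCHAR and TEXT; they are not the real definitions, only an approximation of the idea.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for UNICODE / TCHAR / TEXT from the Windows headers.
// Set DEMO_UNICODE to 0 to build the narrow variant instead.
#define DEMO_UNICODE 1

#if DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_TEXT(s) L##s   // paste an L prefix onto the literal
#else
typedef char demo_tchar;
#define DEMO_TEXT(s) s      // leave the literal as a plain char string
#endif

// One typedef covers both builds, like std::basic_string<TCHAR>.
typedef std::basic_string<demo_tchar> tstring;

inline tstring makeGreeting()
{
    return DEMO_TEXT("hello");
}

inline void demoTString()
{
    tstring str = makeGreeting();
    assert(str.size() == 5);
#if DEMO_UNICODE
    assert(str == std::wstring(L"hello"));
#else
    assert(str == std::string("hello"));
#endif
}
```

The character type is fixed for the whole build, so a given binary contains either the wide or the narrow variant, never both.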
Quote:Original post by Mike nl
If you really want Unicode independence, you may want something like

*** Source Snippet Removed ***
Similar to what Win32 does for all its functions (i.e., define the A or W variant depending on the UNICODE define). TEXT is a Windows define that prefixes the string with L if UNICODE is defined

Edit:
or typedef std::basic_string<TCHAR> tstring;


I used to do that. Unfortunately, after using it for a while, I realized that it doesn't actually create Unicode independence. Instead, you end up with code that builds for either Unicode or ASCII, but never both in the same binary, and it creates more maintenance overhead than it saves.

What you want instead, if you want Unicode independence, is functions and classes that are templated on the character type (e.g., char vs. wchar_t), with std::basic_string<CharType> used throughout your code. You will need to write overloaded wrappers for some of the common functions, because the standard C functions come as separately named narrow and wide versions (C doesn't support overloading). For example:

template <class TCharType>
void doSomething(const std::basic_string<TCharType>& str)
{
   ...
}
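The overloading point can be sketched as follows. This is only an illustration under assumed names: cstrLength and lengthViaC are hypothetical, while strlen and wcslen are the standard narrow and wide C functions being wrapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical overloads that give the narrow and wide C functions one name,
// so templated code can call them without knowing the character type.
inline std::size_t cstrLength(const char* s)    { return std::strlen(s); }
inline std::size_t cstrLength(const wchar_t* s) { return std::wcslen(s); }

// Works for std::string and std::wstring alike; overload resolution picks
// the matching C function from the type of c_str().
template <class TCharType>
std::size_t lengthViaC(const std::basic_string<TCharType>& str)
{
    return cstrLength(str.c_str());
}

inline void demoOverloads()
{
    assert(lengthViaC(std::string("abc")) == 3);
    assert(lengthViaC(std::wstring(L"abcd")) == 4);
}
```

Unlike the typedef approach, both instantiations can live in the same program.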
You could also use UTF-8 strings, which means that for normal (inline) string literals you can still use the "---" notation, as long as you don't use any characters with a code > 127 in your source files. Each character with a code > 127 is encoded as a sequence of two or more bytes, each with a value > 127, but it's generally a good idea not to use those in inline strings anyway.

Most string operations, like searching for a specific character or character sequence, still work as they do with normal char arrays. Only a few things, like reversing a string, are a bit trickier, because a single character may span several bytes.
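A small sketch of what that means in practice, assuming the input is valid UTF-8. The function countCodePoints is a hypothetical helper, and the non-ASCII character is written as explicit byte escapes so the source file itself stays pure ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in valid UTF-8 by skipping
// continuation bytes, which always have the bit pattern 10xxxxxx.
inline std::size_t countCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i)
    {
        unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

inline void demoUtf8()
{
    // "cafe" with an acute accent on the e: U+00E9 is the two bytes C3 A9.
    const std::string word = "caf\xC3\xA9";

    assert(word.size() == 5);              // size() counts bytes, not characters
    assert(word.find('f') == 2);           // single-byte search works as usual
    assert(word.find("\xC3\xA9") == 3);    // so does searching for a byte sequence
    assert(countCodePoints(word) == 4);    // four characters in five bytes
}
```

Reversing the bytes of word naively would swap C3 and A9 and produce invalid UTF-8, which is why reversal needs to work on whole sequences.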

This is what glib/gtk uses, and it works really well.
Quote:Original post by Rydinare
What you want, instead, if you want Unicode independence is methods and classes that are templated based on char type (e.g.: char vs. wchar_t) and then you make use of std::basic_string<CharType> in all of your code.

I wrote something for char type "independence" after stumbling upon partial template specialization.

There's probably a much easier way to achieve it, but it gave me some practice (and more headaches with the Boost preprocessor library) anyway. It does require a macro around the Windows function of interest, which unfortunately kills IntelliSense in VS and tends to clutter up the code if you're doing a lot of string manipulation with the Windows functions.

EDIT: Forum seems to be eating the backslashes in the macros so I've pasted it here

The IfThenElse class is from the Josuttis book:

// copied from C++ Templates: A Complete Guide
//
template<bool cond, class TrueArg, class FalseArg>
struct IfThenElse;

template<class TrueType, class FalseType>
struct IfThenElse<true, TrueType, FalseType>
{
	typedef TrueType type;
};

template<class TrueType, class FalseType>
struct IfThenElse<false, TrueType, FalseType>
{
	typedef FalseType type;
};
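For readers who have not seen this idiom, here is one way such a selector could be used to pick a narrow or wide implementation from the character type. This is a guess at the general idea, not the WIN_FUNC macro from the post (that code is not shown here). IsWide, NarrowApi, WideApi and apiLength are hypothetical names, and the C length functions stand in for the A/W Windows functions so the sketch stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Same selector as above, repeated so this sketch compiles on its own.
template<bool cond, class TrueArg, class FalseArg> struct IfThenElse;
template<class T, class F> struct IfThenElse<true, T, F>  { typedef T type; };
template<class T, class F> struct IfThenElse<false, T, F> { typedef F type; };

// Hypothetical trait: true only for wchar_t.
template<class Elem> struct IsWide          { static const bool value = false; };
template<>           struct IsWide<wchar_t> { static const bool value = true;  };

// Hypothetical stand-ins for the "A" and "W" flavours of an API function.
struct NarrowApi { static std::size_t length(const char* s)    { return std::strlen(s); } };
struct WideApi   { static std::size_t length(const wchar_t* s) { return std::wcslen(s); } };

template<class Elem>
std::size_t apiLength(const std::basic_string<Elem>& s)
{
    // Choose the implementation at compile time from the character type.
    typedef typename IfThenElse<IsWide<Elem>::value, WideApi, NarrowApi>::type Api;
    return Api::length(s.c_str());
}

inline void demoSelector()
{
    assert(apiLength(std::string("abc")) == 3);
    assert(apiLength(std::wstring(L"ab")) == 2);
}
```

Only the selected branch is ever named inside apiLength, so the narrow instantiation never tries to pass a const char* to the wide function.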


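An example usage:

template<class Elem>
void GetModuleDirectory(const std::basic_string<Elem>& module, std::basic_string<Elem>& dir)
{
    Elem buf[MAX_PATH] = {0};
    // The handle is named hModule because a local variable cannot reuse
    // the name of the "module" parameter.
    HMODULE hModule = WIN_FUNC(GetModuleHandle)(module.c_str());
    if(hModule)
    {
        // Full path of the module, then strip the file name and
        // make sure the directory ends with a backslash.
        WIN_FUNC(GetModuleFileName)(hModule, buf, MAX_PATH);
        WIN_FUNC(PathRemoveFileSpec)(buf);
        WIN_FUNC(PathAddBackslash)(buf);
        dir = buf;
    }
}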
Wow, thanks to all for the replies! In fact, I intend to just use Unicode rather than make the code compatible with ASCII or anything else, so "independence" isn't too high on my list of requirements. The only real help I needed was with the difference between std::string and std::wstring, and what I need to watch out for when using the latter instead of the former.
Interesting. Can you show the code for WIN_FUNC (I assume a macro) and how you tied in the IfThenElse template? Thanks.
It's the first link (pastebin) in my post above.

This topic is closed to new replies.
