n00body

gcc unicode

I'm currently trying to compile a wxWidgets app in Code::Blocks through MinGW, and it keeps complaining about string conversions between the Unicode wxString and the non-Unicode char[n] strings. What is the compiler flag needed to make GCC compile my executables with Unicode support?

Quote:
Original post by n00body
What is the compiler flag needed to make GCC compile my executables with Unicode support?


GCC implements the C++ language, which has no concept of representational codesets. For that reason, there isn't a flag available to let the compiler choose between the representational codesets your runtime users use.

I suspect the problem is that wxWidgets' wxString may be a wide-character string type, so there is probably no implicit conversion to a char-based multibyte C-string type.

I'm not familiar with wxWidgets, but I'm willing to bet that the wxString API provides a means to convert its contents into something that will fit in a char-based multibyte C-string container. If wxString is a container for Unicode data, the kind of Unicode that fits in a char-based multibyte C-string container is called UTF-8, so look for some member function of wxString that has a variant of that in its name (e.g. utf8(), toUtf8(), etc.).

--smw
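
To illustrate why UTF-8 is the form of Unicode that fits in a char-based string, here is a minimal sketch of the encoding itself. It uses only the standard library (C++11) and has nothing to do with wxWidgets; the names AppendUtf8 and ToUtf8 are made up for this example, and surrogate code points are not rejected.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a wxWidgets function): append one Unicode code
// point to a char-based string, encoded as one to four UTF-8 bytes.
void AppendUtf8(std::string& out, char32_t cp)
{
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    if (cp < 0x80) {
        // 7-bit ASCII is stored unchanged in a single byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encode a whole string of code points as UTF-8.
std::string ToUtf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text)
        AppendUtf8(out, cp);
    return out;
}
```

ASCII text comes out byte-for-byte identical, which is why plain English strings appear to survive almost any conversion; everything else becomes a multi-byte sequence.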

If you really have to convert strings to char* and can't just change all the code to correctly use Unicode: wxString::mb_str() will give you a wxCharBuffer with a multi-byte representation of the string, defaulting to whatever character set wcstombs() uses; wxString::mb_str(wxConvUTF8) should give you a UTF-8 version. wxString::ToAscii() will give you a wxCharBuffer with an ASCII representation, and undefined behaviour if the string has non-ASCII characters.

(The wxCharBuffer is just like a char* which frees its memory when it goes out of scope, unless you use wxCharBuffer::release() to steal the pointer.)
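
For reference, this is roughly what a wcstombs()-based conversion looks like using only the standard library. It is a sketch, not wxWidgets code: WideToMultibyte is a hypothetical name, and the std::vector plays the part of the self-freeing buffer described above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, standard library only: convert a wide string to a
// multibyte string in the encoding of the current C locale.
// Returns false if some character has no representation in that encoding.
// Conversion stops at the first embedded null character, if there is one.
bool WideToMultibyte(const std::wstring& wide, std::string& out)
{
    // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return false;
    // wcstombs() reports the byte count without the terminator.
    out.assign(buffer.data(), written);
    return true;
}
```

The result depends on the active locale: until the program calls std::setlocale(), it runs in the "C" locale, where only the basic character set is guaranteed to convert.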

Quote:
Original post by Excors
If you really have to convert strings to char* and can't just change all the code to correctly use Unicode: wxString::mb_str() will give you a wxCharBuffer with a multi-byte representation of the string, defaulting to whatever character set wcstombs() uses;


Hmmm, you might be SOL on that, since (IIRC) wcstombs() doesn't work under MinGW32 (or Cygwin).

--smw

I don't know if this will help but I often use a string convert function I made:

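#include <cstddef>
#include <string>

// Length of a null-terminated string of any character type.
template <typename T>
std::size_t StrLen(const T* Str)
{
    std::size_t Len;
    for (Len = 0; *Str++; ++Len);
    return Len;
}

// Copies Str element by element, casting each character to Dest.
// No encoding conversion is performed.
template <typename Dest, typename Src>
std::basic_string<Dest> StrConv(const std::basic_string<Src>& Str)
{
    std::basic_string<Dest> Ret(Str.length(), (Dest) '\0');
    for (std::size_t i = 0; i < Ret.size(); i++)
        Ret[i] = (Dest) Str[i];
    return Ret;
}

// Same as above, for a null-terminated C-string.
template <typename Dest, typename Src>
std::basic_string<Dest> StrConv(const Src* Str)
{
    std::basic_string<Dest> Ret(StrLen(Str), (Dest) '\0');
    for (std::size_t i = 0; i < Ret.size(); i++)
        Ret[i] = (Dest) Str[i];
    return Ret;
}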



EDIT:
Example:
wxString s = StrConv<wchar_t>("hello world").c_str();

Quote:
Original post by meisawesome
I don't know if this will help but I often use a string convert function I made:
*** Source Snippet Removed ***


Dude, that's seriously broken.

You're using a C-cast to convert the bit pattern stored in a char into a bit pattern stored in a wchar_t.

If you have a normal everyday garden-variety multibyte C-string you will find things will work swimmingly for typical North American English usage, since mapping 7-bit ASCII to, say, UTF-16 will appear to work on all compilers I've seen on the Intel architecture. It will just happen to work, but not by design.

It will not work under any other circumstances.

At the very least rather than C-casting the bit pattern from one container to another, you should use the current locale's ctype facet's widen() function, but that's probably not what you want either. You want to use mbstowcs().

--smw
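
A minimal sketch of the mbstowcs() approach suggested above, using only the standard library. MultibyteToWide is a hypothetical name chosen for this example, and the input is assumed to be in the encoding of the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: decode a multibyte C-string into a wide string using
// the current C locale, instead of casting each char to wchar_t.
// Returns false if the input is not valid in the locale's encoding.
bool MultibyteToWide(const char* text, std::wstring& out)
{
    assert(text != nullptr);
    // A multibyte string never decodes to more wide characters than it has
    // bytes, so strlen() + 1 elements are always enough.
    std::vector<wchar_t> buffer(std::strlen(text) + 1);
    const std::size_t length =
        std::mbstowcs(buffer.data(), text, buffer.size());
    if (length == static_cast<std::size_t>(-1))
        return false;
    // mbstowcs() reports the character count without the terminator.
    out.assign(buffer.data(), length);
    return true;
}
```

Unlike the cast, this decodes a multi-byte sequence into a single wide character when the locale's encoding calls for it. A program would normally call std::setlocale(LC_ALL, "") once at startup so that the user's locale, rather than the default "C" locale, drives the conversion.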
