Daher_Q3A

wchar_t all the time... is this wise?

This isn't related to game programming; it's more of a general programming issue. I am writing a small low-level system and am now using wide-character strings throughout: all my strings are stored and passed as wide-character strings. However, I wrote the code in C++ without regard for backward compatibility with old C null-terminated strings (char *). Do you think this is unwise? Now that I have something like 10,000 lines of code with 'wchar_t' everywhere, I thought I'd ask for a second opinion, and a third... Here is a little example of my code:
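struct name_equals_to
    {
    // True when the entry's specialName matches name; compares at most MaxFilename characters.
    // Parameter names avoid a leading underscore plus capital, which C++ reserves for the implementation.
    bool operator ()( const format::file_info& fileInfo, const wchar_t * name ) const
        {
        return ::wcsncmp( fileInfo.specialName, name, format::constant::MaxFilename ) == 0;
        }
    };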



Of course, I've written the file_info struct with specialName as a wchar_t array... My code will likely run on Windows and Linux, and both support wide-character strings... Does anyone think I should rewrite the code using a macro for char/wchar_t replacement?

Quote:
Original post by Daher_Q3A
My code will likely run on Windows and Linux, and both support wide-character strings... Does anyone think I should rewrite the code using a macro for char/wchar_t replacement?

Perhaps not a macro, but a typedef certainly wouldn't hurt. And it could help if you ever need just normal old char behavior.
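
A minimal sketch of the typedef idea. The names `char_type`, `string_type`, `compare_n`, and `names_equal` are hypothetical and do not come from the original code; the overloads are one assumed way to let the rest of the code follow the typedef:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical project-wide character type. Changing this one line to
// `char` switches the helpers below to narrow strings.
typedef wchar_t char_type;
typedef std::basic_string<char_type> string_type;

// Overloads pick the matching C comparison function for either type.
inline int compare_n(const char* a, const char* b, std::size_t n)
{
    return std::strncmp(a, b, n);
}

inline int compare_n(const wchar_t* a, const wchar_t* b, std::size_t n)
{
    return std::wcsncmp(a, b, n);
}

// Compare at most max_len characters of two null-terminated names.
inline bool names_equal(const char_type* a, const char_type* b, std::size_t max_len)
{
    assert(a != 0 && b != 0);
    return compare_n(a, b, max_len) == 0;
}
```

String literals would still need an `L` prefix (or a small literal macro), so a typedef alone does not make the switch free.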

CM

I think it really depends on the intended audience of the application. I mean, you don't really need to use the wide character set unless you are going to support different languages. If you do plan on supporting multiple languages, you don't really have much choice so the use of wide characters everywhere is fine.

Quote:
Original post by TheBlackJester
I think it really depends on the intended audience of the application. I mean, you don't really need to use the wide character set unless you are going to support different languages. If you do plan on supporting multiple languages, you don't really have much choice so the use of wide characters everywhere is fine.


Yes, exactly. The application is initially going to support English, Dutch, and Chinese... but my idea was that Win9x, for example, would no longer be a target...

And this way, I thought the app code would be cleaner...

Nothing wrong with wide characters. Picking a native string type is an over-complicated and exceptionally tedious business. wchar_t gives you support for UTF-16, which is the native format of the Windows NT family and covers pretty much every character you might want to represent. Even a 16-bit wchar_t is a halfway house, though: a single unit can't hold EVERY character. Characters outside the Basic Multilingual Plane take two 16-bit units, and a complete Unicode code point only fits in one unit if that unit is 32 bits wide.
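
To make the "halfway house" point concrete: in UTF-16, a code point above U+FFFF does not fit in one 16-bit unit and is stored as two units (a surrogate pair). A minimal sketch, where `encode_utf16` is a hypothetical helper name and `unsigned short` is assumed to be 16 bits:

```cpp
#include <cassert>
#include <vector>

// Encode one Unicode code point as UTF-16 code units.
// Valid input is 0..0x10FFFF, excluding the surrogate range itself.
inline std::vector<unsigned short> encode_utf16(unsigned long cp)
{
    assert(cp <= 0x10FFFFul);
    assert(!(cp >= 0xD800ul && cp <= 0xDFFFul));

    std::vector<unsigned short> units;
    if (cp < 0x10000ul) {
        // Basic Multilingual Plane: one unit is enough.
        units.push_back(static_cast<unsigned short>(cp));
    } else {
        // Supplementary planes: split the 20 remaining bits over two units.
        cp -= 0x10000ul;
        units.push_back(static_cast<unsigned short>(0xD800ul + (cp >> 10)));
        units.push_back(static_cast<unsigned short>(0xDC00ul + (cp & 0x3FFul)));
    }
    return units;
}
```

For example, U+20AC (the euro sign) comes out as the single unit 0x20AC, while U+1D11E (the musical G clef) comes out as the pair 0xD834, 0xDD1E.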

Support for Windows 9x is not going to be terribly important for long, but you can either use the "Microsoft Layer for Unicode" or convert strings yourself using the myriad of portable UTF-to-MBCS functions available. Linux is slightly trickier: its convention for Unicode is UTF-8, basically a multibyte encoding. Thus, to build on Linux you will need to convert your strings using the aforementioned functions wherever they cross into the system's char-based APIs.

Conversion is much easier if you either find or write a string class to handle it. This is obviously not the only take on string handling, but I have found it to be effective.
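
As a rough illustration of what such a conversion layer has to do, here is a hand-rolled wide-to-UTF-8 sketch. The names `append_utf8` and `to_utf8` are hypothetical, and the code assumes the wide string holds UTF-16 where wchar_t is 2 bytes and UTF-32 where it is 4; it does not validate input or reject unpaired surrogates:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append one code point to out as 1 to 4 UTF-8 bytes.
inline void append_utf8(std::string& out, unsigned long cp)
{
    assert(cp <= 0x10FFFFul);
    if (cp < 0x80ul) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800ul) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000ul) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a wide string to UTF-8, joining surrogate pairs when present.
inline std::string to_utf8(const std::wstring& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);
        if (cp >= 0xD800ul && cp <= 0xDBFFul && i + 1 < in.size()) {
            unsigned long lo = static_cast<unsigned long>(in[i + 1]);
            if (lo >= 0xDC00ul && lo <= 0xDFFFul) {
                // High + low surrogate: rebuild the supplementary code point.
                cp = 0x10000ul + ((cp - 0xD800ul) << 10) + (lo - 0xDC00ul);
                ++i;
            }
        }
        append_utf8(out, cp);
    }
    return out;
}
```

A string class could call something like `to_utf8` at the boundary where text leaves the program, keeping wchar_t everywhere else.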

Quote:
Original post by zoggo
Nothing wrong with wide characters. Picking a native string type is an over-complicated and exceptionally tedious business. wchar_t gives you support for UTF-16, which is the native format of the Windows NT system and covers pretty much every character you might want to represent. Even wchar_t is a halfway house though: it doesn't include EVERY character; the current complete Unicode character is 32 bits wide.


Well, that advice only applies on the PC, which you might have meant without specifying. On OS X, wchar_t is 4 bytes and holds UCS-4, not UTF-16. Only use wchar_t if you're programming for the PC only.

Otherwise use your own type.

Cheers
Chris

Quote:
Original post by Erzengeldeslichtes
Quote:
Original post by chollida1
On OS X wchar_t is 4 bytes and is of the type UCS4 not UTF-16.

So in such a situation, does L"Blah" become 4 bytes per character or 2?


On OS X it becomes 4 bytes per character, and on Windows it's 2. On Linux it's up to the compiler and platform (with GCC it is normally 4).
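
A quick way to check this on a given platform is to ask the compiler. This is only a sketch; `wide_char_bytes` and `blah_literal_bytes` are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>

// Bytes used by one wide character on the current platform.
inline std::size_t wide_char_bytes()
{
    return sizeof(wchar_t);
}

// Total bytes occupied by the literal L"Blah", terminator included.
inline std::size_t blah_literal_bytes()
{
    // The literal has 5 elements: B, l, a, h and the trailing L'\0'.
    assert(sizeof(L"Blah") == 5 * sizeof(wchar_t));
    return sizeof(L"Blah");
}
```

With a 2-byte wchar_t the literal takes 10 bytes; with a 4-byte wchar_t it takes 20.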

Cheers
Chris
