Platform-independent Unicode-to-ASCII conversion

4 comments, last by psykr 20 years, 5 months ago
Since the C++ streams take filenames as narrow char strings, and my entire application is in Unicode, I want to be able to pass a function a wide filename, have the function convert it to a narrow filename, and then pass that to the stream. I've tried looking it up on the internet, but all I get is WideCharToMultiByte(), which is Win32-specific, and wcstombs(), which I believe is also Win32-specific. Is there a standard way to convert between Unicode and ASCII? I have some code that I found that's "standard," but I'm really not sure how to use it:
	std::wifstream.imbue( std::locale( "us" ) );
	std::basic_ios::narrow( pwszFilename, 0 ); 
It depends on how the Unicode string you've got is represented.

UTF-8? Copy, omitting all bytes with the highest bit set. UTF-7 is already 7-bit, so there you would have to skip the escaped runs that begin with '+' instead.

Standard 16- or 32-bit words? Copy, omitting all words with a value > 127.

That'll strip off the FF FE bytes (the byte-order mark), if you've got them. And yes, you've gotta strip out non-ASCII characters if you've only got ASCII filenames. Better idea: write a wrapper that uses different implementations for Unicode and non-Unicode OSes.
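For what it's worth, the "copy and omit" approach described above could look roughly like this. It is only a sketch: the function names strip_wide_to_ascii and strip_utf8_to_ascii are made up for illustration, it assumes a C++11 compiler, and it silently drops every non-ASCII character rather than reporting it.

```cpp
#include <string>

// Hypothetical helper: copy a string of 16- or 32-bit words,
// omitting every word with a value above 127.
std::string strip_wide_to_ascii(const std::wstring& wide)
{
    std::string ascii;
    ascii.reserve(wide.size());
    for (wchar_t wc : wide) {
        // wchar_t may be signed on some platforms, so compare as unsigned.
        if (static_cast<unsigned long>(wc) <= 127UL) {
            ascii.push_back(static_cast<char>(wc));
        }
    }
    return ascii;
}

// Hypothetical helper: copy a UTF-8 string, omitting every byte
// that has its highest bit set.
std::string strip_utf8_to_ascii(const std::string& utf8)
{
    std::string ascii;
    ascii.reserve(utf8.size());
    for (unsigned char byte : utf8) {
        if (byte < 0x80) {
            ascii.push_back(static_cast<char>(byte));
        }
    }
    return ascii;
}
```

Since a byte-order mark is above 127 in the wide case and is made entirely of high-bit bytes in the UTF-8 case, both loops drop it along with everything else that isn't ASCII. The catch is that two different filenames can collapse into the same stripped name.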
Boost has a library under review for this, I believe; check the Yahoo Groups portion of Boost (which I don't have the URL for at the moment).
*scratches head* That first post left me utterly confused... I was looking for a standard C++ way to convert from Unicode to... not-Unicode? I'm not sure what it's called, but just the normal text that you use in C++. ASCII, then. Anyway, the point was that I don't want to manipulate the string by myself, or use a wrapper for OS-specific implementations (if possible).

And why don't I use Boost? I'm just writing this to learn, and I feel that if I take on another great library before I know how to use the STL and even normal C++ well, I won't be using Boost to its full potential, which leads to code that looks horrifying when looked back on.
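One route that stays inside the standard library is the std::ctype<wchar_t> facet from <locale>, which is the same facet that std::basic_ios::narrow() consults for a single character. The following is a rough sketch, not a definitive answer: narrow_with_locale is a hypothetical name, the default of std::locale::classic() and the '?' fallback are assumptions, and which characters actually survive depends on the locale and the implementation.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: narrow a wide string one character at a time
// through a locale's ctype facet. Characters the locale cannot narrow
// are replaced by `fallback`.
std::string narrow_with_locale(const std::wstring& wide,
                               const std::locale& loc = std::locale::classic(),
                               char fallback = '?')
{
    if (wide.empty()) {
        return std::string();
    }
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t> >(loc);

    // One output char per input wchar_t, so the sizes match exactly.
    std::string narrow(wide.size(), '\0');
    facet.narrow(wide.data(), wide.data() + wide.size(), fallback, &narrow[0]);
    return narrow;
}
```

With something like that in place, opening a stream would be along the lines of std::ifstream in(narrow_with_locale(pwszFilename).c_str());. Note that this maps each wchar_t to exactly one char, so it never produces multibyte output; if that is needed, std::wcstombs() from <cstdlib> is the standard C function for it.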
*bump* Conversion from UTF-16 to ANSI code page?
Just look up how UTF-16 is encoded and how ASCII is encoded, then write a conversion routine for the intersection of the two and try it out.

C++ knows nothing about character encoding beyond the built-in types (char, wchar_t, int).
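As a rough illustration of a routine for that intersection: UTF-16 and ASCII agree on code units 0 through 127, so one way to do it is to accept only those and refuse everything else. This is a sketch under assumptions: utf16_to_ascii is a hypothetical name, it uses std::u16string (C++11) to hold the UTF-16 code units, and it reports failure instead of dropping characters.

```cpp
#include <string>

// Hypothetical helper: convert UTF-16 to ASCII only when every code unit
// lies in the range the two encodings share (0..127).
// Returns false and leaves `ascii` untouched otherwise.
bool utf16_to_ascii(const std::u16string& utf16, std::string& ascii)
{
    std::string result;
    result.reserve(utf16.size());
    for (char16_t unit : utf16) {
        // Anything above 0x7F is outside ASCII. That covers surrogate
        // halves and the byte-order mark as well.
        if (unit > 0x7F) {
            return false;
        }
        result.push_back(static_cast<char>(unit));
    }
    ascii.swap(result);
    return true;
}
```

Refusing the whole string means the caller finds out when a filename cannot be represented, rather than quietly opening a differently named file.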