psykr

platform-independent unicode-ascii conversion

Since the C++ streams use standard chars to pass filenames, and my entire application is in Unicode, I want to be able to pass a function a wide filename and then, within the function, convert it to a narrow filename, which is then passed to the stream. I've tried looking it up on the internet, but all I get is WideCharToMultiByte(), which is Win32-specific, and also wcstombs(), which I believe is also Win32-specific. Is there a standard way to convert between Unicode and ASCII? I have some code that I found that's "standard," but I'm really not sure how to use it:
    std::wifstream file;
    file.imbue( std::locale( "us" ) );           // imbue() is called on a stream object
    char c = file.narrow( pwszFilename[0], 0 );  // narrow() converts one character at a time
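
If I've understood it right, the kind of thing it might expand to is something like this; narrow_string and the '?' fallback are just my guesses, not anything from the snippet above:

    #include <fstream>
    #include <locale>
    #include <string>

    // Narrow a wide string character by character using the ctype facet of
    // the given locale; characters with no narrow form become '?'.
    std::string narrow_string( const std::wstring& wide,
                               const std::locale& loc = std::locale() )
    {
        const std::ctype<wchar_t>& ct =
            std::use_facet< std::ctype<wchar_t> >( loc );
        std::string out;
        out.reserve( wide.size() );
        for ( std::wstring::size_type i = 0; i < wide.size(); ++i )
            out += ct.narrow( wide[i], '?' );
        return out;
    }

    // Usage: std::wifstream file( narrow_string( pwszFilename ).c_str() );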

Depends on how the Unicode string you've got is represented.

UTF-8 or UTF-7? Copy it, omitting all bytes with the highest bit set.

Standard 16- or 32-bit code units? Copy it, omitting all units with a value > 127.

That'll strip off the FF FE byte-order mark, if you've got one. And yes, you've got to strip out non-ASCII characters if you've only got ASCII filenames. A better idea: write a wrapper that uses different implementations for Unicode and non-Unicode OSes.
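
Roughly, for the 16-/32-bit case, something like this (assuming wchar_t holds the code units; strip_to_ascii is just a name I made up):

    #include <string>

    // Keep only code units in the ASCII range (0-127); everything else,
    // including a 0xFEFF byte-order mark, is simply dropped.
    std::string strip_to_ascii( const std::wstring& in )
    {
        std::string out;
        out.reserve( in.size() );
        for ( std::wstring::size_type i = 0; i < in.size(); ++i )
            if ( in[i] <= 127 )
                out += static_cast<char>( in[i] );
        return out;
    }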

*scratches head* That first reply left me utterly confused... I was looking for a standard C++ way to convert from Unicode to... not-Unicode? I'm not sure what it's called, but just the normal text that you use in C++. ASCII, then. Anyway, the point was that I don't want to manipulate the string myself, or use a wrapper for OS-specific implementations (if possible).

And why don't I use Boost? I'm just writing this to learn, and I feel that if I take on another big library before I know how to use the STL and even normal C++ well, I won't be using Boost to its full potential, which leads to code that looks horrifying when I look back on it.

Just look up how UTF-16 is encoded and how ASCII is encoded, then write a conversion routine for the intersection of the two and try it out.

C++ itself knows nothing about character encodings beyond the built-in types (char, wchar_t, int).
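
Something along these lines, assuming wchar_t holds your UTF-16 code units (the name utf16_to_ascii is just illustrative); it succeeds only for strings that fall entirely in the intersection of the two, i.e. pure ASCII:

    #include <string>

    // Convert to ASCII; returns false if any code unit lies outside the
    // ASCII range, i.e. outside the intersection of UTF-16 and ASCII.
    bool utf16_to_ascii( const std::wstring& in, std::string& out )
    {
        out.clear();
        for ( std::wstring::size_type i = 0; i < in.size(); ++i )
        {
            if ( in[i] > 127 )
                return false;            // not representable in ASCII
            out += static_cast<char>( in[i] );
        }
        return true;
    }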
