Gamer Gamester

How to store "unsigned long" character in a string?


I'm using FreeType to convert fonts to my own bitmap font format for use with OpenGL. FreeType returns character codes as unsigned longs. My bitmap font format inherits from my texture atlas format, which contains a std::map keyed by a std::string id, mapping to the data for each individual image (each letter, in the case of a font) in the atlas. What I want to do is assign this unsigned long to the string, so that I can access a letter by just typing "a" instead of its numeric character code. If I directly assign the unsigned long to the std::string, it works... for letters with a character code < 255. But if the charcode is higher, it just overwrites a smaller image/letter in the std::map. (So a character with code 0x161 overwrites the one with code 0x061... make sense?) Do std::strings only allow chars of up to 255? If so, any ideas for a workaround? Thanks

std::wstring looks like just what I needed. Thanks.

I remember reading about wstring once or twice, but most books/articles I've read about C++ just talk about std::string. Are there any differences I should be aware of? And what about converting between them? Is it best to avoid this, and only use either wstring or string throughout your code?

Also, is there a practical performance/memory difference. ( wstring obviously must take up more space, but is it basically negligible under standard circumstances such as just having maybe a few hundred strings? )

Or are there any good articles/references on it? I'm the curious type [wink].

std::string is just typedef std::basic_string<char> string; and std::wstring is just typedef std::basic_string<wchar_t> wstring;, so they should behave the same. You also shouldn't have any trouble just using std::wstring everywhere.

Quote:
Original post by BeauMN

And what about converting between them?


Conversions aren't recommended, since wstring uses wide (Unicode) characters while string uses narrow characters (typically ASCII, 256 different values). Unfortunately, neither of them explicitly carries locale information, so while you can perform a by-value conversion, the result for non-ASCII characters may be complete gibberish. There are various compiler/platform-specific conversions.

But as said, there's no reliable way to convert between them. So unless it's for debug output, prefer not to do it without an i18n library. For debug output, however, a simple conversion might work.

Quote:
Also, is there a practical performance/memory difference.


Yes. It's specialized for the wchar_t type instead of char. Unfortunately, the size of wchar_t is not standardized: MSVC uses 2 bytes (short), gcc uses 4 bytes (int). This can become a problem if you want to access the characters directly as arrays, or if you're serializing them.

Quote:
( wstring obviously must take up more space, but is it basically negligible under standard circumstances such as just having maybe a few hundred strings? )


No, it's not really an issue; it's just a factor of 2 or 4. And if it ever does become one, you'll have many other problems to deal with first.

Quote:
Or are there any good articles/references on it? I'm the curious type [wink].


No, because it's exactly the same class template as regular string, just instantiated with a different character type.

A couple more questions:

Does the member function size() in std::string and std::wstring return the number of characters or the number of bytes in the string? Different references I've checked say different things on this.

For saving my string data to a binary file, I've always used the size() function to get the number of bytes and data() to get a pointer to the first byte. This worked fine for std::string, as 1 char = 1 byte. But will it still work with wchar_ts? Or will I need to do size() * sizeof(wchar_t) to get the size in bytes?

Also, what do people here generally use: std::string or std::wstring? I'm asking because, right now, I only really need std::string. I could just drop the characters with a code past 255, as everything I'm making is in English. But I feel that it might be a good idea to get in the habit of using std::wstring. That way if at some point I want to have a program I've made translated into another language, the process will be easier. What are people's opinions on this?

AFAIK, size() returns the number of characters, and returns the same value as length(). So, yes, you will need to multiply by sizeof(wchar_t).

Quote:
Original post by Sc4Freak
AFAIK, size() returns the number of characters, and returns the same value as length(). So, yes, you will need to multiply by sizeof(wchar_t).


Strings have a .length(). It returns the same value as size(): the number of characters.

Quote:
For saving my string data to a binary file, I've always used the size() function to get the number of bytes and the data() to get a pointer to the first byte. This worked fine for std::string as 1 char = 1 byte. But with wchars will it still work? Or will I need to do size() * sizeof(wchar) to get the size in bytes?


This would be a generic version that accepts an arbitrary string:

template <class CharType, class Traits, class Allocator>
void write(const std::basic_string<CharType, Traits, Allocator>& s)
{
    // size() is in characters, so multiply by the character size to get bytes.
    write_bytes(s.c_str(), s.size() * sizeof(CharType));
}

Although it should be noted that it's not fully generic, since it doesn't take into consideration other aspects of string representation that basic_string supports.

If you need a fully generic version, you'll need to use iterators. Those might cover everything, but they are considerably slower.

Quote:
Also, what do people here generally use: std::string or std::wstring? I'm asking because, right now, I only really need std::string. I could just drop the characters with a code past 255, as everything I'm making is in English. But I feel that it might be a good idea to get in the habit of using std::wstring. That way if at some point I want to have a program I've made translated into another language, the process will be easier. What are people's opinions on this?


- for most western languages there's no need to internationalize.
- internationalization is very expensive from a production perspective.
- wstring is a very poor way to do i18n; there are good packages for that.
- i18n is not just Unicode. It involves either multiple compilations or run-time switching, as well as asset loading.
- wstring is costly (in a relative manner), and redundant for everything except UI (there really is no need to translate status logs into n languages).
- some projects use wstring consistently.
- many projects rely heavily on C code or "C with classes".
- memory constraints.

Thanks for all the info!

After considering everything, I decided to ditch wstring and font glyphs above 255. It doesn't seem worth the trouble for now. ASCII will work fine; no need to go overboard.

Thanks again!
