loop through array of wchar_t



Hello. Is the following code correct? I ask because a Unicode character is not always 2 bytes long:

```cpp
wchar_t arr[20] = L"Hello";
int num = wcslen(arr);        // note: wcslen actually returns size_t
for (int i = 0; i < num; i++) {
    // do something with arr[i]
}
```

[Edited by - elih1 on June 4, 2009 1:53:35 AM]

There was a thread on this topic not too long ago. As far as I can recall, C++ has no notion of "Unicode"; wchar_t is a signed short, and nothing more.

Quote:
Original post by elih1
cause Unicode character is not always 2 bytes long


An encoded Unicode character is not always two bytes long. A decoded Unicode character is always an integer of some predetermined size -- generally a wchar_t/short, sometimes an int.

And to answer your question, yes, that code is correct.

[Edited by - _fastcall on June 4, 2009 2:14:06 AM]

I will assume you're using Windows, where wchar_t is 16 bits.

Whether the code is correct depends on what you want to achieve: it counts UTF-16 code units (code values).

If you want to count code points or graphemes, it's wrong.

Quote:
cause Unicode character is not always 2 bytes long

Unicode code points need up to 21 bits.
In UTF-16, every code point is encoded as either one or two 16-bit code units.

Quote:
Original post by _fastcall
There was a thread on this topic not too long ago. As far as I can recall, C++ has no notion of "Unicode"; wchar_t is a signed short, and nothing more.

In standard C++, wchar_t is a distinct type from all the other integral types. Its size and whether it's signed or unsigned are both implementation defined. Of the compilers I'm familiar with, only Borland's C++ compilers use a signed 16-bit wchar_t. Other Windows compilers such as MSVC or the MinGW port of gcc use an unsigned 16-bit wchar_t. On every *nix platform I've worked with, wchar_t is a 32-bit type.

And whether or not the code is correct depends on the encoding wchar_t uses on the platform the code is compiled for and what "do something with arr" actually means.
