
#ActualEctara

Posted 06 April 2013 - 12:23 AM

A quick and dirty explanation:

UTF-8 is variable width, but is not endian-specific. Great for storage and transmission.
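To make "variable width" concrete, here is a minimal Python sketch that counts how many UTF-8 bytes each code point takes. The helper name utf8_widths and the sample string are mine, purely for illustration.

```python
def utf8_widths(text):
    """Return the number of UTF-8 bytes used by each code point in text."""
    return [len(ch.encode("utf-8")) for ch in text]


# U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji)
sample = "A\u00e9\u20ac\U0001F600"

print(utf8_widths(sample))  # [1, 2, 3, 4]

# UTF-8 is defined as a sequence of single bytes, so there is only one
# byte order to worry about: the encoded form is the same on every machine.
print(sample.encode("utf-8").hex())
```

ASCII stays at one byte per character, which is a big part of why it works so well for storage and transmission.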

UTF-16 is variable width, and is endian-specific. Its limitations are also the reason why the Unicode standard restricts code points to 0x10FFFF and below. Avoid like the plague.
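The 0x10FFFF ceiling falls out of how surrogate pairs work: each half of a pair carries 10 bits, so a pair can address 0x100000 code points on top of the first 0x10000. A rough sketch of the encoding step, with to_utf16_units being a hypothetical helper written for this post rather than any library function:

```python
def to_utf16_units(cp):
    """Encode one code point as a list of 16-bit UTF-16 code units."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("outside the range UTF-16 can represent")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate values are not valid code points")
    if cp < 0x10000:
        return [cp]                      # fits in a single unit
    cp -= 0x10000                        # 20 bits remain
    high = 0xD800 | (cp >> 10)           # top 10 bits
    low = 0xDC00 | (cp & 0x3FF)          # bottom 10 bits
    return [high, low]


print([hex(u) for u in to_utf16_units(0x41)])      # ['0x41']
print([hex(u) for u in to_utf16_units(0x1F600)])   # ['0xd83d', '0xde00']
print([hex(u) for u in to_utf16_units(0x10FFFF)])  # ['0xdbff', '0xdfff']
```

The last line is the largest pair there is: both halves have all 10 payload bits set, so nothing above 0x10FFFF can be expressed.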

UTF-32 is fixed width, and is endian-specific. It is faster to iterate through an array of code points in both directions, but requires more space.
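A small Python sketch of the trade-off, assuming nothing beyond the standard codecs and the struct module. The function nth_code_point_utf32le is a made-up name for illustration; it shows that with a fixed width you can jump straight to code point n without scanning.

```python
import struct


def nth_code_point_utf32le(data, index):
    """Read code point number `index` from little-endian UTF-32 bytes."""
    return struct.unpack_from("<I", data, 4 * index)[0]


text = "A\U0001F600"  # one ASCII letter plus one code point outside the BMP

# Size in bytes of the same two code points in each encoding.
sizes = {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)  # {'utf-8': 5, 'utf-16-le': 6, 'utf-32-le': 8}

# Endianness: the same letter, two different byte layouts.
print("A".encode("utf-32-le"))  # b'A\x00\x00\x00'
print("A".encode("utf-32-be"))  # b'\x00\x00\x00A'

# Fixed width means direct indexing, forwards or backwards.
print(hex(nth_code_point_utf32le(text.encode("utf-32-le"), 1)))  # 0x1f600
```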


If you need more in-depth information, a dedicated guide would be best.

Also, if you are doing your own text handling, avoid wchar_t unless you are dealing with something very close to the system API. Not only does one code point not necessarily correspond to one character, but one wchar_t need not correspond to a whole code point, as with UTF-16, where a code point outside the BMP takes two 16-bit units. To make matters worse, wchar_t and wide-char strings aren't required to use UTF-16 or UTF-32; wchar_t only has to be a unit that can hold every character in the system's supported character sets. Symbian, for example, uses UCS-2 strings.
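The two mismatches (character vs. code point, and code point vs. storage unit) are easy to see in a quick Python sketch. This is not wchar_t itself: Python strings index by code point, so the 16-bit unit count is simulated by encoding to UTF-16, and utf16_code_units is a hypothetical helper for illustration.

```python
import unicodedata


def utf16_code_units(text):
    """Number of 16-bit units the text needs in UTF-16 (no BOM)."""
    return len(text.encode("utf-16-le")) // 2


# One visible character, two code points: 'e' followed by a combining acute.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))  # 2 1

# One code point, two 16-bit units: what a 16-bit wide-char string
# would have to store for anything outside the BMP.
emoji = "\U0001F600"
print(len(emoji), utf16_code_units(emoji))  # 1 2
```

So a count of storage units, a count of code points, and a count of what the user sees as characters can all be different numbers for the same string.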


#2Ectara

Posted 06 April 2013 - 12:06 AM



#1Ectara

Posted 06 April 2013 - 12:03 AM


