Anyone have any experience with IBM's ICU library?

0 comments, last by wintertime 7 years, 3 months ago

I've been looking into implementing a way to localize strings for my game engine, and one of the problems I ran into was the need for an internal string representation.

I danced around others' implementations to see what they were doing, and how they went about doing it.

CryEngine appears to support all UTF encodings, but according to the code in their headers it does not perform any string operations on them.

Unreal uses UTF-16 (though effectively UCS-2, as they only handle the first plane of code points), and I can not figure out what the hell they are doing in their code as it seems to dance around all over the place. They apparently use wchar_t... which would obviously be wasteful for that format on a Unix system, where wchar_t is four bytes.

Godot's strings support Unicode, but there's no indication of which encoding. I think UTF-8, as I saw that appear a few times.


And I am personally debating between UCS-2 and UTF-8.

So while doing a bit of googling and contemplating rolling my own string class, I came across IBM's ICU library, which is widely used and apparently defines its own string types.


I'm not concerned with it being fast or slow, as I'm not going to do string operations frequently. But I am concerned about memory usage.

Has anyone had any experience with this library? If so, what are some things you can tell me about it?


I'd either use UTF-8 everywhere, as it's the most widely used nowadays, or UTF-32 for the simplest string operations. UTF-16 is an abomination kept for backwards compatibility, while UCS-2 is outdated and you should never bother newly implementing something that only works for a subset of characters.

