Unicode, Japanese IMEs, and Linux equivalents

Started by
5 comments, last by MaulingMonkey 18 years, 9 months ago
After a career of hobby programs of mostly theoretical application, I've started diving into actually building myself a gosh darn honestly useful GUI application and framework. My goal is to eventually build an OpenGL shell. So I banged up a simple application using SDL and FreeType2 in the interests of making sure I understood the FT2 API. I've got it working, kind of.

Problem is, I want it to support multiple languages. Like Japanese, which I've been learning to type in romaji using LRNJ. I've gotten Notepad, Firefox, and the like working so that I can type "mu" and have ム displayed.

My program can display text. My string starts out by proudly proclaiming "Render me!! Testing Japanese: むムム㋰" using MS Mincho (oddly, I can't (or don't know how to) display Japanese in Arial using FT2, although it displays fine in Notepad (using Arial) - if anyone knows how to do this let me know :-)).

But my main problem is that I cannot type Japanese in my SDL application. I'm guessing the problem lies with SDL, as it seems its Unicode support somehow bypasses the IME, since I've been using its event.key.keysym.unicode field.

Before embarking on a second prototype, I want to make sure I'm on the right track: from my reading it seems I'll need to interpret the wParam of WM_CHAR notifications, and that I'll receive something like "m <backspace> <mu-unicode-character>" (in separate messages). Is this correct?

I'm also curious as to how this is handled on other platforms, especially Linux. Obviously I won't be getting my input from the Win32 API... do they even have an IME, or would I need to code my own?

Guess that's all my questions, thanks :-).
For Linux:
Take a look at kinput2 and canna ... they might be able to send Unicode events to a program that's listening to stdin. I haven't really looked into it much on the development side, just on the user side.

This sort of thing is on my to-do list still... right now I'm writing a game to help me learn more Kanji (and reinforce my knowledge of kana). Currently I'm just using a mouse interface to indicate the intended characters... but that wouldn't work in your case because you would have to show all the characters at once and the keyboard interface is much better for what you're trying to do.

You could also just write a lookup table and continue to accept ASCII-encoded keystrokes, then convert them (or pop up several options for conversion) after each kana character has been entered. Doing the lookup table for all Kanji would be a little more difficult, but Jim Breen's EDICT ought to help you a lot there.
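To make the lookup-table idea concrete, here is a minimal sketch in Python. The table is a hypothetical, deliberately tiny subset of the katakana syllabary, and the names KANA and romaji_to_kana are invented for illustration; kanji conversion (where EDICT would come in) is not attempted.

```python
# Hypothetical, deliberately tiny romaji -> katakana table.
# A real table would cover the whole syllabary plus digraphs such as "kya".
KANA = {
    "a": "ア", "i": "イ", "u": "ウ", "e": "エ", "o": "オ",
    "ka": "カ", "ki": "キ", "ku": "ク", "ke": "ケ", "ko": "コ",
    "ma": "マ", "mi": "ミ", "mu": "ム", "me": "メ", "mo": "モ",
}
MAX_KEY = max(len(key) for key in KANA)


def romaji_to_kana(text):
    """Greedy longest-match conversion; unknown characters pass through."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for size in range(min(MAX_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += size
                break
        else:
            # No table entry starts here: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(romaji_to_kana("mu"))
```

In an interactive program the same table would be consulted after each keystroke rather than on a finished string, so a pending "m" stays on screen until the vowel arrives.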

Good luck with your project.
Greenspun's Tenth Rule of Programming: "Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp."
Quote:Original post by void*
For Linux:
Take a look at kinput2 and canna ... they might be able to send Unicode events to a program that's listening to stdin. I haven't really looked into it much on the development side, just on the user side.

Will do, thanks.
Quote:This sort of thing is on my to-do list still... right now I'm writing a game to help me learn more Kanji (and reinforce my knowledge of kana). Currently I'm just using a mouse interface to indicate the intended characters... but that wouldn't work in your case because you would have to show all the characters at once and the keyboard interface is much better for what you're trying to do.

Sounds fun :-). As I mentioned earlier, I've been using LRNJ for teaching myself, using their RPG "Slime Forest". It's very basic (well, from what I've seen of the free version anyways), but for some reason quite fun, probably because I'm easily amused :-).
Quote:You could also just write a lookup table and continue to accept ASCII-encoded keystrokes, then convert them (or pop up several options for conversion) after each kana character has been entered. Doing the lookup table for all Kanji would be a little more difficult, but Jim Breen's EDICT ought to help you a lot there.

I'll give that a second thought. I originally decided against a lookup table because I didn't want to implement all the Katakana -> Kanji translations, but EDICT could definitely be of help. I could at least use it as an interim solution (though this would only help with Japanese; it wouldn't allow using an IME for any language, which would be nice).

I've been looking into this SDL-IM patch, but it doesn't seem to want to work for me, unfortunately (by which I mean, I've got an example compiled and working, but it doesn't seem to be producing sane characters).
Quote:Good luck with your project.

Thanks :-)
*bump* off the bottom of the second page.

I'd still like to know if interpreting WM_CHAR notifications is the way to go (for the Win32 API)...
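For reference, a rough sketch of the decoding side of that question, in Python so it stays platform-neutral. It assumes a Unicode window, where each WM_CHAR wParam carries one UTF-16 code unit, so a character outside the Basic Multilingual Plane arrives as two messages (a surrogate pair). The function name decode_utf16_units is made up for illustration. This does not confirm the "m <backspace> <mu>" sequence guessed above; IME composition is generally reported through the separate WM_IME_* messages.

```python
def decode_utf16_units(units):
    """Combine a stream of 16-bit code units into a Python string.

    `units` stands in for the wParam values of successive WM_CHAR
    messages (an assumption of this sketch, not real Win32 calls).
    """
    chars = []
    pending_high = None
    for unit in units:
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: remember it and wait for the low half.
            pending_high = unit
            continue
        if 0xDC00 <= unit <= 0xDFFF:
            if pending_high is not None:
                code_point = (0x10000
                              + ((pending_high - 0xD800) << 10)
                              + (unit - 0xDC00))
                chars.append(chr(code_point))
            # An unpaired low surrogate is silently dropped.
            pending_high = None
            continue
        # Ordinary BMP character, e.g. U+30E0 KATAKANA LETTER MU.
        pending_high = None
        chars.append(chr(unit))
    return "".join(chars)


if __name__ == "__main__":
    print(decode_utf16_units([0x30E0]))
```

Most kana and common kanji fit in a single code unit, so the surrogate branch only matters for rarer characters, but skipping it would silently corrupt them.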
There's an example of using the IME API in the latest DX9.0c SDK. It should be API independent, though, so there's no reason you couldn't use it with OpenGL.

Allan
------------------------------
BOOMZAP
Try our latest game, Jewels of Cleopatra
Quote:Original post by MaulingMonkey
(oddly, I can't (or don't know how to) display Japanese in Arial using FT2, although it displays fine in Notepad (using Arial) - if anyone knows how to do this let me know :-))
Windows (XP+ anyway, not sure of before) has a concept called "font linking". If you call any of the Win32 text-out APIs it will kick in. Basically the idea is that some fonts are linked in a chain, and if a given character can't be found in the base font (Arial in this case) it searches the next one down the line, and so on. You can find more details in MSDN.

I don't know what FT2 is but if it's some sort of custom graphics engine it may be taking "shortcuts" that fail to take into account internationalization. It's extremely common for apps written by US programmers with no experience doing international stuff to pretty much completely fail once you get out into the wild and away from English (which is pretty much as simple as it gets for text-handling).

-Mike
Quote:Original post by __ODIN__
There's an example of using the IME API in the latest DX9.0c SDK. It should be API independent, though, so there's no reason you couldn't use it with OpenGL.

Downloading now to take a look at it, thanks.
Quote:Original post by Anon Mike
Quote:Original post by MaulingMonkey
(oddly, I can't (or don't know how to) display Japanese in Arial using FT2, although it displays fine in Notepad (using Arial) - if anyone knows how to do this let me know :-))
Windows (XP+ anyway, not sure of before) has a concept called "font linking". If you call any of the Win32 text-out APIs it will kick in. Basically the idea is that some fonts are linked in a chain, and if a given character can't be found in the base font (Arial in this case) it searches the next one down the line, and so on. You can find more details in MSDN.

"Font linking", got it. I'll do some searching, thanks :-).
Quote:I don't know what FT2 is but if it's some sort of custom graphics engine it may be taking "shortcuts" that fail to take into account internationalization. It's extremely common for apps written by US programmers with no experience doing international stuff to pretty much completely fail once you get out into the wild and away from English (which is pretty much as simple as it gets for text-handling).

FreeType2 is an open source font engine (used by many open source projects - see their screenshots page for a few of them). It directly works with the font file(s), so if the information for font linking isn't included within the TTF file, that would explain why it's not doing this automatically.
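Since FreeType works on one font file at a time, a fallback chain like the one Anon Mike describes would have to be built by the application. Below is a minimal sketch in Python. The coverage sets are hypothetical stand-ins for real fonts, and the names CHAIN, pick_font and split_runs are invented. With FreeType itself, the equivalent per-character check would be whether FT_Get_Char_Index returns 0 (the "missing glyph" index) for that face.

```python
# Hypothetical coverage data: each entry is (font name, set of code points).
# Real code would query each loaded face instead of using fixed sets.
ASCII = set(range(0x20, 0x7F))
KANA_BLOCKS = set(range(0x3040, 0x3100))  # Hiragana and Katakana blocks

CHAIN = [
    ("Arial", ASCII),                       # base font, tried first
    ("MS Mincho", ASCII | KANA_BLOCKS),     # fallback further down the chain
]


def pick_font(char, chain):
    """Return the first font in the chain that covers `char`, else None."""
    for name, coverage in chain:
        if ord(char) in coverage:
            return name
    return None


def split_runs(text, chain):
    """Split text into (font, substring) runs that share one font."""
    runs = []
    for char in text:
        font = pick_font(char, chain)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + char)
        else:
            runs.append((font, char))
    return runs


if __name__ == "__main__":
    print(split_runs("Hi ムム", CHAIN))
```

Grouping characters into runs keeps the number of font switches low when the text is finally rendered; a run whose font is None marks characters that no font in the chain can draw.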
