Spell checker - how does it work?

Started by
5 comments, last by Domarius 16 years, 4 months ago
Hi, I'm basically trying to implement "spell checker" type functionality, in a program who's purpose is to automatically fix small errors in the source text. (The source text is coming from a character recognition input, which works great except for the odd letter, so this "spell checking" functionality is meant as a final pass to smooth over those little bumps). Has anyone done this before? I've discovered Suffix Trees http://en.wikipedia.org/wiki/Suffix_tree as a way of storing a dictionary of words for fast matching, and also Levenshtein distance, which can determine which word is a closer match http://en.wikipedia.org/wiki/Levenshtein_distance but I haven't yet seen anything close to bringing it all together to make a spell checker kind of thing. I would have thought that kind of info would be more available by now?

For local multiplayer retro themed games, visit Domarius Games.

Advertisement
Or you can simply use a spell checker that already exists, or just read the source code if you want to learn how things work:

Look into GNU Aspell.
http://norvig.com/spell-correct.html has a good explanation of one approach to this, along with some code.

The way it's set up you should be able to adjust it to correct for just the types of errors the OCR produces. For example I'd think an OCR is unlikely to transpose two letters like a human often does.
I believe spell checkers often use TRIE data-structures. And DYNAMIC PROGRAMMING.
Thanks guys :)

d000hg, that's confirming a lot of what I'm finding in my research.

Thanks for the link to GNU Spell fpsgamer, I was thinking something like this should already exist out there :)

Adam_42 that explanation page looks great! I think I will get a lot from that.

For local multiplayer retro themed games, visit Domarius Games.

You can learn about tries here

They also have some articles on DP too, probably spell-checking is mentioned in them as an example.
Thanks, I'm still learning about it :)

For local multiplayer retro themed games, visit Domarius Games.

This topic is closed to new replies.

Advertisement