I'm trying to find a (human) language parser and need help.

I have some friends who are blind. They are really big into gaming and obviously struggle. I was telling them about an old Mac game that I have been wanting to remake, Scarab of Ra, and they love the sound of it. The thing holding me back is the graphics. After talking to them, it hit me: the game doesn't need graphics! It would be trivial to turn it into a text adventure game. The issue is that the game and its levels are dynamically created, so none of the existing text adventure makers will work here; they are all designed to present a written story, and none allow for dynamic stories.

My idea is to have the game run as normal on the back end, but have a text-based interface integrated with the Windows accessibility API and third-party screen readers so that it can be played by people with visual impairments. The issue here is the command parser. This is non-trivial, especially if you need more than a simple "move north" command structure. I am having a really hard time finding much about this online, as most results take me to computer-language parsers. I would like to use a library, if one exists, rather than reinvent the wheel, but I am not even sure which search terms to use.

Any help would be appreciated!
If you want to be super rigorous about your parsing, you should search for:

Natural Language Parsing
Chart Parser
GLR Parser
Context-Free Grammar
Unrestricted Grammar
Lexical ambiguity
Grammatical ambiguity

If you just want something simple that works, and you don't particularly care about understanding more than a small subset of words, then you can usually get away with "look for expected subset of valid terms" parsing that most of these games have used.
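For example, that simple approach can be as small as this rough Python sketch (the verb and object lists here are made-up placeholders, not from any particular game):

```python
# Minimal "expected subset of valid terms" parser: scan the input for a known
# verb and a known object, and ignore everything else.

KNOWN_VERBS = {"go", "take", "open", "look", "drop"}        # hypothetical vocabulary
KNOWN_OBJECTS = {"north", "south", "key", "door", "torch"}  # hypothetical vocabulary

def parse_command(text):
    words = text.lower().split()
    verb = next((w for w in words if w in KNOWN_VERBS), None)
    obj = next((w for w in words if w in KNOWN_OBJECTS), None)
    if verb is None:
        return None  # caller can respond "I don't understand."
    return (verb, obj)

print(parse_command("please open the old door"))  # ('open', 'door')
print(parse_command("go north"))                  # ('go', 'north')
```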

I'm working on several of these things as a hobby, so if you are interested in the rigorous approach I could explain further - the hard parts, the tedious parts, a general overview of what I do, etc.
I may need to dial it back. I'm WAY out of my league.

Still, it has given me some direction, so thank you!
It's not THAT bad. All parsing code I've seen boils down to two things:

1. Precomputing which words are allowed to come next in a sentence, based on how they can be combined (this is what "parser generators" do).
2. Once a specific pattern of words (defined by the grammar) appears, combining them, and repeating that recursively until there is a single chunk left, or a word appears which is not allowed in that position in the sentence (shift-and-reduce parsing).

Most of the special cases in shift-reduce parsing can be removed. My parser no longer cares about the difference between terminals and nonterminals. A shift-shift or shift-reduce conflict in the same cell just performs ALL of the operations instead of being treated as a conflict. It doesn't use a goto table. It doesn't even have a stack anymore (because it needs something else instead). It's still pretty simple.
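To make that concrete, here is a toy Python sketch of plain stack-based shift-and-reduce (not the table-driven variant I describe above); the grammar and lexicon are invented for the example:

```python
# Toy shift-and-reduce parser over a tiny made-up grammar. Tokens are tagged
# from a lexicon, shifted onto a stack, and the top of the stack is reduced
# whenever it matches the right-hand side of a rule (longer rules are listed first).

LEXICON = {"open": "VERB", "take": "VERB", "the": "DET",
           "rusty": "ADJ", "door": "NOUN", "key": "NOUN"}

RULES = [                          # (right-hand side, left-hand side)
    (("VERB", "NP"), "COMMAND"),
    (("DET", "NOUN"), "NP"),
    (("ADJ", "NOUN"), "NOUN"),
    (("NOUN",), "NP"),
]

def parse(sentence):
    stack = []
    for word in sentence.lower().split():
        stack.append(LEXICON.get(word, "UNKNOWN"))   # shift
        reduced = True
        while reduced:                               # reduce while any rule matches
            reduced = False
            for rhs, lhs in RULES:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    reduced = True
                    break
    return stack

print(parse("open the rusty door"))  # ['COMMAND']
print(parse("take key"))             # ['COMMAND']
```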


The hard part *in my opinion* is that there are two HUGE pieces of data you need, and those are the most time consuming part of the whole thing:

1. You actually need a usable grammar for English (or whatever language you want). Surprisingly I have not been able to find one defined as a formal grammar.
2. You need a lexicon (a dictionary) of words you want to handle and all of the variants of what each one can mean.

It's so time consuming that the typical approach I see researchers take is to write programs that "learn" the grammar and words by feeding them examples (books, articles, etc.). I think that could eventually work, but based on the typical caveats in these research papers about their learning systems being "only 87% accurate*", I think it is actually worth it to carefully do the entire thing by hand instead.

* "accuracy" meaning they have some unfortunate person parse the whole book by hand first and then compare it to what the program comes up with.


However! If you're making a game, you can get away with a much smaller set of words and grammar rules than a full language would use.
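For instance, a game-sized lexicon (point 2 above) can be little more than a table mapping each word and its common variants to a canonical token; every entry below is invented for illustration:

```python
# A game-sized lexicon: map the words (and their synonyms/variants) that the
# game actually cares about onto canonical tokens, and drop everything else.

LEXICON = {
    "go":    "GO",    "walk": "GO",       "run": "GO",      "head": "GO",
    "take":  "TAKE",  "grab": "TAKE",     "pick": "TAKE",
    "look":  "LOOK",  "examine": "LOOK",  "inspect": "LOOK",
    "north": "NORTH", "n": "NORTH",
    "lamp":  "LAMP",  "lantern": "LAMP",  "light": "LAMP",
}

def canonicalise(words):
    """Replace every known word with its canonical token, dropping the rest."""
    return [LEXICON[w] for w in words if w in LEXICON]

print(canonicalise("grab the lantern and head n".split()))
# ['TAKE', 'LAMP', 'GO', 'NORTH']
```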
Another thing I think is useful to consider is that a person can figure out the MEANING of a sentence even if it's completely devoid of grammatical correctness. Take this for example:

"a alphabetically be in organised sentence should words"

(From: http://bash.org/?609327 )

Just the existence of the words themselves causes you to imagine ways that they could be properly combined, based on your assumption of what the original sentence's intent was. This is something that the post-parse phase would have to do ANYWAY to make decisions about sentences that have multiple possible interpretations.

My suspicion is that the human mind just brute forces confusing sentences by simply dumping all of the words into a complete graph and then pruning out "ridiculousness" in a massively parallel operation.
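A crude way to approximate that in code is to ignore word order entirely and score each known command template by how many of its words appear in the input; a quick sketch with made-up templates:

```python
# Bag-of-words intent matching: word order is ignored, each known command
# template is scored by its overlap with the input words.

TEMPLATES = {
    ("open", "door"): "OPEN_DOOR",
    ("take", "key"):  "TAKE_KEY",
    ("go", "north"):  "MOVE_NORTH",
}

def best_intent(text):
    words = set(text.lower().split())
    scored = [(len(words & set(t)), intent) for t, intent in TEMPLATES.items()]
    score, intent = max(scored)
    return intent if score > 0 else None

print(best_intent("door the open please"))  # OPEN_DOOR, despite the scrambled order
```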

Maybe this is somewhat off topic, but blind people can also work on a PC, so why not let them play games at the same "level" of complexity as their work? As seen in this YouTube video,

the guy uses some kind of device that helps him navigate the computer interface. Something else I saw a while ago was this video
of a text rendering engine that uses ASCII characters to render graphics from meshes.

Maybe someone will bring a device for players with disabilities, about the size of a graphics tablet and based on ASCII-style interfaces, to market in the future, or it could be a Kickstarter idea :D

On the parser topic, it is quite simple to get data out of whatever you use as the source; the difficulty is what you do with that data. I have written a lot of parsers over the past years, and it always comes down to the same steps. The first step is tokenizing your source to filter out characters you won't need to build valid syntax. In code parsing I also prefer to use a grabber component that fetches comments and other code, so the tokenizer or preprocessor doesn't have as much work to do. Then build rules on top of the tokenizer that produce valid chunks of input and type them for the parser. In this step it is important, for example, to identify the nouns, verbs and whatever other kinds of "components" of the language you are interested in.

After getting the token stream from your tokenizer, set up the syntax tree in the parser step. Here again, define rules for how certain "kinds" of tokens can be combined, where special rules take precedence over general rules.

Once the parsing process is complete, you have the data in a form that an interpreter can work with by traversing the AST.
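Compressed into a Python sketch, those steps might look roughly like this (the word categories and the final tree shape are invented for illustration):

```python
import re

# Step 1: tokenize - strip punctuation and split into words.
def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

# Step 2: type the tokens - classify each word into a category the parser
# understands (the word lists here are made up).
CATEGORIES = {"open": "VERB", "close": "VERB", "the": "DET",
              "door": "NOUN", "chest": "NOUN", "with": "PREP", "key": "NOUN"}

def tag(tokens):
    return [(w, CATEGORIES.get(w, "UNKNOWN")) for w in tokens]

# Step 3: a (very) small parse step - pull out the verb, the object and an
# optional instrument, producing a tree-like dict an interpreter can walk.
def parse(tagged):
    nouns = [w for w, c in tagged if c == "NOUN"]
    verb = next((w for w, c in tagged if c == "VERB"), None)
    return {"action": verb,
            "object": nouns[0] if nouns else None,
            "instrument": nouns[1] if len(nouns) > 1 else None}

print(parse(tag(tokenize("Open the chest with the key!"))))
# {'action': 'open', 'object': 'chest', 'instrument': 'key'}
```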

If Python is not a problem, you may want to look into NLTK (the Natural Language Toolkit), at nltk.org
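A rough sketch of what that could look like for command input (NLTK's tokenizer and tagger need their model data fetched via nltk.download() on first use):

```python
# Tokenize a command and tag parts of speech with NLTK, then keep the verbs
# and nouns as candidates for the game's action and objects.
import nltk

sentence = "open the rusty door with the brass key"
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)   # e.g. [('open', 'VB'), ('the', 'DT'), ...]

verbs = [w for w, t in tagged if t.startswith("VB")]
nouns = [w for w, t in tagged if t.startswith("NN")]
print(verbs, nouns)             # something like ['open'] ['door', 'key']
```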

Use the cloud: just pick your favourite cloud provider and use their natural language APIs.

Azure Cognitive Services: LUIS

Google Cloud: Natural Language API

Amazon AWS: Lex

Of those, both Google Cloud and Azure offer a free tier based on a monthly usage cap (stay below that and it stays free), whereas AWS's free tier is only free for a year.

Also, Google's Natural Language API seems to be a slightly different creature to the other two. It seems to be about just breaking down natural language into a set of pre-defined (by Google) entities like "people" and "places" so it would require more work on your part to further process that output into a set of intents (i.e. game actions). Whereas both Lex and LUIS work to a language model that you provide and then they can respond with the best-fit intent/entity automagically.

Based on all that, I would opt for Azure myself.
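Whichever provider you pick, the game-side glue ends up looking roughly the same: take the intent/entity result the service returns and dispatch it to a game action. Here is a sketch with a hypothetical response shape and handler names, not any provider's actual schema:

```python
# Map an intent/entity result onto game actions. The response dict shape and
# the Game methods below are invented for illustration only.

class Game:
    """Stub standing in for the real game back end."""
    def move(self, direction): print(f"Moving {direction}")
    def take(self, item): print(f"Taking {item}")
    def say(self, text): print(text)

def handle_response(response, game):
    intent = response.get("intent")          # e.g. "MoveIntent"
    entities = response.get("entities", {})  # e.g. {"direction": "north"}

    if intent == "MoveIntent":
        game.move(entities.get("direction"))
    elif intent == "TakeIntent":
        game.take(entities.get("item"))
    else:
        game.say("I don't understand that.")

handle_response({"intent": "MoveIntent", "entities": {"direction": "north"}}, Game())
```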
