How to build a simple text parser

I have been working for a little while now on a text based adventure game in the vein of Zork, and I have gotten to the point where I need a text parser. I naturally went online and looked for a tutorial on building one, but I came up short. I was dearly hoping that someone here could give me the basics on building a text parser and help me get my project going. Thank you.

Please be more specific. There is an endless variety of text parsers, working in ways limited only by your imagination, from command-line parsers to rich-text parsers...

I am looking to build a very simple command-line parser that works in a similar manner to Zork's: it takes input from the player and responds without my having to hard-code every possible outcome. I want something very simple that breaks the input string into tokens and then compares those tokens to words that the game recognizes. I do not know how to do this.

"I naturally went online and looked for a tutorial on building one, but I came up short."

That is likely because what you are looking for is complicated. In short, there is no short and easy answer.

For a very simple text adventure game, you could opt to support a limited subset of input, for example only verb + noun.
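
A minimal sketch of that verb + noun subset, assuming C++ and a tiny hard-coded vocabulary. The word lists and the names `Command`, `tokenize`, and `parseVerbNoun` are made up for illustration; they do not come from any library or from Zork.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical vocabulary; a real game would load this from its own data.
const std::set<std::string> kVerbs = {"take", "drop", "examine", "open"};
const std::set<std::string> kNouns = {"sword", "door", "lamp"};

struct Command {
    std::string verb;
    std::string noun;
};

// Lowercase the line and split it on whitespace.
std::vector<std::string> tokenize(std::string line) {
    std::transform(line.begin(), line.end(), line.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    std::istringstream in(line);
    std::vector<std::string> tokens;
    for (std::string word; in >> word;) tokens.push_back(word);
    return tokens;
}

// Accept exactly "<verb> <noun>"; on success fill `out` and return true.
bool parseVerbNoun(const std::string& line, Command& out) {
    const std::vector<std::string> tokens = tokenize(line);
    if (tokens.size() != 2) return false;
    if (kVerbs.count(tokens[0]) == 0 || kNouns.count(tokens[1]) == 0) return false;
    out.verb = tokens[0];
    out.noun = tokens[1];
    return true;
}
```

With this sketch, "Take Sword" parses to verb "take" and noun "sword", while "take the sword" is rejected because it has three tokens. Punctuation is not stripped here, so "sword." would not match.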

Once you start wanting to parse more robust input, things quickly get more complicated.

> Examine the sword
Which sword, the silver one or the green one?
> The green one
The green sword is very sharp.
> Pick it up and put it on the altar of silver
You pick up the green sword and place it on the Altar of Silver Light.

Notice the disambiguation and handling of ambiguous English in the commands.

>Pick it up and put it on the altar of silver

In the above line alone, the parser has to determine that both of the "it" words refer to the sword just picked up, that "put it on" doesn't mean wear something, that "silver" refers to an object named "Altar of Silver Light" and is not an adjective for the silver sword, etc. To handle this type of complexity, you need a parser that can handle ambiguous grammar, and a system built around it that can select the correct parse tree from a collection of possibilities, or otherwise ask the player for clarification in order to select the correct tree. You will also need some sort of database of world objects, along with their names and adjectives, including plural forms, together with a system to keep track of item scope (i.e. what objects the player can see, or touch, or pick up).
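
To make the "database of world objects" and "scope" ideas a little more concrete, here is a rough sketch in C++. It assumes each object has one noun, a list of adjectives, and a single in-scope flag; `WorldObject`, `findCandidates`, and `clarifyingQuestion` are hypothetical names. It covers only the "Which sword?" step, not pronouns or parse trees.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical world object: one noun, optional adjectives, and a scope flag.
struct WorldObject {
    std::string displayName;
    std::string noun;
    std::vector<std::string> adjectives;
    bool inScope;  // true if the player can currently see or reach it
};

// Return every in-scope object whose noun matches and which carries all of
// the adjectives the player typed.
std::vector<const WorldObject*> findCandidates(
        const std::vector<WorldObject>& world,
        const std::string& noun,
        const std::vector<std::string>& typedAdjectives) {
    std::vector<const WorldObject*> matches;
    for (const WorldObject& obj : world) {
        if (!obj.inScope || obj.noun != noun) continue;
        bool allFound = true;
        for (const std::string& adj : typedAdjectives) {
            if (std::find(obj.adjectives.begin(), obj.adjectives.end(), adj) ==
                obj.adjectives.end()) {
                allFound = false;
                break;
            }
        }
        if (allFound) matches.push_back(&obj);
    }
    return matches;
}

// Build the question to ask when more than one object matched.
std::string clarifyingQuestion(const std::vector<const WorldObject*>& matches,
                               const std::string& noun) {
    std::string question = "Which " + noun + ", the ";
    for (std::size_t i = 0; i < matches.size(); ++i) {
        if (i > 0) question += " or the ";
        question += matches[i]->displayName;
    }
    return question + "?";
}
```

If `findCandidates` returns exactly one object the command can proceed; if it returns several, the game asks the question and retries with the adjective the player supplies ("green").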

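Zork itself was written in Infocom's in-house language ZIL (the original mainframe version was written in MDL), not in Inform. Inform is a later domain-specific language for the same style of game, and it compiles to the Z-machine format that Infocom's games used. You can find a lot of info on it here, including a lot of high-level info on its parser: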

Here are a couple of links to discussions on the subject:

Here is another take on tackling the problem:

Finally, a while back I implemented an Earley parser (http://en.wikipedia....i/Earley_parser), which is a type of parser that can handle ambiguous grammar, as an experiment in creating a generalized parser for these kinds of games. I can tell you it was not easy, at least for me, since I have had little experience with the field myself.

EDIT: I just saw your clarification on "very simple parser". In that case I wouldn't aim as high as Zork; instead, try to get a verb + noun system working first. Edited by laztrezort

Perhaps you can, for the sake of learning, abandon the concept of writing an advanced text parser accepting commands in [action][target] form... Perhaps the best way for you to go is to use a "multiple choice" type of system, which means you only have to handle a limited number of cases and user input. You can step things up a notch and "script" your entire game in a basic text file the game just reads... the game knows what to expect as input, things work easily and transparently, everyone wins...

Keep it simple for now. Split your dictionary into verbs and objects. You might even want to stick with the typical verbs found in point-and-click adventures.

"Please use the hammer to hit the nail"

First thing you are looking for is a verb, so look for one.

"Please" is useless and not found in the list of verbs. Move on.
"Use" is found, so from now on you search the object list.
"the" is again useless, not found and skipped.
"hammer" is found, so you got your first object.
"hit" is not found in the object list (though it could be in the verb list). Ignore.
"nail" is finally found and your sentence is complete: "use hammer nail"
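
The scan above could be sketched in C++ roughly like this. `ParsedInput` and `scanInput` are made-up names, and the verb and object lists are passed in so they can come from the inventory or the current scene. Punctuation handling is left out.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Result of the scan: the first verb found, then every object after it.
struct ParsedInput {
    std::string verb;                  // empty if no verb was found
    std::vector<std::string> objects;  // in the order the player typed them
};

ParsedInput scanInput(const std::string& line,
                      const std::set<std::string>& verbs,
                      const std::set<std::string>& objects) {
    ParsedInput result;
    std::istringstream in(line);
    for (std::string word; in >> word;) {
        for (char& c : word) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        if (result.verb.empty()) {
            // Still looking for a verb; anything else is skipped.
            if (verbs.count(word) != 0) result.verb = word;
        } else if (objects.count(word) != 0) {
            // Verb already found, so only the object list is searched.
            result.objects.push_back(word);
        }
        // Unknown words ("please", "the") and later verbs ("hit") fall through.
    }
    return result;
}
```

With verbs {use, hit, look} and objects {hammer, nail}, the input "Please use the hammer to hit the nail" yields the verb "use" and the objects "hammer" and "nail", in that order.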

Of course, the objects might be looked for in the player's inventory or the current scene (a global list would require storing the current location for each object).

To "resolve" your input, you could for example just use a nested map, so to define the outcome it could be something like:

someMap3["use"]["hammer"]["nail"] = functionToExecute;
someMap3["use"]["nail"]["hammer"] = functionToExecute;

Why both? Because the user could go with "hit nail with hammer" instead.

Also, there would be maps for just verbs ("look") and the most common 2 word inputs ("examine hammer", "open door"). Unless of course you'd rather go with a tree for parsing, which might seem a bit more natural.
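
One way the three-word nested map might look in C++, as a sketch only. It assumes actions are `std::function` objects that return the text to print; `Action`, `ActionTable`, `buildActions`, and `resolve` are hypothetical names, and the reply strings are placeholders.

```cpp
#include <functional>
#include <map>
#include <string>

using Action = std::function<std::string()>;
// verb -> first object -> second object -> action
using ActionTable =
    std::map<std::string, std::map<std::string, std::map<std::string, Action>>>;

ActionTable buildActions() {
    ActionTable table;
    Action driveNail = [] { return std::string("You drive the nail into the wall."); };
    // Register both object orders so "use hammer nail" and "use nail hammer" agree.
    table["use"]["hammer"]["nail"] = driveNail;
    table["use"]["nail"]["hammer"] = driveNail;
    return table;
}

// Look up with find() rather than operator[], which would silently insert
// empty entries for every unknown word the player types.
std::string resolve(const ActionTable& table, const std::string& verb,
                    const std::string& first, const std::string& second) {
    auto v = table.find(verb);
    if (v == table.end()) return "What?";
    auto a = v->second.find(first);
    if (a == v->second.end()) return "You can't do that.";
    auto b = a->second.find(second);
    if (b == a->second.end()) return "You can't do that.";
    return b->second();
}
```

A one-word or two-word table would have the same shape with fewer levels of nesting.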

Note that this is a very primitive method, and you have to be careful not to have multiple objects with the same name. Several words should also be able to refer to the same verb or object, which is likewise easily done with a map:

verbs["hit"] = "use";
verbs["open"] = "use";

That drastically reduces the number of combinations you have to define, and it also allows generic inputs like "use chest". You can also go with enums for all your verbs and/or objects, changing the maps to:

verbs["hit"] = VERB_USE;
verbs["open"] = VERB_USE;
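
A small self-contained version of the enum-based synonym map, assuming C++. The extra `VERB_LOOK` and `VERB_UNKNOWN` values and the `canonicalVerb` helper are additions for illustration only.

```cpp
#include <map>
#include <string>

enum Verb { VERB_UNKNOWN, VERB_USE, VERB_LOOK };

// Hypothetical synonym table: many typed words map to a few canonical verbs.
const std::map<std::string, Verb> kVerbSynonyms = {
    {"use", VERB_USE},   {"hit", VERB_USE},      {"open", VERB_USE},
    {"look", VERB_LOOK}, {"examine", VERB_LOOK},
};

// Map a typed word to its canonical verb, or VERB_UNKNOWN if it is not a verb.
Verb canonicalVerb(const std::string& word) {
    auto it = kVerbSynonyms.find(word);
    return it == kVerbSynonyms.end() ? VERB_UNKNOWN : it->second;
}
```

The rest of the game then only switches on `Verb` values, so adding a synonym means adding one table entry.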

Error messages might not always be useful, especially when things are ambiguous.

"Hammer nail into wall". If hammer isn't a verb, the parsing will fail completely ("What?"). If it is, you might get "Hammer with what?", requiring absurd input like "hammer nail into wall with hammer", which can fail if wall is also recognized ("I can't use wall").

So the first decision: do you want to spend a good bit of time on a clever parser or just make it work? Edited by Trienco

I am looking for a basic idea for a command-line text parser. I am really just looking for a basic idea for now, and I will build upon it as necessary. You have all helped me to that extent, and for that I thank you.
