Natural Language Parsing


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

22 replies to this topic

#1 GeniX   Members   -  Reputation: 114

Posted 10 April 2000 - 11:07 PM

Hi, I had an idea for creating characters in games with whom the player can converse, and I am interested in natural language (NL) parsing for this. If the computer could parse an entered sentence and break it up into parts, it could look up the 'meanings' in a small dictionary in the game engine.

Example: "Take the axe, and chop off the goblin's head."

The parser would (in theory) locate the following (this is a very simplistic example):

Sentence 1
  Do-er: Self
  Action: take
  Obj: axe

Sentence 2
  Do-er: Self
  Action: chop
  Obj1: axe
  Obj2: head
  Obj2 Adj: goblin's

So you can see what I'm looking at. Most of you who have played games before will think of some of the later text adventures, where the parsers got pretty damn complex! I just need some pointers to information on how to split up the English language and "parse" it. (Developing a mini scripting language similar to Prolog may be the key to allowing easy but configurable language rules.)

regards, GeniX

#2 edA-qa   Members   -  Reputation: 122

Posted 11 April 2000 - 12:29 PM

I would suggest an introductory book on linguistics, in the area of syntax (maybe with semantics). There should be references that basically break English into a parse tree, much like this:

Phrase = Subject, Verb, Object, (Conjunction, Phrase)*

You will likely want to take a reasonable subset of the complete list, since there are several deviations from the standard and English has some anomalies.

Be aware that if you wish to port the engine to different languages these rules will change.
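A minimal sketch of how a rule like the one above could be turned into a hand-written top-down parser. The word lists and the function names (`parse_phrases`, `parse_phrase`, `expect`) are assumptions made up for illustration, and the input is assumed to be already split into lowercase words.

```python
# Toy word lists; every entry here is an illustrative assumption.
SUBJECTS = {"i", "you", "goblin"}
VERBS = {"take", "chop", "want"}
OBJECTS = {"axe", "head", "sword"}
CONJUNCTIONS = {"and", "then"}


def expect(tokens, pos, allowed, role):
    """Consume one token that must belong to the given word list."""
    if pos >= len(tokens) or tokens[pos] not in allowed:
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise ValueError(f"expected {role}, found {found!r}")
    return tokens[pos], pos + 1


def parse_phrase(tokens, pos):
    """Phrase core: Subject, Verb, Object."""
    subject, pos = expect(tokens, pos, SUBJECTS, "subject")
    verb, pos = expect(tokens, pos, VERBS, "verb")
    obj, pos = expect(tokens, pos, OBJECTS, "object")
    return {"subject": subject, "verb": verb, "object": obj}, pos


def parse_phrases(tokens):
    """Phrase = Subject, Verb, Object, (Conjunction, Phrase)*"""
    phrases = []
    pos = 0
    while True:
        phrase, pos = parse_phrase(tokens, pos)
        phrases.append(phrase)
        if pos < len(tokens) and tokens[pos] in CONJUNCTIONS:
            pos += 1  # consume the conjunction, then parse another phrase
            continue
        break
    if pos != len(tokens):
        raise ValueError(f"unexpected word: {tokens[pos]!r}")
    return phrases
```

For example, `parse_phrases("i take axe and i chop head".split())` yields two phrase records, one per clause. Porting to another language would mean changing the order of the `expect` calls as well as the word lists.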


#3 Jim_Ross   Members   -  Reputation: 122

Posted 11 April 2000 - 01:06 PM

Hmm... the subject of imperative sentences is implicitly "you", so you'd have to treat those sentences as if the person were talking to the computer. Same overall effect, I just have a grammar thing.

In imperative sentences the verb always comes first ("Look out!", "Give me that."), unless it's a negative sentence, in which case look out for "Do not" and "Don't". I doubt people would type things like "Don't walk on the lava" in a video game, though. For merely controlling the Player Character (PC) this sort of expectation about sentences is sufficient, but actually conversing with NPCs may be quite an impractical feature for a video game. If you're willing to tackle it, you should look at the source of other language parsers to see how they split up the language.
http://ciips.ee.uwa.edu.au/~hutch/hal/
MegaHAL might have its source available.
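A rough sketch of the imperative assumption described above: the subject is taken to be "you", the first word is taken to be the verb, and a leading "do not" / "don't" marks a negative order. The function name `parse_imperative` and the shape of the result are hypothetical, not taken from any existing parser.

```python
# Leading word sequences that mark a negative order (assumed list).
NEGATIONS = (("do", "not"), ("don't",), ("dont",))


def parse_imperative(text):
    """Treat the input as an imperative: implicit subject 'you', verb first."""
    tokens = text.lower().strip(" .!?").split()
    negated = False
    for neg in NEGATIONS:
        if tuple(tokens[:len(neg)]) == neg:
            negated = True
            tokens = tokens[len(neg):]  # drop the negation words
            break
    if not tokens:
        return None  # nothing left to act on
    return {
        "subject": "you",
        "negated": negated,
        "verb": tokens[0],
        "rest": tokens[1:],
    }
```

With this, "Look out!" gives the verb `look`, and "Don't walk on the lava" gives the verb `walk` with `negated` set.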

#4 GeniX   Members   -  Reputation: 114

Posted 11 April 2000 - 08:44 PM

I was looking more to break up sentences typed to a "simulated computer", which would then respond.

It would make RPGs and such a lot more interesting.

Jason Hutchens' page with MegaHAL didn't impress me that much. His target is a more general conversation bot, oriented more toward generating sentences. I'm looking more to give an illusion of understanding in a given situation.

ie: Inn Keeper vs Player

Player: Hi there.
Inn: Good Morning sir!
Player: May I get a room?
Inn: We have 4 rooms free at 8 each.
Player: Give me a room.
Inn: Here you go, sir. Room number 5.

--

Breaking up the player's sentence would help the "NPC" get a better idea of which verb references which object and so on. Then the Inn Keeper could appear to "understand" the player. Of course, situations where the player asks something totally unrelated to the Inn Keeper would generate some kind of "I don't know what you're on about" response.
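One very small way to picture this, assuming the parser has already reduced the player's line to a verb and an optional object: the NPC keeps a table of the (verb, object) pairs it understands and falls back to a stock reply for everything else. The intent names (`greet`, `ask`, `give`) and the function `innkeeper_reply` are invented for this sketch; the replies are the ones from the dialogue above.

```python
# (verb, object) pairs this NPC understands, mapped to canned replies.
RESPONSES = {
    ("greet", None): "Good Morning sir!",
    ("ask", "room"): "We have 4 rooms free at 8 each.",
    ("give", "room"): "Here you go, sir. Room number 5.",
}
FALLBACK = "I don't know what you're on about."


def innkeeper_reply(verb, obj=None):
    """Look up the parsed verb/object pair; anything unknown gets the fallback."""
    return RESPONSES.get((verb, obj), FALLBACK)
```

So `innkeeper_reply("give", "room")` hands over a room, while `innkeeper_reply("give", "dragon")` produces the fallback line.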

If anyone knows particularly of web sites which may have info on such topics, please post here.



regards,

GeniX

#5 Kylotan   Moderators   -  Reputation: 3329

Posted 11 April 2000 - 10:58 PM

You need to consider generating a 'dictionary' for your game, of all the verbs, nouns, adverbs, adjectives, etc. This allows your parser to make a much better guess at which parts of the sentence are what. You can then strip out the fluff (generally articles like 'the' or 'a'), validate the sentences (certain verbs need no object, some need one, some also need an indirect object), and map the result onto some sort of look-up table that ties into your game logic, usually some sort of NPC knowledge base. Asking questions requires a different sentence structure (and therefore a slightly different parsing order) than statements, so checking for a question as the very first step could simplify things.
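The steps in that paragraph could be sketched roughly as follows. Everything named here (the word lists, the `VERB_ARITY` table, `parse_command`) is an assumption for illustration, and the question check is deliberately crude: a trailing question mark or a typical question-opening word.

```python
FLUFF = {"the", "a", "an"}
# verb -> number of objects it requires (2 means direct plus indirect object)
VERB_ARITY = {"look": 0, "take": 1, "give": 2}
NOUNS = {"axe", "sword", "room", "me", "innkeeper"}
QUESTION_STARTERS = {"can", "may", "do", "what", "where", "who"}


def parse_command(text):
    """Strip fluff, detect questions first, then validate verb and objects."""
    words = [w for w in text.lower().strip(" .!?").split() if w not in FLUFF]
    if not words:
        return {"kind": "empty"}
    if text.strip().endswith("?") or words[0] in QUESTION_STARTERS:
        return {"kind": "question", "words": words}
    verb, objects = words[0], words[1:]
    if verb not in VERB_ARITY:
        return {"kind": "error", "reason": f"unknown verb {verb!r}"}
    unknown = [w for w in objects if w not in NOUNS]
    if unknown:
        return {"kind": "error", "reason": f"unknown word {unknown[0]!r}"}
    if len(objects) != VERB_ARITY[verb]:
        return {"kind": "error",
                "reason": f"{verb!r} needs {VERB_ARITY[verb]} object(s)"}
    return {"kind": "order", "verb": verb, "objects": objects}
```

The resulting `verb` and `objects` are what would then be looked up in the game's own logic or NPC knowledge base.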

Also - adjectives and nouns don't necessarily have to be treated separately, as they are both just ways of distinguishing that item from others. The noun specifies type, the adjective specifies appearance etc, but as long as it distinguishes that item, you have as much info as you need.
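As a sketch of that idea, each game object can carry a single set of descriptor words with nouns and adjectives mixed together, and a phrase matches an object when all of its words appear in that set. The object ids, descriptor sets and the function `find_objects` are hypothetical.

```python
# One descriptor set per object; nouns and adjectives are not distinguished.
WORLD = {
    "axe_1": {"axe", "rusty", "iron"},
    "axe_2": {"axe", "golden"},
    "apple_1": {"apple", "red"},
}


def find_objects(words, world=WORLD):
    """Return the ids of every object whose descriptors include all the words."""
    wanted = set(words)
    return sorted(obj_id for obj_id, tags in world.items() if wanted <= tags)
```

`find_objects(["rusty", "axe"])` singles out one axe, while `find_objects(["axe"])` returns both, which is the point at which a game would need to ask which one was meant or pick a default.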

Look up recursive descent parsing somewhere - it's more commonly used for interpreting scripting languages, but you should be able to use some of the ideas for parsing English or any other language.

#6 MENTAL   Members   -  Reputation: 382

Posted 12 April 2000 - 02:45 AM

There's one thing that no-one has thought of yet - spelling. It wouldn't look too realistic if the player said "Giv the axe to me" and the keeper said "I have no idea what you're talking about". Once you have a complete dictionary, I suggest that you modify it slightly to include slightly misspelt words, but giv (sorry) them the same meaning as the correct ones.

Don't bother about it now; include it when you have the rest of the game working.
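A minimal sketch of such a dictionary, assuming the misspellings are written in by hand and simply point at the correct word. The entries and the names `normalise` and `normalise_sentence` are examples only.

```python
CANONICAL = {"give", "axe", "sword", "take"}
# Hand-written misspellings mapped to the word they should mean.
ALIASES = {"giv": "give", "sord": "sword", "swrod": "sword", "ax": "axe"}


def normalise(word):
    """Return the canonical word, or None if the word is not known at all."""
    word = word.lower()
    if word in CANONICAL:
        return word
    return ALIASES.get(word)


def normalise_sentence(text):
    """Replace known misspellings; leave every other word untouched."""
    return [normalise(w) or w for w in text.lower().split()]
```

`normalise_sentence("Giv the axe to me")` comes back with `give` in first position, so the rest of the parser never sees the typo.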

#7 MikeD   Members   -  Reputation: 158

Posted 12 April 2000 - 04:23 AM

Regarding mis-spelt words, you could use some heuristic to compare an unknown word with all the words in the dictionary and try to guess the meaning. You could cut down the search by analysing the type of word you're expecting (verb, noun etc) and searching those word lists first.
So if "Give me a sord" matches a verb phrase best, then the computer knows that "sord" is most likely a noun, searches the list of nouns comparing word length, letter ordering and word fragments, and hopefully comes up with the word "sword" as the most likely answer.
You can then use the fact that the word was mis-spelt to ask the player a question: "Did you say you wanted a sword?".
None of the above is difficult; it's just a matter of getting the mix right.
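One possible heuristic, sketched with Python's standard `difflib` module (a substitute for whatever comparison of length, ordering and fragments one would write by hand). The word lists, the `cutoff` value and the function `guess_word` are assumptions; the expected word type is searched first and the other lists only afterwards.

```python
import difflib

# Known words grouped by type (toy lists).
WORDS_BY_TYPE = {
    "verb": ["give", "take", "chop", "sell"],
    "noun": ["sword", "axe", "shield", "room"],
}


def guess_word(unknown, expected_type, cutoff=0.7):
    """Guess the closest known word, trying the expected word type first."""
    order = [expected_type] + [t for t in WORDS_BY_TYPE if t != expected_type]
    for word_type in order:
        matches = difflib.get_close_matches(
            unknown, WORDS_BY_TYPE[word_type], n=1, cutoff=cutoff
        )
        if matches:
            return matches[0], word_type
    return None  # nothing similar enough in any list
```

`guess_word("sord", "noun")` finds `sword` in the noun list without touching the verbs; a return value of `None` is the cue for an "I don't understand" reply.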

Also, it would be useful to have many different examples of each phrase type, from
"I would like a sword"
to
"Ug want sword"
As long as these didn't conflict with or confuse other phrases.

Mike

#8 MartinJ   Members   -  Reputation: 122

Posted 12 April 2000 - 12:30 PM

Another thing to remember: most people like to use pronouns in communication. They are easier to type than always using the proper, longer noun. Also, your NPC needs to remember the things that were said to it, at least for a short time. I had an old text adventure game that used natural language very well. It understood context when it came to individual words. Some words can be either a verb or a noun. An example is the word "check": "The check is in the mail" and "I will check the mail". The game understood the difference between these two sentences. Any dictionary you make would have to include how each word can be used, as well as the definition for each use.
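A toy sketch of such a dictionary and of using context to choose between uses. It assumes each ambiguous word is either a noun or a verb and looks only at the previous word, which is far cruder than a real tagger; the lexicon entries and the function `tag` are made up for the example.

```python
# word -> the parts of speech it can take (toy entries)
LEXICON = {
    "check": {"noun", "verb"},
    "mail": {"noun", "verb"},
    "the": {"article"},
    "i": {"pronoun"},
    "will": {"modal"},
    "is": {"verb"},
    "in": {"preposition"},
}


def tag(words):
    """Pick one part of speech per word, using the previous tag as context."""
    tags = []
    for word in words:
        options = LEXICON.get(word, {"unknown"})
        previous = tags[-1] if tags else None
        if len(options) == 1:
            choice = next(iter(options))
        elif previous in ("modal", "pronoun"):
            choice = "verb"  # e.g. "will check"
        else:
            choice = "noun"  # after an article or preposition, or as a fallback
        tags.append(choice)
    return tags
```

In "the check is in the mail" the word `check` follows an article and is tagged as a noun; in "i will check the mail" it follows a modal and is tagged as a verb.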

Making friends one burger at a time.

#9 GeniX   Members   -  Reputation: 114

Posted 12 April 2000 - 04:09 PM

Thanks.

Despite the lack of references, the postings have been useful.

I had not considered spelling mistakes :-)

A dictionary of words, associated either with actions or with objects in the game world, would of course be a necessity.

Maybe even have the language 'rules' not too strict. Thus if a sentence is entered with slightly incorrect grammar, the parser would try to find the closest matching rule or such.

A recent 'history' is also a must. It would be nice if the NPC could not only match a pronoun to the last object spoken about (with gender), within a short period of time of course, but also use pronouns in its own responses.
It may seem more realistic.

Still, does anyone have any texts/URLs which may help me out with this kind of stuff?




regards,

GeniX

#10 aDasTRa   Members   -  Reputation: 122

Posted 12 April 2000 - 05:48 PM

André LaMothe has written some stuff on this topic. The articles can be found with his latest book, Tricks of the Windows Game Programming Gurus, and with the older Teach Yourself Game Programming in 21 Days. I have used his solutions in a couple of very different contexts and they work well. I don't know if the articles are available separately from the books, however. If you are stuck finding them, I think I have one in HTML that I could email to you. Just mail me @ rjbianco@home.com if you want it...

<(o)>

#11 Kylotan   Moderators   -  Reputation: 3329

Posted 12 April 2000 - 09:20 PM

quote:
Original post by MikeD

Regarding mis-spelt words, you could use some heuristic to compare an unknown word with all the words in the dictionary and try to guess the meaning. You could cut down the search by analysing the type of word you're expecting (verb, noun etc) and search those word lists first.
So if "Give me a sord" matches a verb phrase best, then the computer knows that "sord" is most likely a noun, searches the list of nouns comparing word length, letter ordering and word fragments, and hopefully comes up with the word "sword" as the most likely answer.

I think this would start to encroach upon CPU usage somewhat, especially if you are dealing with statements, orders, and questions: each has a different word order, which makes it harder to guess what to expect. I think it might not be worth the effort to support misspellings.
quote:

You can then use the fact that the word was mis-spelt to ask the player a question. "Did you say you wanted a sword?".

This starts to get more complex. The NPCs now have to be able to construct meaningful questions and phrase them properly. Understanding text is one thing, but putting it together is another. You would probably have to do one response for pretty much every verb that NPC could understand. And for what? Why waste your players' time asking them if they wanted a sword, if your search function had shown it to be pretty obvious in the first place - so much so, that the NPC suggests it?

Not to mention that the player's instinctive response will be "yes", meaning that you now have to incorporate some sort of 'history', or a conversation context, and be able to combine that with the current sentence to obtain the meaning... this is not trivial even for advanced NLP applications and is not really practical for a game. It is almost definitely overkill too.

quote:
Also, it would be useful to have many different examples of each phrase type, from
"I would like a sword"
to
"Ug want sword"

Every extra phrase type supported makes it more difficult to accurately guess which type of word comes next, as you don't know what phrase type it is before analysing it. Therefore this reduces the chances of an accurate interpretation rather than increasing it.

Edited by - Kylotan on 4/13/00 3:22:55 AM

#12 Kylotan   Moderators   -  Reputation: 3329

Posted 12 April 2000 - 09:28 PM

quote:
Original post by GeniX

Maybe even have the language 'rules' not too strict. Thus if a sentence is entered with slightly incorrect grammar, the parser would try to find the closest matching rule or such.

Problem - although you may think this makes it easier to code or use, you would most likely be mistaken - English is a very redundant language and there are usually several ways of saying something. So just changing 1 word or rearranging 2 words in the grammar can mean totally different things - so you are more likely to misinterpret it. Example "you can get the sword" and "can you get the sword". The first is either a statement or a polite order, the second is more of a question. If you are not strict with what you accept, you will find it much harder to make sense of it.

quote:
A recent 'history' is also a must. It would be nice if the NPC could not only match a pronoun to the last object spoken about (with gender), within a short period of time of course, but also use pronouns in its own responses.
It may seem more realistic.

Store a little lookup table of LastItem (for 'it'), LastPerson (for 'he/she') etc. Then generate the sentence as normal, and go through your table and post-process the output, substituting in the relevant pronoun for whatever noun was there if possible.
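A small sketch of that lookup table. The class name `Context` and its methods are hypothetical, the substitution only handles nouns that appear as "the <noun>", and it ignores the difference between subject and object pronouns ("he" versus "him"), so treat it as a starting point rather than a finished design.

```python
import re


class Context:
    """Remembers the most recently mentioned item and person."""

    def __init__(self):
        self.last = {}  # pronoun -> noun, e.g. {"it": "sword"}

    def mention(self, noun, kind, gender=None):
        """Record a noun as the latest item or person talked about."""
        if kind == "item":
            self.last["it"] = noun
        elif kind == "person":
            self.last["she" if gender == "f" else "he"] = noun

    def resolve(self, word):
        """Map a pronoun typed by the player back to a noun, if one is known."""
        return self.last.get(word.lower(), word)

    def with_pronouns(self, sentence):
        """Post-process NPC output, replacing remembered nouns with pronouns."""
        for pronoun, noun in self.last.items():
            sentence = re.sub(rf"\bthe {re.escape(noun)}\b", pronoun, sentence)
        return sentence
```

After `ctx.mention("sword", "item")`, the player's "it" resolves to `sword`, and an NPC line such as "I will polish the sword for you." comes out as "I will polish it for you." Expiring entries after a few turns would cover the "short period of time" requirement.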

#13 GeniX   Members   -  Reputation: 114

Posted 12 April 2000 - 11:04 PM

Hmm.. Thanks ppl - food for thought.

Just a comment though on an earlier reply (too lazy to look up who said it) that all of this may be overkill.

However, if it were possible to develop a fairly robust text parser which allowed for slight grammatical errors and spelling mistakes, then it could be "imported" into almost any RPG style game in development.

Although, admittedly, we seem to see very few type-your-own-sentence RPGs nowadays. Most movement is controlled by clicks, and actions too.
Conversations are pre-written dialogs in which the player chooses from a list what they wish to say.

Nonetheless, reconstruction of sentences for NPCs to say could be a very interesting area to investigate. It would make for much more realistic NPCs in any game.



regards,

GeniX

#14 MikeD   Members   -  Reputation: 158

Posted 13 April 2000 - 01:13 AM

quote:
--------------------------------------------------------------------------------
Also, it would be useful to have many different examples of each phrase type, from
"I would like a sword"
to
"Ug want sword"
--------------------------------------------------------------------------------


Every extra phrase type supported makes it more difficult to accurately guess which type of word comes next, as you don't know what phrase type it is before analysing it. Therefore this reduces the chances of an accurate interpretation rather than increasing it.

-------------------------------------------------------------------

"I want a sword"
"I would like a sword"
"I want to buy a sword"
"I would like to buy a sword"
"Ug want sword"

If you want reasonable syntactic parsing you're going to have to deal with the first four anyway. The fifth one is no different, just a different construction of verb and noun phrase to compose a sentence. It's not a whole different grammar, just an extension of the current system.

I agree with your other points however.
The overhead for calculating mis-spellings would be overkill but, perhaps, fun to try and certainly not impossible or even too CPU intensive. It would just take time to write.
Your point about language generation, that "The NPCs now have to be able to construct meaningful questions and phrase them properly. " is fair as well. However you could use templates for simple question asking and there would only be certain places where you would ask a generated question. Time consuming again but not impossible or particularly CPU expensive.
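A sketch of what template-based question asking might look like. The template names, the very naive article rule in `article_for`, and the function `ask` are all assumptions for illustration.

```python
# Fixed question shapes with slots for the words the parser recovered.
TEMPLATES = {
    "confirm_item": "Did you say you wanted {article} {item}?",
    "which_one": "Which {item} do you mean?",
}


def article_for(noun):
    """Naive rule: 'an' before a vowel letter, otherwise 'a'."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def ask(template_name, item):
    """Fill one of the fixed templates with the item in question."""
    template = TEMPLATES[template_name]
    return template.format(article=article_for(item), item=item)
```

`ask("confirm_item", "sword")` produces "Did you say you wanted a sword?" without any general sentence generation; each new kind of question costs one more template.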

Mike


#15 Kylotan   Moderators   -  Reputation: 3329

Posted 13 April 2000 - 03:30 AM

quote:
Original post by MikeD

"I want a sword"
"I would like a sword"
"I want to buy a sword"
"I would like to buy a sword"
"Ug want sword"

If you want reasonable syntactic parsing you're going to have to deal with the first four anyway. The fifth one is no different, just a different construction of verb and noun phrase to compose a sentence. It's not a whole different grammar, just an extension of the current system.


Still, the system you describe above is already based on assumptions rather than literal interpretations.
"I want a sword" can be more literally translated as "I desire a sword". It is not the same as "give me a sword" or "sell me a sword". All your above phrases are statements, and parsing statements is fraught with assumptions. That is one reason why many games stick with just orders, starting with the imperative verb, as then you nearly always know exactly what is needed.

As for spelling checking... an exact check for string equality is quick and will almost always return after the first character of each string is compared, whereas checking strings for almost-equality will most likely have to go through the entirety of every string. This will introduce a substantial relative performance hit - although whether this increases your CPU usage from 10% to 40% or from 0.00001% to 0.00004% will depend on just how good your computer and other algorithms are.
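To make the cost difference concrete, here is a sketch that tries the cheap exact test first and only falls back to a full approximate scan on a miss. Levenshtein edit distance is used as the "almost-equality" measure, which is an assumption (the thread does not name one), and the names `edit_distance` and `lookup` are hypothetical. A length check skips candidates that cannot possibly be within the allowed distance.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lookup(word, dictionary, max_distance=1):
    """Exact match first; scan for near matches only when that fails."""
    if word in dictionary:
        return word
    best = None
    for candidate in dictionary:
        if abs(len(candidate) - len(word)) > max_distance:
            continue  # the length gap alone already exceeds the budget
        d = edit_distance(word, candidate)
        if d <= max_distance and (best is None or d < best[0]):
            best = (d, candidate)
    return best[1] if best else None
```

Correctly spelt input never reaches the slow path, so the extra work is paid only for words that would otherwise be rejected.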

Not to mention that "ug want sword" could be "I want sword" or "you want sword" (assuming the question mark was omitted) which adds another decision that the parser has to make.

#16 MikeD   Members   -  Reputation: 158

Like
Likes
Like

Posted 13 April 2000 - 03:53 AM

The fact is, and I think we both agree here Kylotan, that it is possible to do all the things we talked about. Whether it is feasible depends on the implementation and programming skills of the individual carrying out the work.
My third year project was in Natural Language Generation and I know a fair bit about AI (pro-programmer), so I'm sure, given enough time, I could pull it off. For a beginner though, it's right to take one step at a time. Forget about spelling, forget about generation, take parsing, make it simple and see what you can do.
If you can't find the info you need here or at www.gameai.com then I suggest going to Amazon and buying the best sounding book there. In the end that's all you need.

Mike

#17 felonius   Members   -  Reputation: 122

Like
Likes
Like

Posted 13 April 2000 - 06:07 AM

As part of my Computer Science education (still in progress) I took Linguistics as a side subject to B.A. level.

The primary thing I learned after many attempts at making natural language parsers (NLP) with semantics is that it is very hard to do properly.

Every time I tested one, the users tried to use sentence constructions I had not thought of, or tried to use words that were not supported. There are so many things to take care of, such as when the user refers to "the apple" and two apples are present. Which one is being referred to?
I have come to the conclusion that in computers and in games we should stay away from NLP and use some substitute. It is not without reason that no adventure game is text based any more.
Why not use some alternate way of representing meaning, such as visually building sentences as in the old Maniac Mansion or Zak McKracken games: click the verb, click the object, and so on.
Another alternative is to let the user select between different predefined sentences with empty parts that can be filled with parameter values (also selected from a menu).
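The last alternative could be sketched like this: each predefined sentence has a pattern and a menu of options per empty part, and the game receives both the displayed text and the already-structured action, so no parsing is needed. The sentence ids, the menu contents and the function `build` are invented for the example.

```python
# Predefined sentences with empty parts and the menu options for each part.
SENTENCES = {
    "buy": {
        "pattern": "I would like to buy {item}.",
        "slots": {"item": ["a sword", "an axe", "a room"]},
    },
    "give": {
        "pattern": "Give {item} to {person}.",
        "slots": {
            "item": ["the apple", "the axe"],
            "person": ["the innkeeper", "the goblin"],
        },
    },
}


def build(sentence_id, choices):
    """Fill a predefined sentence; choices maps slot name -> menu index."""
    spec = SENTENCES[sentence_id]
    filled = {slot: options[choices[slot]]
              for slot, options in spec["slots"].items()}
    return {
        "text": spec["pattern"].format(**filled),
        "action": sentence_id,
        "args": filled,
    }
```

Because every option comes from a menu, the "which apple?" problem disappears: the menu can list each apple as its own entry.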

Good luck in your attempt.

Motto: Better have low ambition and do it well, than high ambition and a flawed result.


B.Sc. Jacob Marner
Graduate Student of Computer Science, The University of Copenhagen, Denmark.
http://fp.image.dk/fpelisjac/rolemaker/




#18 crazy166   Members   -  Reputation: 122

Posted 13 April 2000 - 06:42 AM

i predict that this will be an even bigger thing once voice recognition and text-to-speech become mainstream in games.

it would definitely be worthwhile to come up with something like this that could be plugged into any game (just add voice and text2speech later), and i have a feeling there will be many third-party "plugins" available to do this type of thing, once it's been around for a while.

in fact, microsoft will probably implement a similar system to parse commands sent to Windows 2075.
"Please open the CD-ROM folder without crashing."



crazy166
some people think i'm crazy, some people know it

#19 kill   Members   -  Reputation: 146

Posted 13 April 2000 - 06:57 AM

Go to http://www.botspot.com
Look up chatter-bots. There will be one called Alice. It's open source, so you can download the code, it uses XML to store data, and let me tell ya, it's capable of doing some amazing stuff.

#20 Kylotan   Moderators   -  Reputation: 3329

Posted 13 April 2000 - 11:28 PM

quote:
Original post by MikeD

The fact is, and I think we both agree here Kylotan, that it is possible to do all the things we talked about.


Given enough programmer time, CPU time, and resources, you can achieve anything. I guess what I am referring to is that you get diminishing returns - for every slightly more obscure use of the language you attempt to accommodate, you have to do a lot of extra processing, not only to accept the new form but to distinguish between the new form and the old, more regular forms. As Felonius said, it's very hard to do well. For most game projects, it is not unreasonable to expect some 'client-side parsing' (ie. spell it properly, damn user!) to save 75% of development time and 50% of computer resources.



