How would I program grammar rules?

Started by
14 comments, last by Tutorial Doctor 10 years ago

So, I am working on my Senses System (see my journal) and I temporarily typed the following code (Lua):

function TriggerEmotionChange()
    if See(box) then
        mood = 10
        setText(state, getName(box) .. "es " .. "make me glad")
    elseif See(monkey) then
        mood = 3
        setText(state, getName(monkey) .. "s " .. "make me sad")
    elseif See(building) then
        mood = 5
        setText(state, getName(building) .. "s " .. "make me turn around in circles")
    end
end

This made me want to create a grammar system. I know I need to do some operations on strings, namely checking whether a string ends with a certain letter and then adding the appropriate ending. I figure I would write a few if/elseif branches for the special letters and then one final else to cover all the other letters of the alphabet.
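
A minimal sketch of that ending check, assuming only regular English plurals matter. The pluralize() helper is a made-up name for illustration, not part of the original code, and it will get irregular words wrong.

```lua
-- Hypothetical helper: guess a regular English plural from the word's ending.
local function pluralize(word)
    -- Sibilant endings (s, x, z, ch, sh) take "es": box -> boxes
    if word:match("[sxz]$") or word:match("[cs]h$") then
        return word .. "es"
    end
    -- Consonant + y becomes "ies": ability -> abilities (monkey stays monkeys)
    if word:match("[^aeiou]y$") then
        return word:sub(1, -2) .. "ies"
    end
    -- Default: just append "s"
    return word .. "s"
end

print(pluralize("box"))      -- boxes
print(pluralize("monkey"))   -- monkeys
print(pluralize("building")) -- buildings
```

Pattern matching on the end of the string keeps the rule list short, since one default branch covers every letter that has no special rule.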

Just in case someone wanted to know what the code does:

In the example above, the getName() function gets the name of the game object and returns a string.

The "state" variable is just a text game object.

The value of the mood value determines the emotional state of the character.

Anyone have any tips on how I might go about the grammar system (shouldn't take long to implement it) for "ing" and "es or s" endings?

Just a little something to get me started on how I should approach it in an efficient way?

They call me the Tutorial Doctor.


Are you interested in creating your grammar rules in Lua, or can it be in C++? If the latter, then why not just use an existing parser generator like Boost.Spirit? Then you can have true support for your own domain-specific language without doing the low-level string manipulations.

As far as I can tell, this isn't really a parsing problem (I can't find the Senses System you mentioned, either). From what I can see in this post, you're trying to pick the correct form of a word. What you need is a lexicon with words and what their other form(s) are. If you need to do more than just take a word with the form it needs to be converted to, then you need something else entirely which composes sentences from a grammar, not a parser (which uses the same grammar, but in reverse).

Taking the word endings will work for the regular cases, but English has a LOT of exceptions with commonly used words:

One fish -> Many fish (fishes sometimes acceptable based on context)
One goose -> Many geese

Verbs can be even worse:

I see it
-> I saw it
-> I have seen it


If you want a really comprehensive generative grammar system, you should look at http://en.wikipedia.org/wiki/Head-driven_phrase_structure_grammar to see if it fits your needs.

If you need a DFA generator, you can try Wombat. It lets you 'paint' the automata using a C++-based embedded domain-specific language.

What you need is a lexicon with words and what their other form(s) are.

^ That is how I would do it.

For any string that you want to do this to you could store its singular and plural forms (e.g. monkey stores its name in singular and plural form). Then you can just call getNamePluralised(monkey) to return the plural form.

Some examples:

(box, boxes)

(monkey, monkeys)

(building, buildings)

(sheep, sheep)

(ability, abilities)

(bacterium, bacteria)

(cactus, cacti)

Clearly no amount of "ing", "es", "s" manipulation is going to take you from cactus to cacti.
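
A rough sketch of such a lexicon in Lua, assuming names are plain strings. The plurals table is hypothetical, and this getNamePluralised() takes the name string rather than a game object, so it is only a stand-in for the function mentioned above.

```lua
-- Hypothetical lexicon: singular name -> plural form.
-- Only words that do not simply take "s" need an entry.
local plurals = {
    box       = "boxes",
    sheep     = "sheep",
    ability   = "abilities",
    bacterium = "bacteria",
    cactus    = "cacti",
}

-- Stand-in for getNamePluralised(); takes a name string (an assumption).
local function getNamePluralised(name)
    -- Fall back to appending "s" when the lexicon has no entry.
    return plurals[name] or (name .. "s")
end

print(getNamePluralised("cactus"))  -- cacti
print(getNamePluralised("monkey"))  -- monkeys
```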

This kind of text processing is IMHO best done by text templates with a well-defined mechanism for text substitutions. There is an article on Gamasutra about localizing MMORPGs that explains such an approach in some detail. Although it does not exactly fit the OP's problem at first, such processing is better suited than string concatenation as shown in the OP.

In the end the plural problem is solved fine in a way similar to what dmatter suggests. Together with templates it may look like (boxes,box), perhaps with a default rule of simply appending "s" if no tagged form is found.
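
As a very small illustration of template substitution instead of concatenation, here is a sketch in Lua. The fill() helper and the $KEY$ placeholder syntax are assumptions made for this example, not necessarily what the article uses.

```lua
-- Hypothetical template filler: replaces $KEY$ placeholders with values[KEY].
local function fill(template, values)
    -- With a table as the replacement, gsub looks up each captured key;
    -- placeholders that have no entry are left unchanged.
    local result = template:gsub("%$(%w+)%$", values)
    return result
end

print(fill("$NAME$ make me glad", { NAME = "boxes" }))  -- boxes make me glad
```

The sentence now lives in data rather than in code, so it can be changed or translated without touching the function that displays it.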

Hmm, good responses. It looks like I could use a table that stores all endings and somehow go from there. I'll try to figure this out ASAP so I can post the results on this same topic, in case someone else later comes along and wants to do the same thing.

I could have the See() function take two arguments: the first would be the object being seen, and the second would be the ending:


See(box,es)

Look(boy,s)

Start(walk,ing)

Indeed going from cactus to cacti would not work in this case. This could be revised somehow. I could do the base of the word like:


See(cact,us)
See(cact,i)

Then I could check the ending and run either a single collision test or test multiple collisions (see multiple things).


function Smell(object, ending)
    if isCollisionBetween(Player, object) then
        -- ending is nil for a singular call, so fall back to an empty string
        setText(txt_smelling, "I smell " .. getName(object) .. (ending or ""))
    end
end
    

In this case if I did:


Smell(box)

If I wanted it singular, I would have to have an "a" in there for "I smell a box."

Plural would be:


Smell(box,"es")

I can do a conditional statement based on the ending, but I don't know how I would change to the singular unless I do a conditional statement somewhere (don't know how it would look yet).
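
One possible shape for that conditional, as a sketch only. nounPhrase() is a hypothetical helper that works on the name string, and the a/an choice just looks at the first letter, so words like "hour" or "unicorn" would come out wrong.

```lua
-- Hypothetical helper: build the noun phrase for one or many objects.
local function nounPhrase(name, ending)
    if ending then
        -- Plural: append the given ending, e.g. "box" .. "es"
        return name .. ending
    end
    -- Singular: use "an" before a vowel letter, otherwise "a"
    local article = name:match("^[aeiouAEIOU]") and "an" or "a"
    return article .. " " .. name
end

print("I smell " .. nounPhrase("box"))        -- I smell a box
print("I smell " .. nounPhrase("box", "es"))  -- I smell boxes
print("I smell " .. nounPhrase("apple"))      -- I smell an apple
```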

Seems I could set all endings explicitly first as variables:


s = "s"
es = "es"
ing = "ing"
us = "us"
i = "i"
-- gotta watch that one as "i" can be a counter too. But I could use "j" instead. 
 
ending = ing
 
print(ending)
--ing

They call me the Tutorial Doctor.

That might sorta work for cacti, but what about geese and G's? Just store the full word alternatives. Unless you want to teach your grammar to conjugate based on what language the word comes from, but that doesn't seem likely to be worthwhile.

I like the way that Django (Python web framework) handles plurals. When you create a database table, you create it as a Python class, e.g. "class Reply(models.Model)". When displaying the reply table contents on the admin site, it guesses the plural by just adding an -s (so "replys"); however, you can change this by overriding it like this:


from django.db import models

class Reply(models.Model):
	class Meta:
		verbose_name_plural = "Replies"

Android handles this by using quantity strings in resource files; the documentation has a good example of how this works.

I think for your code, what would work best is a system similar to Django's (assume -s unless overridden). Here's an example:


-- Create a class called object
local Object = { };


-- Name is string; pluralName is string or nil
function Object.new(name, pluralName)
	name = tostring(name);
	-- If no plural name was provided, assume the plural just adds -s
	pluralName = pluralName or name.."s";

	local object = { };

	object.name = name;
	object.pluralName = pluralName;

	-- Requires the metatable for the class system to work
	setmetatable(object, {__index = Object});
	return object;
end


-- Methods
function Object:getName()
	return self.name;
end



function Object:getPluralName()
	return self.pluralName;
end



-- Dummy methods for testing
local function isCollisionBetween(objectA, objectB)
	return true;
end



local textSmelling = "SMELLING";
-- Parameter is called label so it does not shadow the built-in type()
local function setText(label, text)
	print(label, text);
end



-- Using it:
local Player = { };

function Player:smell(object)
	if (isCollisionBetween(self, object)) then
		setText(textSmelling, "I smell "..object:getPluralName());
	end
end



local apple = Object.new("apple");
local cactus = Object.new("cactus", "cacti");

Player:smell(apple);
Player:smell(cactus);

I tested it on the Lua online demo and got:


SMELLING	I smell apples
SMELLING	I smell cacti

Falling block colour flood game thing I'm making: http://jsfiddle/dr01d3k4/JHnCV/

I see the simple kind of text synthesis falling short, even if the plural problem is solved.

E.g.: Notice that there is a difference between writing proper names and common nouns, because the article is left out with proper names:

"I see Jack." (not: "I see the Jack.")

"I see the boxes."

With text patterns, this can be expressed, because the name of object Jack is marked as being a proper name (using tag n)

NAME ::= "Jack[n]"

while the name of a box is not (there is no tag n)

NAME ::= "box"

so that the text pattern may look like (using the language from the cited article)

"I see {the[!n]} $NAME$."

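To make the idea concrete, here is a minimal Lua sketch that interprets just this one pattern. parseName() and render() are hypothetical helpers, and the handling of {word[!tag]} is my reading of the notation above, not the article's actual implementation.

```lua
-- Split a name like "Jack[n]" into its text and a set of tags.
local function parseName(raw)
    local tags = {}
    local text = raw:gsub("%[(%w+)%]", function(tag)
        tags[tag] = true
        return ""  -- strip the tag from the visible text
    end)
    return text, tags
end

-- Render a pattern for one name.
-- {word[!tag]} keeps word only when the name does NOT carry that tag.
local function render(pattern, rawName)
    local name, tags = parseName(rawName)
    local out = pattern:gsub("{(%w+)%[!(%w+)%]} ?", function(word, tag)
        if tags[tag] then
            return ""
        end
        return word .. " "
    end)
    -- A function replacement avoids "%" in the name being treated specially.
    out = out:gsub("%$NAME%$", function() return name end)
    return out
end

print(render("I see {the[!n]} $NAME$.", "Jack[n]"))  -- I see Jack.
print(render("I see {the[!n]} $NAME$.", "box"))      -- I see the box.
```
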
Similarly, a name of an object may carry both its singular and its plural form, however it has to be written, like in

NAME ::= "geese|goose"

I have not really understood what "See(Box,es)" is good for. It seems to me to be an action that may be performed on an object (possibly an object group). If so, it seems wrong to me to provide the object's plural ending at this point. If the object is a single one, then its name being

NAME ::= "goose"

or, if it is a group, then its name being

NAME ::= "geese"

is all that is needed. It is further possible to use the combined name as shown above, and have a cardinality attached to the object, where the cardinality value defines the form in the end.
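
A small sketch of how a cardinality could pick the form, assuming the combined name is written as "plural|singular" (as in "geese|goose"). nameFor() is a hypothetical helper.

```lua
-- Assumes a combined name of the form "plural|singular", e.g. "geese|goose".
local function nameFor(combined, count)
    local plural, singular = combined:match("^(.-)|(.*)$")
    if not plural then
        -- No alternatives given; use the name as-is.
        return combined
    end
    if count == 1 then
        return singular
    end
    return plural
end

print("I see " .. nameFor("geese|goose", 1))  -- I see goose
print("I see " .. nameFor("geese|goose", 3))  -- I see geese
```

The singular output still lacks an article; that would be the job of the text pattern, not of the name.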
BTW: The approach in the OP is hard-coded for now. As soon as you go data oriented (and you should do so), things need to be abstracted more anyway.

This topic is closed to new replies.
