Conversational AI


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

48 replies to this topic

#21 phantomus   Members   -  Reputation: 593

Posted 12 March 2004 - 08:52 PM

Here's a sketch of the memory class definition:

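class Link;   // defined below

class CI
{
public:
    // Type of common concepts, stored directly for fast lookup.
    enum
    {
        NOUN = 1,
        PERSON,
        COMPANY,
        CITY,
        COUNTRY
    };
protected:
    char* m_Name;        // label of this piece of information
    int m_Type;          // one of the enum values above
    Link** m_Link;       // array of links to other CIs
    int m_NrLinks;       // number of entries in m_Link
    int m_Credibility;
    int m_Interest;
    Date m_Creation;     // Date: some date/time type, not shown here
    char* m_Source;      // where the information came from
};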


I believe this can store pretty much anything you might want to know about a piece of information at a later time. Some notes: Obviously, for names I could simply link all names to a CI labeled 'name', but that might be a bit impractical, so I think it's more efficient to store the type of common things like persons, companies etc. in the m_Type field for fast lookup. Maybe it's too early for optimizations and special cases, but I couldn't resist.

Source is a string describing the source of information. This can be an IP address for merged data, a user name for locally acquired data, or an HTTP link for info gathered online.

Here's a sketch of the Link class:

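class Link
{
public:
    // Relation types; BIDIRECTIONAL is a flag that can be OR'ed into a type.
    enum
    {
        IS = 1,
        HAS,
        WANTS,
        NEEDS,
        KNOWS,
        IS_RELATED_TO,
        BIDIRECTIONAL = 1024
    };
protected:
    char* m_Description;   // answer to 'why?', often NULL
    CI* m_CI;              // the CI this link points to
    int m_Type;            // relation type, optionally OR'ed with BIDIRECTIONAL
    int m_Strength;
    Date m_Creation;
};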


The link can be made bidirectional by OR'ing in the 'BIDIRECTIONAL' flag. There will probably be more types needed, but there's always the generic relation 'IS_RELATED_TO'. The m_Description field is used for 'why' questions, but it will often be NULL. It can be filled when the user enters something like 'The monkey eats a nut because he is hungry': in that case 'because he is hungry' would be stored in the unidirectional 'IS_RELATED_TO' link between monkey and nut. If the user had just said 'The monkey eats a nut', the same information could have been acquired using a 'why?' question.
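
A minimal sketch of how the two classes might be wired together, just to illustrate the flag handling. It is not the actual implementation: std::string and std::vector stand in for the raw pointers, the Date fields are dropped, and the helper connect() is made up for this example.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the CI and Link sketches; ownership and
// the Date fields are left out to keep the example short.
struct CI;

struct Link {
    enum { IS = 1, HAS, WANTS, NEEDS, KNOWS, IS_RELATED_TO, BIDIRECTIONAL = 1024 };
    CI* target;               // the CI this link points to
    int type;                 // relation type, optionally OR'ed with BIDIRECTIONAL
    std::string description;  // answer to 'why?', empty if unknown

    // Strip the flag to recover the plain relation type.
    int relation() const { return type & ~BIDIRECTIONAL; }
    bool bidirectional() const { return (type & BIDIRECTIONAL) != 0; }
};

struct CI {
    std::string name;
    std::vector<Link> links;
};

// Hypothetical helper: add a link from a to b, and mirror it from b to a
// when the BIDIRECTIONAL flag is set.
inline void connect(CI& a, CI& b, int type, const std::string& why = "") {
    a.links.push_back(Link{&b, type, why});
    if (type & Link::BIDIRECTIONAL)
        b.links.push_back(Link{&a, type, why});
}
```

With this, connect(monkey, nut, Link::IS_RELATED_TO, "because he is hungry") stores the 'why' on a one-way link, while OR'ing in Link::BIDIRECTIONAL also adds the reverse link.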

I'm still brainstorming, so if you have suggestions, let me know.

Oh, and Marco, I stole your CI thing.

- Jacco.


[edited by - phantomus on March 13, 2004 3:56:53 AM]

#22 frostburn   Members   -  Reputation: 380

Posted 12 March 2004 - 09:30 PM

It seems to be turning into something like BrainHat. Check it out if you haven't. You might get more ideas. Remember to let it have cause and effect relations between concepts as well.

U: "Water freezes when it's cold" -> Cause:"Cold", Effect:"Freeze".
U: "Ice is frozen water"*1
U: "There's ice on the water"
B: "Oh, it's cold!"

See BrainHat for a VERY impressive example.

EDIT: *1 Forgot an assertion(?)
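
A rough sketch of the backward reasoning in that exchange, under heavy assumptions: it is not how BrainHat works, and the struct and method names are invented here. It stores cause/effect pairs plus 'this thing is the result of that effect' facts, and reasons from an observed thing back to a likely cause.

```cpp
#include <map>
#include <string>

// Hypothetical store for cause/effect pairs plus
// 'thing X is the result of effect E' facts.
struct CausalMemory {
    std::map<std::string, std::string> causeOf;   // effect -> cause
    std::map<std::string, std::string> resultOf;  // thing  -> effect that produced it

    void addCause(const std::string& cause, const std::string& effect) {
        causeOf[effect] = cause;
    }
    void addResult(const std::string& thing, const std::string& effect) {
        resultOf[thing] = effect;
    }

    // Reason backwards from an observed thing to a likely cause;
    // returns an empty string if nothing is known.
    std::string explain(const std::string& thing) const {
        auto r = resultOf.find(thing);
        if (r == resultOf.end()) return "";
        auto c = causeOf.find(r->second);
        return c == causeOf.end() ? "" : c->second;
    }
};
```

Feeding it addCause("cold", "freeze") and addResult("ice", "freeze") lets explain("ice") come back with "cold", which is the step behind the "Oh, it's cold!" reply.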

[edited by - frostburn on March 13, 2004 4:33:11 AM]

#23 phantomus   Members   -  Reputation: 593

Posted 13 March 2004 - 12:01 AM

Yes, that's definitely very cool. Much better than what I've seen before. Initially, I was wondering why this didn't win any Loebner contests, but that's only logical: The thing reasons very well but the generated text is 'weird'.

The cause and effect stuff is very cool, I have to think a bit about that.

One thing I don't understand: Why does BrainHat limit itself by demanding hierarchic structures?

And one other thing that BrainHat doesn't seem to be able to cope with: Time. The plans I have so far can't handle time either, but in our human brains, it mixes fine with all the other stuff. I wonder how one would handle that. Information like: 'Yesterday I saw the nice lady in church'. It would be nice if the program would be able to link time (yesterday) and the weekly interval (church) to a person (the nice lady), but I have no idea how that could be done.

#24 phantomus   Members   -  Reputation: 593

Posted 13 March 2004 - 12:03 AM

Hmm, this is exactly what I had in mind. Looks like someone was faster.

#25 kordova   Members   -  Reputation: 138

Posted 13 March 2004 - 05:58 AM

This is an interesting thread. I'm currently involved in a project like this, which I'll probably write more about and leave a link in a day or two. Nice work!

#26 phantomus   Members   -  Reputation: 593

Posted 13 March 2004 - 07:24 AM

Yes well I just got severely demotivated by frostburn. This BrainHat stuff looks extremely familiar. I even had the link to Dragon Naturally Speaking in mind.

After reading their site in detail, there's enough reason to start a new project though. 1: They didn't seem to be able to turn the technology into a 'real' application. Check out this page. Why not? What went wrong? There's a lot of 'could', 'might' and 'probably' on those pages. 2: The software appears to depend on manual entry of knowledge. Its 'short-term' memory can apparently be fed using natural language, and its entire memory can be queried using NL, but it still needs expert intervention for adding new concepts.

I think the way they handle NL input is awesome though.

That being said, I will be focusing on two separate applications for a while:

1. The data pool, or 'long term memory'; I have already sketched the data structure, but I think it needs more work. Once I'm satisfied with the data structure, I'll build some code, add some concepts and links manually, and then I will add a set of functions for querying the data in an NL-independent way. The pool itself should allow queries like 'Is a cat an animal', 'what kinds of animals do you know', 'why is a cat an animal', 'what is the relation between dog and cat' and so on, plus commands like 'a cat is an animal' and so on. At this level, this will be a C++ interface.

2. The NL processor for interfacing with the data pool. I want to build a really small set of patterns, so that I can do a 100% POS tag. Once I can do all queries and all operations on the data set using plain English, I will start making it more generic, up to the point that it can handle (almost) all normal conversations. From there, it should be possible to let the software acquire data from sources other than trusted humans typing nice sentences. The second part of the NL processor is the part that turns the output of the data pool API functions into natural language. This is relatively easy.

I think both applications are doable, and once complete, I should be able to do conversations very much like the examples on the BrainHat page, which would be very cool.

By the way, a couple of years ago, when I was coding on an MSX system (OK, decades ago), I borrowed a book from the library that had an AI program that could reason. You could actually feed it data using NL: 'a cat is an animal', 'animals are living creatures', 'is a cat a living creature?' and the program would answer 'yes', 'no' or 'don't know'. VERY cool.


#27 phantomus   Members   -  Reputation: 593

Posted 13 March 2004 - 07:38 AM

A manual for the natural language processor that I mentioned is now available at the following url:

www.bik5.com/files/manual.doc

For the impatient: The parser matches user input (or other text) against complex expressions. Here are some examples:

<< my name is *.
matches: 'my name is john.' and 'my name is asdsf.' etc.

<< my (first )name is john.
matches: 'my name is john.' and 'my first name is john.'.

<< my name is {john|dan}.
matches: 'my name is john.' and 'my name is dan.'.

Multiple wildcards can be used:

<< * is a *(.)
matches: 'A dog is an animal' and 'a cat is an animal'.

Curly brackets and round brackets can be nested without restrictions.

You can also use variables:

<< my name is [username].
Matches 'my name is jacco', but offly if the variable [username] contains 'jacco'.

The templates for generating answers work in the same manner, offly this time round and curly brackets are used to generate random answers: Text between round brackets is omitted in 50% of the cases, and text between curly brackets is randomly selected. Nesting can also be used as usual. In templates, you can fill the variables that I just mentioned, for example:

>> [username] = [#star0]
fills the variable 'username' with the text that the user entered instead of '*'.

There is much more, like recursive processing, #if/#else/#endif, numeric evaluation and external command execution, if you want to know the details please check the manual at the link I just mentioned (but skip the chapter on database interfacing, that part is gone in the current version).
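
For readers who want to experiment without the parser itself, the basic pattern syntax described above (wildcards, optional text in round brackets, alternatives in curly brackets) can be approximated by translating a pattern into a regular expression. This is only a sketch and not how the NLP parser is implemented; toRegex() and matches() are invented names, [variables] are not handled, and wildcards come back as regex captures rather than [#star0].

```cpp
#include <regex>
#include <string>

// Translate a pattern in the style shown above into an ECMAScript regex:
//   *      -> (.+)       captured wildcard
//   (x)    -> (?:x)?     optional text
//   {a|b}  -> (?:a|b)    alternatives ('|' passes through unchanged)
// [variables] are not handled in this sketch.
inline std::string toRegex(const std::string& pattern) {
    std::string out;
    for (char c : pattern) {
        switch (c) {
            case '*': out += "(.+)"; break;
            case '(': case '{': out += "(?:"; break;
            case ')': out += ")?"; break;
            case '}': out += ")"; break;
            case '.': case '?': case '+': case '^': case '$': case '\\':
                out += '\\'; out += c; break;   // escape regex specials
            default: out += c;
        }
    }
    return out;
}

// Match a whole input line, ignoring case; wildcard text lands in 'stars'
// (stars[1] is the first wildcard, stars[2] the second, and so on).
inline bool matches(const std::string& pattern, const std::string& input,
                    std::smatch& stars) {
    std::regex re(toRegex(pattern), std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(input, stars, re);
}
```

Because brackets are translated one character at a time, nested round and curly brackets carry over to nested regex groups without extra work.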

The software is completely stable (Boundschecker reports zero problems / zero warnings so it gotta be good) and as far as I am concerned, it can be GPL'ed.

(why is the bboard software exchanging 'o.n.l.y.' for 'o.f.f.l.y.' all the time? humor?)

[edited by - phantomus on March 13, 2004 2:40:00 PM]

[edited by - phantomus on March 13, 2004 2:41:13 PM]

#28 frostburn   Members   -  Reputation: 380

Posted 13 March 2004 - 09:16 AM

quote:
Original post by phantomus
Yes well I just got severely demotivated by frostburn. This BrainHat stuff looks extremely familiar. I even had the link to Dragon Naturally Speaking in mind.


Hups.. Sorry, didn't mean to do that. BrainHat inspired me to do some thinking into conversational AI, and I hoped it would do the same for you. Don't give up on your idea. BrainHat seems very cool and all, but things can still be better.


#29 geoffsulcer   Members   -  Reputation: 122

Posted 15 March 2004 - 01:00 AM

Jacco,

The manual is very interesting reading. Do you plan to GPL the source soon or release a Windows binary to play with?

I was thinking about writing a simple text converter from AIML to NLP so that I could bring over a bunch of the Alice patterns to create a basic chat bot quickly.

Geoff

It's a simple choice, really. Get busy livin' or get busy dyin'.

#30 phantomus   Members   -  Reputation: 593

Posted 15 March 2004 - 03:35 AM

Geoff,

I will release the source code as GPL. I will start on it right away.

About the AIML->NLP converter: Should be relatively easy, but be aware that the performance will be quite a bit worse. There are no optimizations in place, so each input sentence will be tested against a lot of patterns by a pattern matcher that is much more complex than the one used by WinAlice.

On the other hand, the converted code would never use multiple wildcards and nested stuff, so it might be less painful than I just suggested.

- Jacco.

#31 phantomus   Members   -  Reputation: 593

Posted 15 March 2004 - 04:00 AM

It's available. Click here:

www.bik5.com/nlp_gpl.zip

I have added some quick notes everywhere about the code being GPL'ed; I'll try to do it better next time (I suppose I have to add a copy of the GPL to the package).

The project files should be path independent, and load in VC6.

The parser comes with a small test application (in main.cpp), which reads NLP code from 'nlpfiles/core.txt' (which in turn includes some other files). There's some sample code in 'tools.txt', 'eliza.txt' and so on. The spellchecker uses the dictionary in 'nlpfiles/data/dictionary.txt'. Logs are kept in 'nlpfiles/data/log.txt'.

Please be aware that error checking is sparse. It will respond properly to most common syntax errors, but the message might not be very clear so it could be a bit of a hassle to debug new code.

The stuff that I'm working on right now is also included, but highly unfinished. It's the code in the ci source files.

Have fun,

- Jacco.

#32 Jotaf   Members   -  Reputation: 280

Posted 15 March 2004 - 07:46 AM

Hey, I'd just like to throw in a simple idea that I think you overlooked: the human brain is capable of using much more flexible links than just "a is b" or whatever. This is evidenced by the huge amount of neurotransmitters that all have different results when they are fed to a single neuron; last time I checked there were about 60 and counting. They're all hard-coded, so imagine that we could develop an artificial brain that wasn't restricted by these bounds... terrifying isn't it? =)

Stuff like "this apple was green -yesterday-" and "this apple is red -today-" should be modeled in some way (sorry I can't think of a better example but you get the picture, sometimes there's the same link but with different "conditions" or whatever). I'm not sure if this is related to what I described above, but it's obvious that our brain has no problem dealing with this. Ok now for a practical solution... I'm not sure if this is the best way, but a possibility could be to make links a bit more flexible, not only relating A to B, but also possibly with C and D in different ways, like conditions or something (duh). Like, you could append a different link to the old "A is B" link, something like "at C", which could be a time so it would know that this happens at that time or goes with whatever event. I discussed this and some other related stuff in another thread that was kinda overlooked, probably cuz it was posted in the wrong forum =P

Here it is:
http://www.gamedev.net/community/forums/topic.asp?topic_id=211811
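
One possible reading of the "A is B at C" idea, as a small sketch. The names Fact and FactStore and the fallback rule are assumptions made for illustration, not something taken from the linked thread: each fact carries an optional condition, and a query prefers a fact whose condition matches before falling back to an unconditional one.

```cpp
#include <string>
#include <vector>

// Hypothetical fact record: 'subject relation object', optionally
// qualified by a condition such as a time ("yesterday", "today").
struct Fact {
    std::string subject, relation, object;
    std::string when;  // empty means 'always'
};

struct FactStore {
    std::vector<Fact> facts;

    void add(const std::string& s, const std::string& r, const std::string& o,
             const std::string& when = "") {
        facts.push_back(Fact{s, r, o, when});
    }

    // Look up the object for a subject/relation under a given condition.
    // An exact match on 'when' wins; otherwise fall back to an
    // unqualified fact, or an empty string if there is none.
    std::string query(const std::string& s, const std::string& r,
                      const std::string& when) const {
        std::string fallback;
        for (const Fact& f : facts) {
            if (f.subject != s || f.relation != r) continue;
            if (f.when == when) return f.object;
            if (f.when.empty()) fallback = f.object;
        }
        return fallback;
    }
};
```

So the apple can be green "yesterday" and red "today" without the two facts overwriting each other.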

#33 geoffsulcer   Members   -  Reputation: 122

Posted 16 March 2004 - 12:04 AM

I had another thought last night while playing with NLP.

When the AI asks a question, it should expect a response and be able to match the next input to a pattern based on the questions asked.

AI: What is your name?
Human: Geoff

This would match the pattern "My name is *" To which the AI would respond:

AI: Hello, Geoff. How are you.
Human: Fine

This matches the pattern "I {am|feel} *"

This can then be expanded to the more likely scenario where we have confused the AI by telling it something it doesn't understand.

Human: I like apples

Nothing matches, so the AI responds with the default:

AI: I see, tell me more.
Human: They are sweet and juicy.

In this instance, "They" means "apples" from the previous topic. And we can now make the association that apples are sweet and juicy. Again, the AI may not have an appropriate response. If it asks another question like "tell me more" or "are all apples sweet and juicy" then it should assume the response is about apples even if the noun isn't used. On the other hand, if it responds with a statement like "I like apples too" it might assume the next human input is about apples, but only some percentage of the time, or if apples are explicitly mentioned.

Geoff

#34 phantomus   Members   -  Reputation: 593

Posted 16 March 2004 - 12:29 AM

Geoff,

I think something like that would only build a better Alice, not a real bot. Anyway, what you described can be done: When executing a template, you could fill a variable with a string that represents the 'expected response'. For the next answer, you could first check the 'expected response' variable to handle expected answers, or revert to normal chatter logic if there's no match.

Example:

<< (hey )how are you( doing)( today)(?)
>> Fine! What's your name? [expect]="my name is *"

<< my name is *
>> #if ([expect]="my name is *")
[username]=[#star0]
#else
why do you tell me your name?
#endif

<< *
#if ([expect]="my name is *")
[username]=[#star0]
#else
{Really?|Interesting!|Go off.|Tell me more.}
#endif


I actually did it in two ways here: The first matches only sentences starting with 'my name is', and responds differently if this was expected or not.

The other solution 'overloads' the default pattern '*', so that if the user types a single word, AND a name was expected, that single word is interpreted as the user's name.

I believe you can do virtually anything with NLP, by the way. I do wonder however what would happen if you think up multiple constructs that use the default * pattern, as those 'algorithms' would become mixed quite soon.

Do note the cool handling of the typical 'Eliza' 'I have no clue' response... So simple.
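
The same 'expected response' logic, sketched in plain C++ for comparison. This is not part of the NLP package: the Dialogue struct is hypothetical, the input matching is reduced to exact strings and a prefix test, and the "Hello, ..." reply is added only so the example has something to return.

```cpp
#include <string>

// Hypothetical dialogue state: remembers which kind of answer the bot's
// last question was fishing for.
struct Dialogue {
    std::string expect;    // e.g. "name", or empty when nothing is expected
    std::string username;

    std::string reply(const std::string& input) {
        const std::string prefix = "my name is ";
        if (input == "how are you") {
            expect = "name";
            return "Fine! What's your name?";
        }
        if (input.compare(0, prefix.size(), prefix) == 0) {
            if (expect != "name") return "Why do you tell me your name?";
            username = input.substr(prefix.size());
            expect.clear();
            return "Hello, " + username + ".";
        }
        if (expect == "name") {          // overloaded default pattern '*'
            username = input;
            expect.clear();
            return "Hello, " + username + ".";
        }
        return "Tell me more.";          // normal fallback chatter
    }
};
```

A bare "Geoff" is treated as a name only right after the bot has asked for one; at any other time it falls through to the default chatter.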

- Jacco.

#35 Anonymous Poster_Anonymous Poster_*   Guests   -  Reputation:

Posted 16 March 2004 - 12:15 PM

I didn't have time to reply to this post as I wanted to, but it's very rich.
For a moment I thought that I could re-open my project and/or help someone finish some interesting work.
BUT then Brainhat appeared... well, I won't say that Brainhat is the perfect software that I always dreamed of, but it is on the right track. SO, if I want to make some impressive software, I will need to work a lot more. Or, more importantly, THINK more, and better, and faster. I will collect all the info posted here and study it in my free time (that is the problem, I have almost no free time to work on these beautiful things... my "real" job takes all my time).
By the way, is there some way of chatting with Brainhat on the net?
Jacco, good luck with your project.
Marco.



#36 brainhat   Members   -  Reputation: 122

Posted 05 May 2004 - 11:25 AM

Hello Jacco (and everyone). I was very glad to have stumbled on this conversation (though it appears to have gone a little cold at this point...). My name is Kevin Dowd. I wrote Brainhat.

Brainhat was my attempt to create a natural language environment using classical methods. The web site has a short description of how the code works. You are also welcome to a copy of my notes from the first five years (http://www.brainhat.com/dowd2.doc). This will explain why I did what I did. We've stopped trying to pursue Brainhat commercially. The source code is available on the site. The documentation is lacking, sorry to say. There's so much left to do....

In the grand scheme of things, I saw Brainhat as being just one part of a three-tiered solution for dialog:

Bots--particularly AIML bots--seemed like a good solution for the lowest level. They are efficient. They're good preprocessors for idiomatic use of language. They can provide responses efficiently. They can track changes in context. They're also good as output filters...

At the second level, I envisioned packages like Brainhat with some kind of inference engine to make the dialog goal-oriented and to make the dialog engine track changes in conversation flow. It is tempting to think that one could motivate dialog by slot-filling (e.g. shoes <- want color), but my early experiments with this suggested that an engine tended to ask a lot of noisy questions. Imagine telling a dialog engine "My shoes don't fit me," and getting the question back "what color are the shoes?"

At the highest level, one would want an engine that recognized templates of narrative and temporality. This kind of thing would make it possible to understand a story or follow a recipe (for a robot!). Narrative theory is an established discipline, but I just haven't found the time to see how it could be incorporated.

Anyway, I can offer a few components...
Let me know how I can help.

-Kevin


#37 Nice Coder   Members   -  Reputation: 366

Posted 08 May 2004 - 01:58 AM

I remember building quite a few chatbots (mostly probabilistic IO, some of the more complex stuff), and would like to share with you a method that has served me well (I am building myself an answerbot, so this may or may not be what you need).

Synonym conversion
Converts a synonym to the base object, i.e. it just replaces synonyms with base words.

Contraction conversion
Converts contractions (and apostrophes) into their unabridged versions. E.g. Momo's Purple -> Momo is Purple.
It's Good -> It is Good

H: "Bovines are Purple"
Do synonym conversion
inp: "Cows are Purple"
disassemble sentence
object -> "Cows"
Compto -> "Purple"
lookup purple
Purple is a value of the property "Colour"
lookup cows
cows is a category, so if possible update the inheritance.
Cows has the Property Colour
Update Cows property called Colour
Update all other objects of category "Cow", unless the user explicitly stated another value.

[after adding]
human: "Momo is a Bovine"
Do synonym conversion
Inp: "Momo is a Cow"
momo - unknown object
cow - known object
Create new object of name "Momo", and of category "Cow"
Inherit all properties from base object "Cow" to "Momo"

[Wanted Transcript]
Human: Momo is a bovine
(added new object 'Momo')
Bot: Really, is that so?
Human: Momo's Purple
(apostrophe detected, doing conversion)
(inp: momo is Purple)
(Object Colour Property Updated to Purple)
(Objects Colour Property Flag set to 2^1)
Bot: Please tell me some more about Momo
Human: Bovine's are Red
(apostrophe detected, doing conversion)
(synonym conversion)
(we have a problem here, how do we determine that we need "Cows are red" instead of "Cow is red"?? Unless Cow is a base object, which cannot be used with is, only are?)
(Perhaps have cow a synonym of cows?)
(getting back... inp: Cows are red)
(red is of type Colour)
(Cows has property Colour)
(cows Property has been changed to Red)
(update all objects cast from cows, unless stated)
(check flag and nochangeable(2^1) = nochangeable(2^1); if so then don't change the property)
human: "Is Momo Red?"
(Object Momo)
(Red is of Type Colour)
(find Momo colour Property)
Bot: No, Momo is Purple.
[/transcript]
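
The Momo transcript can be reproduced with a small property store. Note that this sketch swaps in a different technique from the one described: instead of copying properties down to every object and protecting explicit values with a no-change flag, it keeps only explicitly stated values and resolves inherited ones at lookup time by walking up the category chain. ObjectStore and its methods are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical object store: each object may have a category (its 'base
// object') and a set of explicitly stated properties.
struct ObjectStore {
    std::map<std::string, std::string> categoryOf;   // object -> category
    std::map<std::string, std::map<std::string, std::string>> explicitProps;  // object -> property -> value

    void setCategory(const std::string& obj, const std::string& cat) {
        categoryOf[obj] = cat;
    }
    void setProperty(const std::string& obj, const std::string& prop,
                     const std::string& val) {
        explicitProps[obj][prop] = val;
    }

    // Explicit values win; otherwise walk up the category chain.
    // Returns an empty string if the property is unknown.
    std::string get(std::string obj, const std::string& prop) const {
        for (int depth = 0; depth < 32; ++depth) {   // guard against cycles
            auto o = explicitProps.find(obj);
            if (o != explicitProps.end()) {
                auto p = o->second.find(prop);
                if (p != o->second.end()) return p->second;
            }
            auto c = categoryOf.find(obj);
            if (c == categoryOf.end()) return "";
            obj = c->second;
        }
        return "";
    }
};
```

After "Momo is a cow", "Momo is purple" and "Cows are red", get("momo", "colour") still yields "purple", while any other cow without its own colour reports "red".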

This is basically a pattern matcher that triggers a script; the script's output is given to some sort of response engine, whose output is given to the user.

For effective pattern matching maybe something like:
(is)(space)(isobj)(space)(isobj->isprop);
if (Checkprop(isobj->isprop->pval)) {
out = "Yes, " & isobj & "'s " & (isobj->isprop) & " is " & (isobj->isprop->pval)
}
else {
out = "No, " & isobj & "'s " & (isobj->isprop) & " is not " & (isobj->isprop->pval)
}

Would make for easier scripting/coding of high level responses.

Hopefully this was helpful to me/you/person standing next to you/boss/ISP's Profit/(insert term here).

DE NC

#38 phantomus   Members   -  Reputation: 593

Posted 09 May 2004 - 11:19 PM

Hm, this thread was pretty dead indeed. I couldn't even find it anymore.

Anyway, I'm still working on this AI stuff, so I'll gladly join the discussion again.

NC: What you describe is basically what I have now. The application is set up as a standard chatbot, i.e. the program responds to sentences entered by the user. Each sentence is first 'cleaned': [isn't] is replaced by [is not], common typos are fixed, slang is translated and a spellchecker is used to improve things further. After this stage, the user input is supposed to have been converted to nice and common English.

After that, I pass the sentence to a pattern matcher, similar to AIML. Right now, the patterns are all aimed at simplifying the English sentences to three possible 'commands': [letaisb], which links two things (a cow is an animal, a cow eats grass, a train is yellow and so on), [letaisnotb], which does the same but negates the link, and [askaisb], which answers questions about relations.

These three commands are passed to a C++ object named 'CIPool', which is basically a representation of a mindmap. The CIPool handles data organization and query processing. It sends answers back, which are again translated into nice English by the pattern matcher.

So, here the pattern matcher is a bridge between normal C code and the user, nothing more. The C code provides some 'intelligence'.
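
To make the three commands concrete, here is a stripped-down sketch of what such a pool could look like. It is not the CIPool from the package: MiniPool, its method names and the three-valued answer are assumptions, only plain 'is' links are supported, and a positive chain of links is assumed to win over a stored negation.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical mini version of the pool: only 'is' links, no link types.
class MiniPool {
public:
    enum Answer { NO, YES, DONT_KNOW };

    void letAIsB(const std::string& a, const std::string& b)    { is_[a].insert(b); }
    void letAIsNotB(const std::string& a, const std::string& b) { isNot_[a].insert(b); }

    // YES if b is reachable from a via 'is' links; NO if a, or anything
    // a 'is', was explicitly told not to be b; DONT_KNOW otherwise.
    Answer askAIsB(const std::string& a, const std::string& b) const {
        std::set<std::string> seen;
        bool denied = false;
        if (reach(a, b, seen, denied)) return YES;
        return denied ? NO : DONT_KNOW;
    }

private:
    // Depth-first walk over 'is' links, noting any negation met on the way.
    bool reach(const std::string& a, const std::string& b,
               std::set<std::string>& seen, bool& denied) const {
        if (a == b) return true;
        if (!seen.insert(a).second) return false;     // already visited
        auto n = isNot_.find(a);
        if (n != isNot_.end() && n->second.count(b)) denied = true;
        auto i = is_.find(a);
        if (i == is_.end()) return false;
        for (const std::string& next : i->second)
            if (reach(next, b, seen, denied)) return true;
        return false;
    }

    std::map<std::string, std::set<std::string>> is_, isNot_;
};
```

With 'a cat is an animal' and 'animals are living creatures' stored, askAIsB("cat", "living creature") follows the chain and answers YES, and an unknown relation gives DONT_KNOW rather than NO.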

Right now I'm trying to make this more generic, so it can extract information from more diverse sentences. For example, I'm adding support for 'MY cat has four legs' right now, which links the user to the cat using the 'owns' relation. The current structure of the language parser allows this in a quite intuitive way, so I'm quite happy with the results so far.

Kevin: What you accomplished so far is already awesome. I'll study Brainhat once more. But Brainhat.com seems to be down at the moment?

- Jacco.

#39 phantomus   Members   -  Reputation: 593

Posted 09 May 2004 - 11:28 PM

BTW, I found that when you have tons of patterns all leading to just three commands, it's already possible to 'fool' the pattern matcher. So I'm trying to make the patterns as precise as possible.

Example:

<< a(n) * is a(n) *(.)

matches: A cow is an animal.

<< all *s are *s(.)

matches: All cows are animals. Plus, by using the text that was replaced by the asterisk, you get perfect parameters for the 'letaisb' command: 'cow' and 'animal'.

Obviously, this causes tons of problems:

'all sheep are animals'

will not match...

So I worked over the weekend on decent plural / singular detection and conversion. I got it working for 99.5% of the cases now, but I found that English is far less straightforward than I thought. Some nasty plurals: 'tomato -> tomatoes', 'half -> halves', 'formula -> formulae', 'sheep -> sheep', 'foot -> feet', 'mouse -> mice'... Anyway, it works now, and I even coded it in 'NLP' (my home-brew pattern matching language).
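
For illustration only, here is roughly what plural-to-singular conversion looks like when written in C++ rather than in NLP. The function singular() is a made-up name and the rule set is deliberately tiny: a table of irregular forms is checked first, then a handful of suffix rules. It does not come near the coverage mentioned above.

```cpp
#include <map>
#include <string>

// Hypothetical singulariser: an irregular-form table first, then a few
// suffix rules. Far from complete; English has many more exceptions.
inline std::string singular(const std::string& w) {
    static const std::map<std::string, std::string> irregular = {
        {"sheep", "sheep"}, {"feet", "foot"}, {"mice", "mouse"},
        {"halves", "half"}, {"formulae", "formula"}, {"tomatoes", "tomato"},
    };
    auto it = irregular.find(w);
    if (it != irregular.end()) return it->second;

    // True if w ends with s and is strictly longer than s.
    auto endsWith = [&w](const std::string& s) {
        return w.size() > s.size() &&
               w.compare(w.size() - s.size(), s.size(), s) == 0;
    };
    if (endsWith("ies")) return w.substr(0, w.size() - 3) + "y";   // ponies -> pony
    if (endsWith("ches") || endsWith("shes") || endsWith("sses") || endsWith("xes"))
        return w.substr(0, w.size() - 2);                          // boxes -> box
    if (endsWith("s") && !endsWith("ss"))
        return w.substr(0, w.size() - 1);                          // cows -> cow
    return w;                                                      // already singular
}
```

Running both sides of 'all *s are *s' through such a function before calling 'letaisb' is one way to get 'cow' and 'animal' out of 'cows' and 'animals', and to let 'sheep' through unchanged.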

#40 Jabberwacky   Members   -  Reputation: 122

Posted 10 May 2004 - 05:36 AM

All very interesting. My AI at www.jabberwacky.com is learning in much the way that was said to be unproductive earlier in the thread - that is, openly, free to all.

Judge for yourself whether it is corrupted by its input.

It has no real purpose in life other than entertainment, but that seems to be enough.

It does, in effect, learn by means of questions asked of its users. Later users answer the questions asked by earlier ones, and still later users see the results. But it works equally well for statements.

Rollo


[edited by - jabberwacky on May 10, 2004 12:37:31 PM]

[edited by - jabberwacky on May 10, 2004 12:38:24 PM]

[edited by - jabberwacky on May 10, 2004 1:02:26 PM]



