# Natural Language Expert System - Idea



What do you guys think of the following idea (Possible? Impossible? Something that needs to be fixed?)?
Constructing a Natural Language Expert System

End result
- Answers natural language questions in context using knowledge in the knowledge base
- Associative memory knowledge base of real world objects and concepts
- Automatic creation of lexicons for and learning of language structure, grammar, and syntax
- Ability to formulate grammatically correct sentences

Components
- Language learning
  - Pattern recognition
  - Determine the ordering of parts of speech
  - Determine the syntax for phrases
  - Determine part of speech and meaning from word suffixes
  - Learning never ends and always continues to evolve
- Language dictionary
  - Includes all words and their possible meanings and parts of speech
- Associative memory model
  - ART or similar neural network for storage of associations
  - Representation, storage, and retrieval of knowledge
  - Allows for contextual representation of knowledge and semantics
- Short-term memory model
  - Associative memory model only provides a method for storage and retrieval of knowledge
  - Allows knowledge to be picked apart and key components to be extracted, then fed as inputs to the associative memory model
- Natural language processing
  - Handles input and generates output using
    - Knowledge from the associative memory model
    - Learned language structure, grammar, and syntax
  - Parsing

Shortcomings of current systems
- Unable to understand context
- Unable to determine semantic meaning of words
- Lexicons must be hard-coded

Possible uses
- Devices that can perform actions based on natural language input
- An application that answers questions encompassing all human knowledge
- An application that reads a text and answers questions based on that text
- Language translation (would require more subsystems)

##### Share on other sites
People have been working on this for decades. What exactly are you looking for in terms of answers?

##### Share on other sites
The problem of natural language is much, much harder than it seems at first.
Many very smart men have attempted to solve it ever since computers became interactive, and they are still attempting it today. It will have a huge market and will impact our society very much when it's finally here, but I don't recommend attempting to solve it alone.

Natural languages have so many duplicate meanings that even with a great database of knowledge they are very hard for computers to understand. Simple things such as referring to "it" can confuse humans (for a second) and lead computers to ridiculous conclusions.

"Danny threw the coffee mug at the wall and it broke into pieces"
- What broke? The mug or the wall?
Note that in a glass house the wall is also an option.

Iftah.

##### Share on other sites
I know, but here I'm bringing all the pieces together. What I'm asking you guys is whether you see any missing pieces or anything that needs to be changed.

##### Share on other sites
Well, your "pieces" are very big projects, and I don't know what you plan on doing with them. I haven't given the problem of natural language recognition any thought because it's such a difficult problem, so I don't even know how to start on it.

It's like asking if the following is a good design for a spacecraft program.

spacecraft program:
1) take off
2) orbit
3) land on the moon
4) explore the moon
5) take off moon
6) land back on earth

Yes, it's the start of a good design, but each part may be *very* hard to implement, and it's hard to know now (without deep thought) whether there are missing parts.

Iftah.

##### Share on other sites
...and a simple design with a lot of ambition is all it takes to achieve something great. One thing at a time - piece by piece. It's possible, and it will be done.

Thanks!

##### Share on other sites
Go for it. Post back here when you're done. [smile]

##### Share on other sites
Quote:
 Original post by chadjohnson: ...and a simple design with a lot of ambition is all it takes to achieve something great. One thing at a time - piece by piece. It's possible, and it will be done. Thanks!

Please make a blog or developer journal on your progress or at least let us know what you got so far a year from now [smile]

Cheers and best of luck!
Pat.

##### Share on other sites
Haha. But seriously, we've merely scratched the surface with technology. There are so many things that are possible which are yet to be discovered and exploited. Some day, if we're still here, I think we won't be able to tell the difference between humans and computers. They'll be different (i.e., they won't have souls as we do), but we won't be able to tell.

##### Share on other sites
Quote:
 Original post by chadjohnson: ...and a simple design with a lot of ambition is all it takes to achieve something great. One thing at a time - piece by piece. It's possible, and it will be done. Thanks!

I'm not trying to rain on your parade here, but as the sole author of a fairly complicated system, I can attest firsthand to the difficulty of the topic you're tackling here. Each of the lines in your original post covers enough investigation and mathematical hand-waviness to fill a dozen research papers.

Metalyzer, a heuristic analysis system of my own devising, was originally going to use natural language recognition to answer questions about its ever-expanding knowledge base. I quickly realized my error, because a pure-NL program is a substantial task. Instead, I realized that with more clever contextual methods, I could approximate NL for a fraction of the computational complexity and cost. It's these alternate methods you should focus on, not NL.

Of course, if you think you have something revolutionary, none of the above applies. Go for it. [smile]

##### Share on other sites
I'm not going to steal your ideas - I'm just curious. What kind of contextual methods?

##### Share on other sites
It is possible to make an NL system for a certain context (meaning a relatively simple world/database).

for example:
- bus database: "at what time does the last bus from A to B leave?"
- world of shapes: "where is the center of the largest triangle?"

The above NL recognition tasks are still very difficult problems, but possible for one man to undertake. I recommend trying one of these first so you will see how hard the big problem is.
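A minimal sketch of the bus-database case, assuming a hand-written timetable and exactly one supported question shape. The `TIMETABLE` data, the `QUESTION` pattern, and the `answer` function are all made up for illustration; the point is only to show how much a narrow context lets you get away with, not how a real system would parse language.

```python
import re

# Hypothetical timetable: (origin, destination) -> departure times as "HH:MM".
TIMETABLE = {
    ("A", "B"): ["06:30", "12:15", "23:05"],
    ("B", "A"): ["07:00", "22:40"],
}

# One pattern for the single supported question shape; anything else is rejected.
QUESTION = re.compile(
    r"at what time does the (?P<which>first|last) bus "
    r"from (?P<origin>\w+) to (?P<dest>\w+) leave\??$",
    re.IGNORECASE,
)


def answer(question):
    """Return a departure time, or None if the question is not understood."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None
    route = (match["origin"].upper(), match["dest"].upper())
    times = TIMETABLE.get(route)
    if not times:
        return None
    # Zero-padded "HH:MM" strings sort chronologically.
    return min(times) if match["which"].lower() == "first" else max(times)


print(answer("At what time does the last bus from A to B leave?"))
```

Any rephrasing ("When's the final bus to B?") falls straight through to `None`, which is a small taste of how hard the big problem is.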

Iftah.

##### Share on other sites
I am interested in writing a paper in the area of natural language. If you would like to work on it with me, send me an e-mail.

##### Share on other sites
Quote:
 Original post by chadjohnson: ...and a simple design with a lot of ambition is all it takes to achieve something great. One thing at a time - piece by piece. It's possible, and it will be done. Thanks!

I think you missed his point.
You can't do it piece by piece if each of your pieces is the size of, well, a very big project in itself. If you really want to do it one thing at a time, then take his advice, and split it up so you actually *have* one thing at a time listed.
You're not doing it one thing at a time, unless you flesh it out *a lot more*.
In other words, it takes a lot more than a simple design and a lot of ambition to achieve something great. It might be a good starting point, but it's going to have to be converted into a detailed design before you can "achieve something great" with it.

##### Share on other sites
Quote:
 You can't do it piece by piece if each of your pieces is the size of, well, a very big project in itself. If you really want to do it one thing at a time, then take his advice, and split it up so you actually *have* one thing at a time listed. You're not doing it one thing at a time, unless you flesh it out *a lot more*. In other words, it takes a lot more than a simple design and a lot of ambition to achieve something great. It might be a good starting point, but it's going to have to be converted into a detailed design before you can "achieve something great" with it.

I am well aware of that. And it's very obvious that each piece of that outline is a considerable project in itself. That's what I mean when I say "piece by piece." Of course it would have to be split up. Like I said, that document just brings together the parts of a system - a very large and complex system. You would do the same for any large project - I'm just using the waterfall method. I do have ideas for each, and I've shown them to several of my professors.

If I'm able to get one piece working, that's an achievement in itself. And if I get one working, then I have one piece. Once I have that, I'd move onto another, and so on. I am aware that I likely could not finish such a project in my own lifetime - that's why it would either have to be done by more than one person, or my work would have to be passed on to another.

I've written several programs, and when I've shown them to people they didn't believe me that I made them. They told me I downloaded them from somewhere, but I did not.

If you want to say that it's unachievable just because it's composed of several large and complex pieces, then that's just as good as saying that Linux cannot exist, but it does.

##### Share on other sites
Quote:
 Original post by chadjohnson: If you want to say that it's unachievable just because it's composed of several large and complex pieces, then that's just as good as saying that Linux cannot exist, but it does.

Interesting yet invalid analogy. I guess what people are trying to tell you is that it's not only a complex task (which you are fully aware of), but that parts of that task have been the subject of intense research for decades. And the results are not as satisfying as even those who did the research expected.

On the other hand, Linux provided nothing revolutionarily new. It just implemented well-known and established concepts. That's the difference the other posters pointed out. It's not as if the task is impossible to achieve; it's just that a lot of things on your list are still subject to basic research and far from being usable in that context (at least not to the extent you are shooting for).

I do believe that it is quite possible, though. Just not by a single person in a foreseeable amount of time, but I think we mistook you here, as you already mentioned that others will be involved as well.

Best of luck and keep us informed on your progress,
Pat (who is really interested in this).

##### Share on other sites
Hey guys, thanks, I really appreciate the responses. Spoonbender, I wasn't lashing out at you or anything, so I hope you don't take it that way.

The examples given are great. I went and saw the movie FlightPlan tonight (which by the way is an excellent movie, and Jodie Foster...), and I had time to think before the movie. Here is what I came up with for this example:
Quote:
 "Danny threw the coffee mug at the wall and it broke into pieces" - what broke? the mug or the wall? Note in a glass house the wall is also an option.

The coffee mug comes first in the sentence, so it seems more likely that that is what the sentence focuses on. But should presentation order determine likelihood? Can you think of an example where it does not? The computer also could record/encode BOTH meanings. If it has any context available, it could weight them statistically (using a neural network).

One other thing. I would speculate that the system initially would look like it was not doing anything. If designed right, it would be like a child with its knowledge base empty at first. It could take a long time (a few years at least) to train the system on just the basics. It would have to understand what things are and how they are related to one another. You could use the dictionary to do this, but only when it is able to understand the meanings of the words in the dictionary. So you would have to start out with very basic words and very gradually get more and more advanced. So I think a good approach would be to study child/infant psychology and understand how they develop, and then model that as best possible.

And even if I'm not able to get anywhere with this, I do think this is how it would have to be done. If you want it to be like a human, you have to make it like a human. It doesn't have to be as complex, but it needs to model the same things.

Also, one thing I think you could try with automatically building the lexicon is to label the parts of speech for each word in the first set of sample sentences. What do you think?
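Here is a rough sketch of that labelling idea, assuming a tiny hand-tagged seed set and a crude two-letter suffix fallback for unseen words. The `SEED` sentences, the tag names, and both functions are invented for illustration; this is plain frequency counting, not a neural network and nowhere near a full lexicon learner.

```python
from collections import Counter, defaultdict

# Hand-labelled seed sentences (invented for illustration): (word, part of speech).
SEED = [
    [("the", "DET"), ("dog", "NOUN"), ("barked", "VERB"), ("loudly", "ADV")],
    [("a", "DET"), ("cat", "NOUN"), ("walked", "VERB"), ("slowly", "ADV")],
    [("the", "DET"), ("cat", "NOUN"), ("jumped", "VERB"), ("quickly", "ADV")],
]


def build_lexicon(sentences, suffix_len=2):
    """Count how often each tag occurs per word and per word-final suffix."""
    by_word = defaultdict(Counter)
    by_suffix = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            word = word.lower()
            by_word[word][tag] += 1
            by_suffix[word[-suffix_len:]][tag] += 1
    return by_word, by_suffix


def guess_tag(word, by_word, by_suffix, suffix_len=2):
    """Most frequent tag for a known word, else for its suffix, else None."""
    word = word.lower()
    for table, key in ((by_word, word), (by_suffix, word[-suffix_len:])):
        if key in table:
            return table[key].most_common(1)[0][0]
    return None


by_word, by_suffix = build_lexicon(SEED)
for w in ("cat", "softly", "painted", "xyz"):
    print(w, guess_tag(w, by_word, by_suffix))
```

Known words come straight from the labelled sentences; unseen words like "softly" or "painted" get a guess from their ending, and anything with an unfamiliar ending stays unlabelled rather than being guessed at.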

##### Share on other sites
A lot of people (by which I mean a lot of people with doctoral degrees and decades of AI experience) have tried that approach to NLP. No major successes yet.

##### Share on other sites

     NO SHOOTING
     SMALL CHILDREN
     IN AREA

This is an actual sign from somewhere in the U.S.

Can I shoot large children in the area? Can I shoot small children outside the area? Can I do something else to the small children in the area? You and I know what the sign means -- there are small children in the area, which precipitates the admonishment to not shoot. A computer really wouldn't.

Here's another one (or two):

  Joe saw the mountain flying over Switzerland.
  Joe saw the airplane flying over Switzerland.

You and I understand that in one case, Joe's in an airplane; in the other case, he's on the ground. A computer, currently, doesn't (in the general sense).

There's a large research project that has been going on forever, that just attempts to build a database of "common sense" knowledge, using special-purpose knowledge representation -- not anything near natural language. You can check it out at Cyc.com. They've been going for a long time, with a lot of people, trying to crack just a single one of the many "nuts" in the list you presented.

Just how many people do you have, and for how long?

##### Share on other sites
Right now, nobody. I'm a college student - undergraduate (junior level). There is nobody at my school that I think would be interested in the subject (at least none of my friends; most of my friends hardly know how to program). But I'm extremely interested in this subject. I am not at the moment going to grad school, mainly because I can do a lot of the same research on my own, and I also don't have the money. And I'm not so good at mathematics (e.g., deriving equations and heavy calculus), which you have to be, but I can do it when I need to and have time. If someone else could do the math when necessary, if it's not already worked out, then that'd be great.

I have (as far as I can tell) really good programming and organization skills, and I'm good with structure. I know how to structure a program well, and I'm also really good at designing databases.

So I don't have any kind of research grant, but if I actually get somewhere I'll eventually start a company or an LLC. If I do, I will have 2 divisions - one that does consulting (to make money in the meantime), and another that works on the real goal (the system or a subset of the system I described). I wouldn't at first want to rely on a grant because it would take a while to start making any substantial profit. Later on I would though.

I suppose though, that if I actually got anywhere I could start a SourceForge project and accumulate team members, and then I would have people for the company or whatever.

So I guess I am looking for a few people that I could continually share ideas with via e-mail or a web site where we (securely) post ideas we have. I'll first focus most on natural language processing. So if anyone is sincerely interested, then let me know and we'll get started. It'll be on our own time, and we'll all sign something (official) that says we get credit. My brother is also a linguist (who went to Moody) who can read, write, and speak several different languages.

##### Share on other sites
Well, if you want a head start on the representation of common knowledge part, there's an open source version of Cyc called OpenCyc, which you can download from their site. Good luck!

##### Share on other sites
Why can't most people draw very well? I mean, we see people all the time, but it takes someone many years to be able to draw others and produce results that approach realism. We think we notice details, but in reality we perceive very little of the world; our experience has to be truncated so we can avoid becoming a nervous mess. This is the fundamental problem of asking people to input everything they know into a computer: they don't know that much. A good portion of the human psyche is instinctual and therefore inaccessible. Look at language in general. It's very clumsy and very illogical, and without a common experience to check against, very little of what we say would make any sense at all. Japanese has this in spades, where everything said is vague and can be interpreted many different ways, and English with its many double entendres can also be pretty formidable. We talk in references to other things, and those things arise from personal experience. Without that human experience I don't believe a computer will ever understand human speech beyond a superficial level, and thus it will always be prone to error.

I would say limit the scope of the problem, like say parsing speech in the context of the medical profession, or for banking transactions. If you limit the AI to contexts that are highly technical yet concrete you should get good results. Basically anywhere where people talk about "things" and not people or themselves would be a good idea.

Though I have to admit that writing a system to understand language in something like a dating sim or a "psychologist-bot" would be pretty cool just from the perspective of the technical challenge involved and the philosophical implications of success. Anyway, I'm not trying to dissuade you from writing it, I'm just saying that you have some problems to overcome with the implementation. Ganbatte!

##### Share on other sites
At my school (Univ. of Central Florida) one of the professors, Dr. Fernando Gomez, has been working on this for a LONG, LONG time. Check out his homepage.

One of his projects...
Quote:
 These areas of research are combined in SNOWY, a project that has been underway for over ten years now and which is being used as a test bed for the ideas in these areas. SNOWY is presently reading articles on animals and people randomly selected from the World Book encyclopedia and acquiring knowledge from them, and answering questions about the knowledge it has acquired.

I took his AI class... You can give his system new knowledge by simply inputting statements of fact as sentences (ex. "People eat food."). It then uses these rules to semantically interpret the meaning of the words. For example, given "The man ate the sub", it would know that the sub was a sandwich (food) and not a submarine (vehicle). That is a simple example, but as it acquires knowledge it is able to make better sense of awkwardly structured sentences, etc.
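To make the "sub" example concrete, here is a toy sketch. I don't know how SNOWY actually does it, so this is only an assumption-laden illustration: the `IS_A`, `SENSES`, and `FACTS` tables and the `pick_sense` function are hypothetical, facts are stored as (subject category, verb, object category) triples, and the verb is assumed to arrive already reduced to its base form ("ate" -> "eat").

```python
# Hypothetical hand-written knowledge; a real system would acquire this from text.
IS_A = {
    "man": "person",
    "sandwich": "food",
    "submarine": "vehicle",
}

# Ambiguous surface word -> its possible senses.
SENSES = {"sub": ["sandwich", "submarine"]}

# Facts of the form (subject category, verb, object category),
# e.g. the statement "People eat food."
FACTS = {
    ("person", "eat", "food"),
    ("person", "pilot", "vehicle"),
}


def pick_sense(subject, verb, obj):
    """Return the senses of obj that at least one stored fact permits."""
    subject_cat = IS_A.get(subject)
    return [
        sense
        for sense in SENSES.get(obj, [obj])
        if (subject_cat, verb, IS_A.get(sense)) in FACTS
    ]


print(pick_sense("man", "eat", "sub"))
print(pick_sense("man", "pilot", "sub"))
```

The stored fact does the disambiguating: "eat" only licenses a food object, so the submarine sense is filtered out. With a verb the table knows nothing about, the function returns an empty list rather than guessing.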

If you want to get into this stuff, I suggest familiarizing yourself with WordNet and eventually learning Lisp if you don't know it. Actually, WordNet is pretty darned cool in its own right... read more about it here.

##### Share on other sites
Quote:
 "Danny threw the coffee mug at the wall and it broke into pieces" - what broke? the mug or the wall? Note in a glass house the wall is also an option. The coffee mug comes first in the sentences, so it seems more likely that that is what the sentence focuses on. But, should presentation order determine likelihood? Can you think of an example where it does not?

There are already grammars which will quite accurately parse this sentence. They will determine that Danny is the subject, threw is the verb, mug is the direct object and coffee is a noun-adjective that modifies mug, and that "at the wall" is a prepositional phrase modifying the action, that is, that the mug was thrown at the wall. Tools like WordNet will determine that "coffee mug" is a man-made object from which people drink liquids and that coffee is just such a liquid, etc.

The grammar will break it into its two separate sentences, the second of which is "It broke."

That's where the AI breaks down. It is simply not possible, at this point, to tell a computer how to determine that it is the mug that broke. Do we even understand why we presume to know that it is the mug that broke? It is life experience, and that is simply something that we can't teach computers.

Current NLP systems will simply tell you that either the mug broke or the wall broke.

And, in answer to your question, presentation order has nothing to do with the semantic interpretation. I could just as easily say "I threw the mug at the window and it broke." There is no clearly correct interpretation to that one, even for people.

##### Share on other sites
Wow, that's some really interesting stuff he's doing.

With the sentence, "I threw the mug at the window and it broke," what "it" refers to cannot even be discerned by a human - unless there is context; unless later it talks about, for instance, shards of glass and pieces of the window pane lying on the floor. Then you would know it's the window because those properties don't belong to a [coffee] mug. But without that contextual information, your mind does in fact record both possible meanings, and both remain until the ambiguity is resolved. You may have a bias, but they're still apparent to and considered by your mind. The computer would need to do the same thing.

That's the reason for the knowledge base and the neural network (i.e., inference engine). The knowledge base would hold information about the properties of objects and concepts necessary in making such judgments. Then the short-term memory model would allow for temporary contexts to be created. The short-term memory model would hold the contextual information, and it (or a subsystem of it) would look in the knowledge base for the properties connected to the concepts/objects. So it uses the connectionist idea.
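A very rough sketch of that idea, with big assumptions: the knowledge base is a hand-written `KNOWLEDGE` dictionary of properties, the short-term memory is just a list of words seen in the surrounding text, and the scoring is simple property overlap rather than a neural network. All names here are hypothetical.

```python
# Hypothetical knowledge base: properties associated with each object.
KNOWLEDGE = {
    "mug": {"ceramic", "handle", "breakable"},
    "window": {"glass", "pane", "breakable"},
}


def resolve(candidates, context_words, prior=None):
    """Score every candidate referent of "it" and keep all of them.

    Each candidate gets its prior weight (default 1.0) multiplied by
    1 + the number of its known properties found in the context.
    Scores are normalised to sum to 1, so no reading is thrown away.
    """
    prior = prior or {}
    context = set(context_words)
    scores = {}
    for name in candidates:
        overlap = len(KNOWLEDGE.get(name, set()) & context)
        scores[name] = prior.get(name, 1.0) * (1 + overlap)
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# No context yet: both readings are kept with equal weight.
print(resolve(["mug", "window"], []))

# Later text mentions glass and a pane: the window reading gains weight.
print(resolve(["mug", "window"], ["shards", "glass", "pane", "floor"]))
```

Both readings stay in the result until context tips the balance, and the optional `prior` argument is where a bias (such as presentation order) could be fed in if you wanted one.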