"AI" to Summarize a Document
You could also make it an ongoing project for a self-learning AI: when it reads a page, it categorizes it by sentences that are similar to ones from other sites it has already read. You can say it has no understanding, but it can still say the page is related to expressions such as... blah blah blah.
Maybe have a list of words/phrases (call it l1) with how good they are (l2). (You can get these like a Bayesian spam filter: you teach it, then refine the values according to the output.)
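A rough sketch of how the l1/l2 table could be taught, spam-filter style. The class name, the "interesting"/"boring" labels and the add-one smoothing are all assumptions made for illustration; the post does not say how the values are actually refined.

```python
from collections import defaultdict


class WordRatings:
    """Hypothetical store for the word list (l1) and its ratings (l2)."""

    def __init__(self):
        # counts[word] = [times seen in interesting text, times seen in boring text]
        self.counts = defaultdict(lambda: [0, 0])

    def teach(self, text, interesting):
        """Record one labelled example, the way a spam filter is trained."""
        for word in set(text.lower().split()):
            self.counts[word][0 if interesting else 1] += 1

    def rating(self, word):
        """Smoothed share of this word's labelled examples that were interesting."""
        good, bad = self.counts.get(word.lower(), (0, 0))
        # Add-one smoothing (an assumption): unseen words land on 0.5.
        return (good + 1) / (good + bad + 2)
```

You would call teach() each time you tell it whether a piece of text was worth keeping, and rating() then drifts toward 1.0 for words that keep turning up in the good examples.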
You then assign a rating per sentence of the document. So for something like:
"the man was killed in a car crash", once you strip off all the "bad" words (the ones that cannot be used to categorise it) you get
"man killed car crash"
Then say "man" has an interesting rating of 0.4,
"killed" has an interesting rating of 0.8
and "car crash" has an interesting rating of 0.7, then
all in all this sentence is: 179.2% readable... which means that it would be put in a bank of sentences...
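Here is a minimal sketch of the strip-and-rate step. The three ratings come from the example above; the stop-word list, the function names and the choice to average the ratings are assumptions, since the post does not say how the individual ratings are combined into a sentence score.

```python
# Ratings taken from the example; everything else here is a guess.
RATINGS = {"man": 0.4, "killed": 0.8, "car crash": 0.7}
STOP_WORDS = {"the", "was", "in", "a"}  # the "bad" words (assumed list)


def strip_stop_words(sentence):
    """Drop the words that cannot be used to categorise the sentence."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


def rate_sentence(sentence, ratings=RATINGS, max_phrase_len=2):
    """Average the ratings of the known words/phrases left in the sentence."""
    words = strip_stop_words(sentence)
    scores = []
    i = 0
    while i < len(words):
        # Prefer the longest known phrase starting at this word.
        for n in range(max_phrase_len, 0, -1):
            if i + n > len(words):
                continue
            phrase = " ".join(words[i:i + n])
            if phrase in ratings:
                scores.append(ratings[phrase])
                i += n
                break
        else:
            i += 1  # unknown word: skip it
    return sum(scores) / len(scores) if scores else 0.0
```

With these numbers the example sentence averages out to about 0.63. Sentences that score above some cut-off of your choosing would go into the bank.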
You then decompose sentences into objects, subjects and actions, so in that example:
man - subject
killed - action
car crash - object (or we can treat it as one.. :))
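The decomposition could be sketched like this. The ROLES table is a hypothetical hand-built lookup that only covers the example; working out subject, action and object for arbitrary text would need a real parser or tagger, which this does not attempt.

```python
# Hypothetical role table for the example phrases only.
ROLES = {"man": "subject", "killed": "action", "car crash": "object"}


def decompose(phrases, roles=ROLES):
    """Sort already-extracted phrases into subject / action / object slots."""
    parts = {"subject": None, "action": None, "object": None}
    for phrase in phrases:
        role = roles.get(phrase)
        # Keep the first phrase found for each role; ignore unknown phrases.
        if role in parts and parts[role] is None:
            parts[role] = phrase
    return parts
```

Calling decompose(["man", "killed", "car crash"]) fills all three slots, with "car crash" treated as a single object.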
You now have a list of subjects, objects and actions from which you can generate meaningful sentences...
so now it would be something like:
(a man was killed in a car crash) at (the Newtown rodeo) when (listening to the radio).
The expressions in ()'s are individual sentences, with extra words added according to what the generated sentences are (rule-based system, anyone?)...
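A tiny rule-based sketch of that last step. Both functions are hypothetical: one hard-coded passive rule puts the stripped words back, and the "at" / "when" connectives are copied from the example rather than chosen by any real grammar.

```python
def build_clause(parts):
    """Re-insert the function words that were stripped (single passive rule)."""
    return f"a {parts['subject']} was {parts['action']} in a {parts['object']}"


def join_clauses(main, place=None, circumstance=None):
    """Glue bank entries together with connectives picked by simple rules."""
    sentence = main
    if place:
        sentence += f" at {place}"
    if circumstance:
        sentence += f" when {circumstance}"
    # Capitalise the first letter and close the sentence.
    return sentence[0].upper() + sentence[1:] + "."
```

A real system would need many more rules than this (tense, articles, which connective fits which kind of clause), but it shows where the extra words come from.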
One is grateful to be of service,
DENC
edit-Changed spelling of Bayesian - Thanks timkin!
[Edited by - Nice Coder on July 30, 2004 11:03:04 PM]
Quote:Original post by Nice Coder
(you can get these like a bayonisian spam filter, you teach it, then according to the output refine the values).
For the record, it's 'Bayesian', in reference to the field of probabilistic inference derived from the work of the Rev. Thomas Bayes.
Cheers,
Timkin