"AI" to Summarize a Document

11 comments, last by Timkin 19 years, 9 months ago
You could also turn it into the never-ending project of a self-learning AI: when it reads a page, it categorizes the page using sentences that are similar to ones on other sites it has already read. You can still say it has no understanding, but it can at least say the page is related to expressions such as... blah blah blah.
Maybe have a list of words/phrases (call it l1) with how good each one is (l2). (You can get these like a Bayesian spam filter: you teach it, then refine the values according to the output.)
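A minimal sketch of how the l1/l2 table could be taught, assuming sentences are hand-labelled as interesting or not. The class name, the method names and the add-one smoothing are my own assumptions, not something from the post, and this is only a rough cousin of what a real Bayesian spam filter does:

```python
from collections import Counter


class WordRatings:
    """Hypothetical l1/l2 table: words (l1) mapped to a rating in 0..1 (l2)."""

    def __init__(self):
        self.good = Counter()  # times a word appeared in an "interesting" sentence
        self.bad = Counter()   # times a word appeared in a "boring" sentence

    def teach(self, sentence, interesting):
        """Record one labelled sentence; call again later to refine the values."""
        target = self.good if interesting else self.bad
        for word in sentence.lower().split():
            target[word] += 1

    def rating(self, word):
        """Smoothed fraction of 'interesting' sightings; unseen words get 0.5."""
        g = self.good[word]
        b = self.bad[word]
        return (g + 1) / (g + b + 2)


ratings = WordRatings()
ratings.teach("man killed in crash", interesting=True)
ratings.teach("man reads paper", interesting=False)
```

The "refine according to the output" part would then just be more calls to teach() whenever the summary picks a sentence you disagree with.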

You then assign a rating per sentence of the document. So for something like:
"the man was killed in a car crash", once you strip off all the "bad" words (the ones that cannot be used to categorise it) you get
"man killed car crash"

Then say "man" has an interesting rating of 0.4,
"killed" has an interesting rating of 0.8
and "car crash" has an interesting rating of 0.7, then

All in all this sentence is 179.2% readable... which means that it would be put in a bank of sentences...
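The post does not say how the three ratings combine into 179.2%. One formula that reproduces the number is to divide each rating by a neutral 0.5 and multiply the results (0.8 x 1.6 x 1.4 = 1.792), so the sketch below assumes that. The stop-word list, the function names and the two-word phrase matching are hypothetical too:

```python
STOP_WORDS = {"the", "was", "in", "a"}  # assumed "bad" words
RATINGS = {"man": 0.4, "killed": 0.8, "car crash": 0.7}  # values from the post
NEUTRAL = 0.5  # assumed rating of a word that says nothing either way


def strip_stop_words(sentence):
    """Drop the words that cannot be used to categorise the sentence."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


def find_terms(words, ratings):
    """Pick out rated terms, preferring two-word phrases such as 'car crash'."""
    terms = []
    i = 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if i + 1 < len(words) and pair in ratings:
            terms.append(pair)
            i += 2
        else:
            if words[i] in ratings:
                terms.append(words[i])
            i += 1
    return terms


def sentence_score(sentence, ratings=RATINGS):
    """Multiply each term's rating relative to neutral; unrated words are skipped."""
    score = 1.0
    for term in find_terms(strip_stop_words(sentence), ratings):
        score *= ratings[term] / NEUTRAL
    return score


score = sentence_score("the man was killed in a car crash")
print(f"{score:.1%}")
```

Anything that scores above 100% would then go into the bank of sentences; where exactly to put that cut-off is another thing you would have to tune.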

You then decompose sentences into objects, subjects and actions, so in that example:

man - subject
killed - action
car crash - object (or we can treat it as one.. :))
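As a rough sketch of that decomposition: the post does not say how a word's role is decided, so this assumes a hand-made lookup table (ROLE_LEXICON, hypothetical) rather than a real parser, which is what you would want for anything beyond toy input:

```python
from dataclasses import dataclass, field

# Assumed role table; in practice this would come from a parser or tagger.
ROLE_LEXICON = {"man": "subject", "killed": "action", "car crash": "object"}


@dataclass
class Parts:
    subjects: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    objects: list = field(default_factory=list)


def decompose(terms, lexicon=ROLE_LEXICON):
    """Sort already-extracted terms into subjects, actions and objects."""
    parts = Parts()
    for term in terms:
        role = lexicon.get(term)
        if role == "subject":
            parts.subjects.append(term)
        elif role == "action":
            parts.actions.append(term)
        elif role == "object":
            parts.objects.append(term)
        # terms with no known role are ignored in this sketch
    return parts


parts = decompose(["man", "killed", "car crash"])
```

Here "car crash" is kept as a single term, which is the "treat it as one" option.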

You now have a list of subjects, objects and actions from which you can generate meaningful sentences...

so now it would be something like:
(a man was killed in a car crash) at (the Newtown rodeo) when (listening to the radio).

The expressions in ()'s are individual sentences, with extra words added according to what the generated sentences are (rule-based system, anyone?)...
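A very small sketch of what such a rule-based step might look like, assuming just two rules: one fixed template that turns a subject/action/object triple into a clause, and one that glues clauses together with "at" and "when". Both function names and both templates are made up for illustration; a real rule set would need far more cases:

```python
def main_clause(subject, action, obj):
    """Assumed template: 'a <subject> was <action> in a <object>'."""
    return f"a {subject} was {action} in a {obj}"


def join_clauses(main, place=None, circumstance=None):
    """Glue clauses together, adding 'at' and 'when' as the connecting words."""
    pieces = [main]
    if place:
        pieces.append("at " + place)
    if circumstance:
        pieces.append("when " + circumstance)
    return " ".join(pieces) + "."


clause = main_clause("man", "killed", "car crash")
print(join_clauses(clause, circumstance="listening to the radio"))
```

The connecting words are chosen purely by which slot a clause is passed in, which is about the simplest rule you could have.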

One is grateful to be of service,
DENC

edit-Changed spelling of Bayesian - Thanks timkin!

[Edited by - Nice Coder on July 30, 2004 11:03:04 PM]
Quote:Original post by Nice Coder
(you can get these like a bayonisian spam filter, you teach it, then according to the output refine the values).


For the record, it's 'Bayesian', in reference to the field of probabilistic inference derived from the work of the Rev. Thomas Bayes.

Cheers,

Timkin
