
Simple mechanisms for low-budget natural language generation


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

2 replies to this topic

#1 ApochPiQ   Moderators   -  Reputation: 16079


Posted 27 January 2014 - 01:32 PM

I'm hacking around on an IRC bot in my spare time, mostly as an interesting exercise in JavaScript. It has some basic functionality but it just lacks that special something... so I want to teach it to talk.

 

Before we get too far into this, I should say that I'm fully aware that NLG is a massive field of research, and I'm not trying to pass any Turing tests here. I don't care if the generated "speech" even makes sense half the time; it's more for amusement than anything else.

 

My first inclination was to build a Markov model and use simple chains to construct sentences. Unfortunately, the space complexity of this is rather nasty, and the real killer is the amount of data needed to train the model adequately. I don't have a readily available corpus of plaintext to feed into the thing that suits the mood and personality I want to create.
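
For reference, the Markov chain idea can be sketched in a few lines. This is only a minimal first-order (word-pair) version, not anything the bot actually uses; `buildModel`, `generate`, and the `START`/`END` sentinels are hypothetical names chosen for illustration, and the `rng` parameter exists only so the output can be made repeatable.

```javascript
// Hypothetical sentinel tokens marking sentence boundaries.
const START = "<s>";
const END = "</s>";

// Build a first-order Markov model: each word maps to the list of words
// observed directly after it. Duplicates are kept on purpose, so picking
// uniformly from the list is already weighted by frequency.
function buildModel(sentences) {
  const model = new Map();
  for (const sentence of sentences) {
    const words = sentence.trim().split(/\s+/).filter(Boolean);
    if (words.length === 0) continue; // skip blank lines
    const tokens = [START, ...words, END];
    for (let i = 0; i < tokens.length - 1; i++) {
      if (!model.has(tokens[i])) model.set(tokens[i], []);
      model.get(tokens[i]).push(tokens[i + 1]);
    }
  }
  return model;
}

// Walk the chain from START until END is drawn or maxWords is reached.
// rng must return a number in [0, 1); it defaults to Math.random.
function generate(model, maxWords = 20, rng = Math.random) {
  const out = [];
  let current = START;
  while (out.length < maxWords) {
    const successors = model.get(current);
    if (!successors) break; // empty model or dead end
    const next = successors[Math.floor(rng() * successors.length)];
    if (next === END) break;
    out.push(next);
    current = next;
  }
  return out.join(" ");
}
```

Something like `generate(buildModel(lines))` would then produce one sentence per call. Keeping duplicates in the successor lists is the simplest representation, but it means memory grows with the size of the corpus, which is the space concern mentioned above; storing counts per successor would be more compact.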

 

The next obvious route would be to construct a Petri net for the language I want to speak. The major advantage is that this is a compact and fairly efficient way to do poor-man's NLG; the disadvantage is that hand-authoring and tuning a Petri net for nontrivial languages can be a huge time sink.
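
As a rough illustration of the Petri net route, here is a minimal sketch in which firing a transition emits a word. It assumes every arc has weight 1 and that a place appears at most once in a transition's inputs; `runNet` and the `inputs`/`outputs`/`emit` fields are hypothetical names, not taken from any library.

```javascript
// A marking is a plain object mapping place names to token counts.
// Each transition is { inputs: [places], outputs: [places], emit: "word" }.
// Assumption: arc weight is always 1 and inputs contain no repeated place.
function runNet(initialMarking, transitions, rng = Math.random, maxSteps = 50) {
  const marking = { ...initialMarking }; // copy so the caller's marking is untouched
  const words = [];
  for (let step = 0; step < maxSteps; step++) {
    // A transition is enabled when every input place holds at least one token.
    const enabled = transitions.filter((t) =>
      t.inputs.every((p) => (marking[p] || 0) > 0)
    );
    if (enabled.length === 0) break; // nothing can fire: the sentence is done
    const t = enabled[Math.floor(rng() * enabled.length)];
    for (const p of t.inputs) marking[p] -= 1; // consume tokens
    for (const p of t.outputs) marking[p] = (marking[p] || 0) + 1; // produce tokens
    if (t.emit) words.push(t.emit);
  }
  return words.join(" ");
}

// Tiny hand-authored net: subject, then verb, then object.
const transitions = [
  { inputs: ["start"], outputs: ["needVerb"], emit: "I" },
  { inputs: ["start"], outputs: ["needVerb"], emit: "you" },
  { inputs: ["needVerb"], outputs: ["needObject"], emit: "like" },
  { inputs: ["needObject"], outputs: [], emit: "cheese" },
];
```

Calling `runNet({ start: 1 }, transitions)` yields either "I like cheese" or "you like cheese". A net this small is really just a state machine; the authoring cost mentioned above shows up once places start holding several tokens and transitions have multiple inputs.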

 

 

So I figured I'd poke around here and see if anyone knows of good algorithms for simple NLG that I might be able to take advantage of. I don't mind having to use a huge data set as long as the data is easily constructed and/or readily available in an easily digested format. Runtime is important since this is supposed to be a realtime conversational bot.

 

Non-goals: contextual recognition, memory, progressive refinement/learning, etc. It doesn't even have to do more than dumb keyword recognition for all I care.
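
For what it's worth, the "dumb keyword recognition" baseline could be as small as the sketch below. The `responders` table, its replies, and the `respond` function are all made up for illustration; the first keyword in the table that appears in the message wins.

```javascript
// Hypothetical keyword table; order matters because the first match wins.
const responders = [
  { keyword: "hello", reply: "Oh. It's you." },
  { keyword: "help", reply: "Have you tried turning it off and on again?" },
];

// Return the reply for the first keyword found in the message, or null
// when nothing matches. Matching is case-insensitive and on whole words.
function respond(message, table) {
  const words = new Set(
    message.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean)
  );
  for (const { keyword, reply } of table) {
    if (words.has(keyword)) return reply;
  }
  return null;
}
```

A `null` result would be the place to fall back to whatever generator ends up producing the free-form chatter.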

 

 

Cheers!




#2 alexjc   Members   -  Reputation: 450


Posted 27 January 2014 - 01:46 PM

I'd second your Markov model idea, and somehow try to work around the training problem.

 

If you build a simple semantic model, using WordNet for example, you could significantly reduce the amount of training data required. You'd end up learning at a high level, like <pronoun> <verb> <noun>, or possibly in more detail, like <pronoun> <eat> <vegetable>. I'm not sure how good the NLP/NLG libraries for JavaScript are, but there are some awesome ones in Python that could help with this.
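
To make the class-level idea concrete, here is a minimal sketch that fills a pattern such as <pronoun> <verb> <noun> from a word list per class. Note that it swaps in a hand-written pattern and a hand-written lexicon instead of learning the patterns or querying WordNet; `lexicon` and `fillTemplate` are hypothetical names.

```javascript
// Hypothetical lexicon: word class -> candidate words. In the approach
// described above these classes would come from a resource like WordNet.
const lexicon = new Map([
  ["pronoun", ["I", "you", "we"]],
  ["verb", ["eat", "like", "fear"]],
  ["noun", ["carrots", "robots", "mondays"]],
]);

// Replace every <class> slot with a word drawn from that class.
// Slots naming an unknown or empty class are left as they are.
// rng must return a number in [0, 1); it defaults to Math.random.
function fillTemplate(template, classes, rng = Math.random) {
  return template.replace(/<(\w+)>/g, (slot, cls) => {
    const options = classes.get(cls);
    if (!options || options.length === 0) return slot;
    return options[Math.floor(rng() * options.length)];
  });
}
```

For example, `fillTemplate("<pronoun> <verb> <noun>", lexicon)` returns a random three-word sentence. A Markov model trained over class names rather than raw words could generate the patterns themselves, which is where the reduction in training data would come from.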

 

Anyway, cool project ;-)




#3 ikarth   Members   -  Reputation: 442


Posted 27 January 2014 - 08:59 PM

You might want to look at what was done for the NaNoGenMo project (it's on GitHub). It might give you a few ideas about some of the different approaches.








