Creating a smart text generator

I'm doing this for fun (or as 4chan says, "for teh lolz"), and if I learn something along the way, all the better. I took an AI course almost 2 years ago and really enjoyed it, but I've managed to forget everything, so this is a way to refresh that.

Anyway, I want to be able to generate text given a set of inputs. Basically this will read forum posts (or maybe tweets) and then generate a comment based on what it has learned.

Now the simplest way would be to use a Markov chain text generator, but I want something a little more complex than that, as a Markov chain basically only learns word order (which word is most likely to appear after word x, given the input text). I'm trying to see if there's something I can do to make it a bit smarter.
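For reference, here's roughly what I mean by the plain Markov chain approach. This is only a minimal sketch, assuming a first-order (one word of context) chain and whitespace tokenization; the names `build_chain` and `generate` are just ones I made up for illustration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        # Repeated followers stay in the list, so frequent pairs
        # are proportionally more likely to be picked later.
        chain[current].append(following)
    return chain

def generate(chain, start, max_words=20, rng=random):
    """Walk the chain from `start`, picking each next word at random."""
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word was never followed by anything
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

if __name__ == "__main__":
    chain = build_chain("the cat sat on the mat and the cat ran")
    print(generate(chain, "the", max_words=8))
```

All it knows is which word followed which, so there's no notion of whether a sentence was any good.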

For example I want it to do something like this:

Learn from a large selection of posts on a message board, but don't weight them too heavily
For each post:
- Learn from the other comments on that post and weight these inputs higher
- Generate a comment and post it
- See how other users reacted to the comment. If the reaction was good, weight it positively so that more comments similar to it get made, and vice versa if it was negative.
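To make the steps above concrete, here's a rough sketch of one way they could be bolted onto a Markov chain. This is not a known-good algorithm, just my guess at a starting point: each word-to-word transition carries a weight instead of a count, and feedback nudges the weights of the transitions used in a posted comment. The class name `WeightedChain`, the `rate` and `floor` parameters, and the specific weights in the usage part are all assumptions for illustration.

```python
import random
from collections import defaultdict

class WeightedChain:
    def __init__(self):
        # weights[a][b] = how strongly word b is favored after word a
        self.weights = defaultdict(lambda: defaultdict(float))

    def learn(self, text, weight=1.0):
        """Add every adjacent word pair in `text` with the given weight."""
        words = text.split()
        for a, b in zip(words, words[1:]):
            self.weights[a][b] += weight

    def generate(self, start, max_words=20, rng=random):
        """Return a list of words sampled in proportion to the weights."""
        word = start
        out = [word]
        for _ in range(max_words - 1):
            followers = self.weights.get(word)
            if not followers:
                break
            candidates = [w for w, wt in followers.items() if wt > 0]
            if not candidates:
                break
            picked = rng.choices(
                candidates, weights=[followers[w] for w in candidates]
            )
            word = picked[0]
            out.append(word)
        return out

    def feedback(self, words, reward, rate=0.5, floor=0.01):
        """Shift the weight of each transition used in `words` by rate * reward.

        The floor keeps every seen transition possible, so negative
        feedback makes a pair unlikely rather than forbidden.
        """
        for a, b in zip(words, words[1:]):
            self.weights[a][b] = max(floor, self.weights[a][b] + rate * reward)

if __name__ == "__main__":
    chain = WeightedChain()
    chain.learn("the board likes long rants about code", weight=0.1)  # whole board, low weight
    chain.learn("the thread likes short jokes about code", weight=1.0)  # this post, higher weight
    comment = chain.generate("the", max_words=6)
    print(" ".join(comment))
    chain.feedback(comment, reward=1)  # e.g. +1 if upvoted, -1 if downvoted
```

This still only reweights exact word pairs, so it rewards the same wording rather than the same structure or meaning.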

It's the weighting and learning-from-mistakes part that I'm not sure how to implement. I thought about artificial neural networks (mainly because I remember enjoying that chapter), but as far as I can tell they're mainly used to classify things (i.e. given a finite set of choices [x1...xn], which x is this input?), not really to generate anything.

I'm not even sure if this is possible, or, if it is, what I should go about learning or figuring out. What algorithm is best suited for this?

To those worried that I will use this as a bot to spam or provide bad answers on SO: I promise that I will not use it to provide (bad) advice or to spam for profit. I definitely will not post its nonsensical thoughts on SO. I plan to use it for my own amusement.

Thanks!

This is a good idea, and it will hopefully produce more grammatically correct sentences that have a better chance of working. But I was looking to train the algorithm so that, based on the training data, it is more likely to produce sentences that make sense. One idea was that the Markov chain produces a sentence, I decide whether it's positive or negative, and based on that it re-weights the training data. The issue is that it will then tend toward the exact same sentences most of the time. I don't want the exact same sentences, just the same structure or meaning.

royrules22 2010-05-28 17:38:46
