G'day,

I'm creating an FAQ system, and users need to be able to see whether a similar question has already been asked. Does anyone know of any scripts (PHP or JavaScript preferably, or possibly ActionScript) that use some kind of AI to do this? I've noticed that on Stack Overflow, as a question is typed, related questions are listed underneath.

Any advice would be appreciated.

Thank you.

+2  A: 

This question isn't easily answered without knowing what your database looks like (assuming you have one) or how your site operates.

You could base similarity on many things:

  1. Share a common category
  2. Share common tags
  3. Share common keywords within their body
    These keywords are usually what remains after common words ('and', 'is', 'the', 'it', etc.) are stripped from the string, leaving uncommon words ('C#', 'database', 'questions') to perform lookups with.
  4. Users have explicitly declared them similar
  5. etc...

These are all the types of items you should consider when determining similarity. I hope this helps! Return with more specific questions in the future to receive more specific answers.
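
As a rough illustration of the keyword approach in item 3, here is a minimal Python sketch. The stop-word list and the function names `keywords` and `shared_keywords` are made up for this example; a real FAQ system would need a much longer stop-word list and probably some stemming.

```python
import re

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "do", "how", "i", "in", "is", "it",
              "my", "of", "the", "to", "what"}

def keywords(text):
    """Return the set of lowercase words in text that are not stop words."""
    # Keep '#' and '+' inside tokens so terms such as 'c#' and 'c++' survive.
    tokens = re.findall(r"[a-z0-9#+]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}

def shared_keywords(question_a, question_b):
    """Return the keywords that two questions have in common."""
    return keywords(question_a) & keywords(question_b)

if __name__ == "__main__":
    a = "How do I query a database in C#?"
    b = "What is the fastest database query in C#?"
    print(sorted(shared_keywords(a, b)))  # ['c#', 'database', 'query']
```

The more keywords two questions share, the more likely they are related, so the size of the returned set can serve as a crude similarity score.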

Jonathan Sampson
Thanks Jonathan. It's a very open question, I know, but we don't have anything designed at this stage. I imagine our database will have the following entities: User (user_id, fname, lname, access_level), Question (question_id, user_id, question, answer), and Word (question_id, word).
Angus
+1  A: 

I think the best you can hope for is a simple search engine: split the question into words and record the words against the question in an RDBMS, e.g.

Table questions (id, text, ....)

Table words (question_id, word)

Then to get questions similar to a new question with id $x:

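-- Rank earlier questions by how many words they share with question $x.
SELECT prev.id, prev.text, COUNT(*) AS common_words
FROM words curr_words
JOIN words prev_words ON prev_words.word = curr_words.word
JOIN questions prev ON prev.id = prev_words.question_id
WHERE curr_words.question_id = $x
AND prev.id <> $x  -- do not return the question itself
GROUP BY prev.id, prev.text
ORDER BY common_words DESC
LIMIT.....?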
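
To try the idea end to end, the following is a sketch in Python using an in-memory SQLite database. The helper names (`make_db`, `add_question`, `similar_questions`), the default limit of 5, and the choice to store each distinct word once per question are assumptions made for this example, not part of the answer above.

```python
import re
import sqlite3

def make_db():
    """Create an in-memory database with the two tables described above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT)")
    conn.execute("CREATE TABLE words (question_id INTEGER, word TEXT)")
    return conn

def split_words(text):
    """Return the distinct lowercase words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def add_question(conn, text):
    """Insert a question, index its words, and return the new question id."""
    cur = conn.execute("INSERT INTO questions (text) VALUES (?)", (text,))
    qid = cur.lastrowid
    conn.executemany(
        "INSERT INTO words (question_id, word) VALUES (?, ?)",
        [(qid, w) for w in split_words(text)],
    )
    return qid

def similar_questions(conn, qid, limit=5):
    """Return (id, text, common_words) rows, most shared words first.

    The question identified by qid is excluded from its own results.
    """
    return conn.execute(
        """
        SELECT prev.id, prev.text, COUNT(*) AS common_words
        FROM words AS curr_words
        JOIN words AS prev_words ON prev_words.word = curr_words.word
        JOIN questions AS prev ON prev.id = prev_words.question_id
        WHERE curr_words.question_id = ?
          AND prev.id <> ?
        GROUP BY prev.id, prev.text
        ORDER BY common_words DESC, prev.id
        LIMIT ?
        """,
        (qid, qid, limit),
    ).fetchall()

if __name__ == "__main__":
    conn = make_db()
    add_question(conn, "How do I reset my password")
    add_question(conn, "How do I change my email address")
    new_id = add_question(conn, "I forgot my password, how do I reset it")
    for row in similar_questions(conn, new_id):
        print(row)
```

In this small example the unrelated email question still scores 4 shared words ("how", "do", "i", "my"), which shows why stripping common words before indexing is worth doing.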

You could apply more elaborate comparison methods to the shortlist this returns, but this should be the first step.
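
One possible "more elaborate" comparison, offered only as an example since the answer does not name a method, is to re-rank the shortlist by Jaccard similarity (shared words divided by total distinct words), which penalises long questions that match merely because they contain many words. The function names here are hypothetical.

```python
import re

def word_set(text):
    """Return the distinct lowercase words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rerank(new_question, shortlist):
    """Return (score, text) pairs for the shortlist, most similar first."""
    target = word_set(new_question)
    scored = [(jaccard(target, word_set(text)), text) for text in shortlist]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

if __name__ == "__main__":
    shortlist = ["change email", "reset my password"]
    for score, text in rerank("reset password", shortlist):
        print(round(score, 2), text)
```

Here `shortlist` would be the question texts returned by the SQL query, so the more expensive comparison only runs on a handful of candidates.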

C.

symcbean
Thanks C. This looks like what I was looking for. I'm also looking at AIML as an option.
Angus