I've been assigned to build a questions & answers widget for my company. Many different departments in the company will want to use this widget on various websites and they will want to filter questions based on specific and unique criteria.

The battle here is scalability vs efficiency.

Should I:

A) Make a unique mapping table in the database for each criterion? For example:

table questions (q_id,q_question,q_details,q_poster...)
table questions_criteria_a (q_id,criteria_a.id)
table questions_criteria_b (q_id,criteria_b.id)
table questions_criteria_c (q_id,criteria_c.id)

That would allow me to SELECT * FROM questions q LEFT JOIN questions_criteria_a qca ON q.q_id = qca.q_id LEFT JOIN .... etc...

My problem with that is that I have no idea what the required criteria for each department actually are or may be in the future, which means I will need to add a new table every time a new criterion comes up. Examples of criteria could be state, city, subject, vendor key, etc. A department might want to display all questions that relate to a given vendor and are relevant to San Jose, CA, for example. Any, all, or none of the criteria would be required per query - it would be up to the departments to code their own data fetching logic.

B) Have each department provide criteria flagging logic, which will be stored in the questions table as a JSON string or serialized data. For example:

table questions (q_id,question,q_details,q_poster,q_criteria...)
-- the criteria would look like {'state':'CA','city':'San Jose','vendor_key':'13144'}

So, the obvious advantage of (B) is that the data storage logic is scalable and consistent. The obvious advantage of (A) is that the queries will be much faster than doing a SELECT * FROM questions WHERE q_criteria LIKE "%'city':'San Jose'%" AND q_criteria LIKE "%'state':'CA'%" etc., since a LIKE with a leading wildcard has to scan every row.

Ideas? Thoughts? Feedback?

If you have a better solution that was not presented above, I'd love to consider it.

Thanks! --S--

+1  A: 

I'd define three tables -

QUESTIONS

  • question_id (pk, auto_increment)

CRITERIA

  • criteria_id (pk, auto_increment)
  • criteria_description

QUESTIONS_CRITERIA

  • question_id (pk, foreign key to QUESTIONS.question_id)
  • criteria_id (pk, foreign key to CRITERIA.criteria_id)

Fewer JOINs, and it can hold any number of criteria associations without adding tables. Criteria could be subcategorized if necessary.
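
A minimal sketch of that three-table layout, run through Python's built-in sqlite3 module only so that it is self-contained (the question does not name a database engine, so the SQLite syntax is an assumption). The `question` column, the sample rows, and the `'state:CA'` way of writing a criterion into `criteria_description` are illustrative choices, not part of the design above.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
CREATE TABLE questions (
    question_id INTEGER PRIMARY KEY AUTOINCREMENT,
    question    TEXT NOT NULL              -- assumed column, for illustration
);
CREATE TABLE criteria (
    criteria_id          INTEGER PRIMARY KEY AUTOINCREMENT,
    criteria_description TEXT NOT NULL UNIQUE
);
CREATE TABLE questions_criteria (
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    criteria_id INTEGER NOT NULL REFERENCES criteria(criteria_id),
    PRIMARY KEY (question_id, criteria_id)  -- composite key: one row per pair
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO questions (question) VALUES (?)",
             ("Where do I renew a permit?",))
conn.executemany("INSERT INTO criteria (criteria_description) VALUES (?)",
                 [("state:CA",), ("city:San Jose",)])
conn.executemany("INSERT INTO questions_criteria VALUES (?, ?)",
                 [(1, 1), (1, 2)])

# Linking the same question to the same criterion twice violates the
# composite primary key and is rejected.
try:
    conn.execute("INSERT INTO questions_criteria VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

A new kind of criterion (a vendor key, say) then becomes a new row in `criteria` rather than a new table.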

OMG Ponies
Very cool idea with the composite primary key. Thank you.
Sean Cannon
@Sean Cannon: Yes, ensures no duplicates :)
OMG Ponies
+1  A: 

I think you are probably looking for some many-to-many relationships so that you can abstract the types of data from the criteria they belong to, without creating all the extra tables.

Something like a 'criteria_to_question' table, where you can map multiple criteria to a question, should keep your db clean and scalable.
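
As a rough sketch of how a department could then filter on "any, all, or none" of the criteria, the query below returns the questions tagged with every requested criterion by grouping the mapping rows and counting matches. It uses Python's built-in sqlite3 module so it runs standalone; the helper name `questions_matching_all`, the column names, the `'state:CA'` labels, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questions (question_id INTEGER PRIMARY KEY, question TEXT NOT NULL);
CREATE TABLE criteria (
    criteria_id          INTEGER PRIMARY KEY,
    criteria_description TEXT NOT NULL UNIQUE
);
CREATE TABLE criteria_to_question (
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    criteria_id INTEGER NOT NULL REFERENCES criteria(criteria_id),
    PRIMARY KEY (question_id, criteria_id)
);
-- Hypothetical sample data
INSERT INTO questions VALUES (1, 'q1'), (2, 'q2'), (3, 'q3');
INSERT INTO criteria VALUES
    (1, 'state:CA'), (2, 'city:San Jose'), (3, 'vendor_key:13144');
INSERT INTO criteria_to_question VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")


def questions_matching_all(conn, descriptions):
    """Return ids of questions tagged with every criterion in `descriptions`."""
    wanted = list(dict.fromkeys(descriptions))  # drop repeats, keep order
    if not wanted:
        # No criteria requested: every question qualifies.
        rows = conn.execute("SELECT question_id FROM questions ORDER BY question_id")
        return [row[0] for row in rows]

    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT ctq.question_id
        FROM criteria_to_question AS ctq
        JOIN criteria AS c ON c.criteria_id = ctq.criteria_id
        WHERE c.criteria_description IN ({placeholders})
        GROUP BY ctq.question_id
        HAVING COUNT(DISTINCT c.criteria_id) = ?   -- must match all of them
        ORDER BY ctq.question_id
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


both = questions_matching_all(conn, ["state:CA", "city:San Jose"])
state_only = questions_matching_all(conn, ["state:CA"])
```

Only two JOINed tables are involved no matter how many criteria are requested, which is the main difference from adding one LEFT JOIN per criteria table.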

Steve-O-Rama