ansaurus

Question

Schema design for when users can define fields

Answer 1

+1 A:

I would suggest you always take a normalized approach to your database schema and then later decided if you need to create a solution for performance reasons. Premature optimization can be dangerous. Premature database de-normalization can be disastrous!

I would suggest that you stick with the original schema and later, if necessary, create a reporting table that is a de-normalized version of your normalized schema.

Andrew Hare 2009-01-09 19:59:08

Answer 2

+1 A:

One change that may or may not help simplify things would be to not link the ResponseAnswers back to the SurveyID. Rather, create an ID per response and per question and let your ResponseAnswers table contain the fields ResponseID, QuestionID, Answer. Although this would require keeping unique Identifiers for each unit it would help keep things a little bit more normalized. The response answers do no need to associate with the survey they were answering just the specific question they are answering and the response information that they are associated.

Bill 2009-01-09 20:08:56

Yes, that makes sense.

Eli 2009-01-09 20:16:19

Answer 3

A:

I created a customer surveys system at my previous job and came up with a schema very similar to what you have. It was used to send out surveys (on paper) and tabulate the responses.

A couple of minor differences:

Surveys were NOT anonymous, and this was made very clear in the printed forms. It also meant that the demographic data in your example was known in advance.
There was a pool of questions which were attached to the surveys, so one question could be used on multiple surveys and analyzed independently of the survey it appeared on.
Handling different types of questions got interesting -- we had a 1-3 scale (e.g., Worse/Same/Better), 1-5 scale (Very Bad, Bad, OK, Good, Very Good), Yes/No, and Comments.

There was special code to handle the comments, but the other question types were handled generically by having a table of question types and another table of valid answers for each type.

To make querying easier you could probably create a function to return the response based on a survey ID and question ID.

Dave 2009-01-09 20:11:16

Good points. Yes, I left out dealing with different types of questions to focus my post, but I will have to tackle that as well plus support for user-defined validation rules.

Eli 2009-01-09 20:14:51

Answer 4

+1 A:

Your first schema is the better choice of the two. At this point, you shouldn't worry about performance problems. Worry about making a good, flexible, extensible design. There are all sorts of tricks you can do later to cache data and make queries faster. Using a less flexible database schema in order to solve a performance problem that may not even materialize is a bad decision.

Besides, many (perhaps most) survey results are only viewed periodically and by a small number of people (event organizers, administrators, etc.), so you won't constantly be querying the database for all of the results. And even if you were, the performance will be fine. You would probably paginate the results somehow anyway.

The first schema is much more flexible. You can, by default, include questions like name and address, but for anonymous surveys, you could simply not create them. If the survey creator wants to only view everyone's answers to three questions out of five hundred, that's a really simple SQL query. You could set up a cascading delete to automatically deleting responses and questions when a survey is deleted. Generating statistics will be much easier with this schema too.

Here is a slightly modified version of the schema you provided. I assume you can figure out what data types go where :-)

    surveys
      survey_id (index)
      title

    questions
      question_id (index, auto increment)
      survey_id (link to surveys->survey_id)
      question

    responses
      response_id (index, auto increment)
      survey_id (link to surveys->survey_id)
      submit_time

    answers
      answer_id (index, auto increment)
      question_id (link to questions-question_id)
      answer

William Brendel 2009-01-09 20:23:39

ansaurus

tags:

views:

answers:

Schema design for when users can define fields

related questions