This question is related to this post: http://stackoverflow.com/questions/1764469/sql-design-for-survey-with-answers-of-different-data-types

I have a survey app where most questions have a set of answers that are 1-5. Now we have to do questions that could have a variety of different answer types -- numeric, date, string, etc. Thanks to suggestions from stack, I went with a string column to store answers. Some questions are multiple choice, so along with the table 'questions', I have a table 'answers' which has the set of possible answers for a question.

Now: how should I store answers for a question that is "pick all that apply"? Should I make a child table that is "chosen_answers" or something like that? Or should the answers table have a 'chosen' column that indicates that a respondent chose that answer?

+3  A: 

a possible solution is a UsersAnswers table with 4 columns: primary key, user's id, question's id, and answer's id

with multiple entries for any questions where more than one answer can be selected
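
To make the "multiple entries" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for whatever database the survey app runs on. The table name `users_answers`, the sample ids, and the `UNIQUE` constraint (to stop the same choice being recorded twice) are assumptions for illustration, not part of the original suggestion.

```python
import sqlite3

# Hypothetical schema; an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users_answers (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        question_id INTEGER NOT NULL,
        answer_id   INTEGER NOT NULL,
        UNIQUE (user_id, question_id, answer_id)  -- assumed: no duplicate picks
    )
    """
)

# User 1 ticks answers 11 and 13 on pick-all question 7: one row per choice.
conn.executemany(
    "INSERT INTO users_answers (user_id, question_id, answer_id) VALUES (?, ?, ?)",
    [(1, 7, 11), (1, 7, 13), (2, 7, 11)],
)

# Everything user 1 chose for question 7.
chosen = [
    row[0]
    for row in conn.execute(
        "SELECT answer_id FROM users_answers "
        "WHERE user_id = ? AND question_id = ? ORDER BY answer_id",
        (1, 7),
    )
]
print(chosen)
```

A single-choice question simply ends up with one row per respondent, so both question types can share the table.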

Jimmy
A: 

I have two suggestions.

  1. Normalize your database, and create a table called question_answer, or something that fits more in line with the nomenclature of your schema. This is how I would lay it out.

    CREATE TABLE question_answer (
        id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        user_id INT NOT NULL,
        question_id INT NOT NULL,
        answer_id INT NOT NULL
    );
    
  2. Create five columns in your answers table, each of which refers to a specific answer. In MySQL I would set these columns up as bit(1) values.

IMHO, unless you see the number of choices changing, I would stick with option 2. It's a faster option and will most likely also save you space.

Dan Loewenherz
A: 

As you're not going to have many options selected I'd be tempted to store the answers as a comma-separated list of values in your string answer column.

If the user is selecting their answers from a group of checkboxes on the web page with the question (assuming it is a web app) then you'll get back a comma-separated list from there too. (Although you won't just be able to compare the lists as strings since the answer "red,blue" is the same as "blue,red".)
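
As a rough illustration of the ordering caveat, one way to compare two stored lists is to split them into sets first. The helper name `same_selection` is made up for this sketch, and it assumes no answer value contains a comma itself.

```python
def same_selection(a: str, b: str) -> bool:
    """Compare two comma-separated answer strings, ignoring order and spacing."""

    def parse(value: str) -> set:
        # Split on commas, trim whitespace, and drop empty fragments.
        return {item.strip() for item in value.split(",") if item.strip()}

    return parse(a) == parse(b)


print(same_selection("red,blue", "blue,red"))
print(same_selection("red", "red,blue"))
```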

Dave Webb
Yes, this is very tempting, :D though Jimmy's looks like the right one.
Dr. Xray
This idea struck me also, but it seems problematic when I try to run aggregating reports -- if I wanted to get the unique values and their respective counts for a set of surveys, that would involve some major string parsing.
If you did want to run a report like "Most common wrong answers for pick-all-that-apply questions" this could be a problem.
Dave Webb
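
For a sense of what that string parsing looks like, here is a small sketch that tallies choices after the comma-separated values have been pulled out of the database. The function name `count_choices` and the sample data are invented for illustration; note that the counting happens in application code, since the database cannot group on values buried inside a string.

```python
from collections import Counter


def count_choices(stored_answers):
    """Tally individual choices across many comma-separated answer strings."""
    counts = Counter()
    for value in stored_answers:
        # Each stored string has to be split apart before it can be counted.
        counts.update(item.strip() for item in value.split(",") if item.strip())
    return counts


counts = count_choices(["red,blue", "blue", "blue,green,red"])
print(counts.most_common())
```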
A: 

You can use a character (sequence) as a "flag" to store the data (e.g. comma-separated values). If you can't use a single character to split the data, like a comma, then you might want to use other characters, or even sequences of obscure ones (e.g. !@#$).

Edit: Mr. Webb beat me by a few seconds. :P

Breakthrough
A: 

Another option (and I've seen cases where questions like this were scored this way as well) is to treat each possible answer as a separate Yes/No question and record the testee's response (chose it or didn't) as a boolean.
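
A minimal sketch of that layout, again using Python's `sqlite3` as a stand-in database. The table name `option_responses`, its columns, and the sample ids are assumptions; the point is only that every option gets a row per respondent, chosen or not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE option_responses (
        user_id   INTEGER NOT NULL,
        option_id INTEGER NOT NULL,
        chosen    INTEGER NOT NULL CHECK (chosen IN (0, 1)),  -- boolean flag
        PRIMARY KEY (user_id, option_id)
    )
    """
)

# A pick-all question with options 11, 12, 13; every option is recorded
# for every respondent, including the ones they left unticked.
conn.executemany(
    "INSERT INTO option_responses (user_id, option_id, chosen) VALUES (?, ?, ?)",
    [(1, 11, 1), (1, 12, 0), (1, 13, 1), (2, 11, 0), (2, 12, 0), (2, 13, 1)],
)

# Per option: how many chose it, out of how many who saw it.
rows = conn.execute(
    "SELECT option_id, SUM(chosen), COUNT(*) FROM option_responses "
    "GROUP BY option_id ORDER BY option_id"
).fetchall()
print(rows)
```

Storing the "no" rows costs space, but it distinguishes "saw the option and skipped it" from "never saw the option".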

Charles Bretana
A: 

these survey questions always have one, universal answer: it depends on what you want to do with the answers when you're done.

for example, if all you want to do is keep a record of each individual answer (and never do any totaling or find all users that answered question x with answer y), then the simplest design is to denormalize the answers into a serialized field.

if you need totals, you can probably also get away with denormalized answers in a serialized field if you calculate the totals in a summary table and update the values when a quiz is submitted.
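
one way the summary-table idea could look, sketched with Python's `sqlite3`. the table names `responses` and `answer_totals`, the `submit` helper, and the sample ids are all hypothetical; the serialized answers are stored as-is and the totals are bumped in the same transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE responses (
        user_id     INTEGER NOT NULL,
        question_id INTEGER NOT NULL,
        answers     TEXT NOT NULL            -- serialized, e.g. '11,13'
    );
    CREATE TABLE answer_totals (
        question_id INTEGER NOT NULL,
        answer_id   INTEGER NOT NULL,
        total       INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (question_id, answer_id)
    );
    """
)


def submit(user_id, question_id, answer_ids):
    """Store the serialized answers and update the running totals together."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT INTO responses (user_id, question_id, answers) VALUES (?, ?, ?)",
            (user_id, question_id, ",".join(str(a) for a in answer_ids)),
        )
        for answer_id in answer_ids:
            # Make sure a totals row exists, then increment it.
            conn.execute(
                "INSERT OR IGNORE INTO answer_totals (question_id, answer_id, total) "
                "VALUES (?, ?, 0)",
                (question_id, answer_id),
            )
            conn.execute(
                "UPDATE answer_totals SET total = total + 1 "
                "WHERE question_id = ? AND answer_id = ?",
                (question_id, answer_id),
            )


submit(1, 7, [11, 13])
submit(2, 7, [11])
totals = dict(
    conn.execute(
        "SELECT answer_id, total FROM answer_totals WHERE question_id = ?", (7,)
    )
)
print(totals)
```

the catch is that the totals only answer the questions you planned for at submit time; anything else means re-parsing the serialized field.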

so for your specific question, you need to decide if it's more useful to your final product to store 5 when you mean "all of the above" or if it's more useful to have each of the four options individually selected.

longneck
We'll want to run reports that sum up the counts of answers for given time periods (and other attributes). We don't have an 'all of the above' question, but I can easily foresee a "pick all (read: any) that apply" question.
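
For a reporting requirement like this, a child table with one row per chosen answer keeps the query to a plain GROUP BY. The sketch below uses Python's `sqlite3` as a stand-in; the `submitted_at` column, the ISO date strings, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE chosen_answers (
        user_id      INTEGER NOT NULL,
        question_id  INTEGER NOT NULL,
        answer_id    INTEGER NOT NULL,
        submitted_at TEXT NOT NULL   -- assumed column, ISO 8601 date string
    )
    """
)
conn.executemany(
    "INSERT INTO chosen_answers VALUES (?, ?, ?, ?)",
    [
        (1, 7, 11, "2009-11-20"),
        (1, 7, 13, "2009-11-20"),
        (2, 7, 11, "2009-11-25"),
        (3, 7, 12, "2009-12-02"),  # falls outside the November window below
    ],
)

# Count each answer to question 7 submitted during November 2009.
# ISO date strings sort chronologically, so plain comparison works here.
report = conn.execute(
    "SELECT answer_id, COUNT(*) FROM chosen_answers "
    "WHERE question_id = ? AND submitted_at >= ? AND submitted_at < ? "
    "GROUP BY answer_id ORDER BY answer_id",
    (7, "2009-11-01", "2009-12-01"),
).fetchall()
print(report)
```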