views:

309

answers:

3

I am working on an online survey. Most questions have a scale of 1-5 for an answer. If we need to add a question to the survey, I use a simple web form, which does an INSERT into the appropriate table, and voila! surveys are asking the new question -- no new code or change to the database structure.

We are being asked to add survey questions that can have answers of different data types. The spec is to have the survey 'configurable', so that at any point in the future, when someone says, "We need a new survey that asks {text answer question}, {1-5 question}, {true false question}, {question with a date as an answer}", we can do that without changing the database structure .

I'm trying to think of the best way to store those answers, but every method I come up with seems somewhat hackish.

Some questions may have a yes/no or true/false answer, some may have an integer answer ( "How many times in the past month have you used tech support?"), another answer may have a date, a string, a multiple choice with a single value, a multiple choice with multiple values, etc. Or sometimes, a particular answer value may prompt a sub-question ( "What disappointed you about...?" )

The simple solution is to have each question as a column in the survey, its answer as a column in the survey, and whether to ask it as a column in the survey. This feels like a mess to me -- it's one big table; not very 'relational'.

Brainstorming, the 'best' method I could come up with is to have a different table for each answer type, but that feels susceptible to data integrity issues. In other words, I would have

CREATE TABLE `Questions` (...);
CREATE TABLE `TrueFalseAnswers` (...);
ALTER TABLE `TrueFalseAnswers`
  ADD CONSTRAINT `TrueFalseAnswers_ibfk_1` FOREIGN KEY (`question_id`)
REFERENCES `Questions` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
CREATE TABLE `TextAnswers` (...);
ALTER TABLE `TextAnswers`
  ADD CONSTRAINT `TextAnswers_ibfk_1` FOREIGN KEY (`question_id`)
REFERENCES `Questions` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;

etc.

One problem with the above is that I can't guarantee that at least one and only one answer exists for any question in the DDL alone.

Another solution might be to have a binary or string column for the answer in the Questions table, and encode all answers into some string or binary format, and store them in the answer column. That gives me at least one and only one answer for every question, but then I lose access to aggregate features in SQL queries. This strikes me as not a very 'relational' solution.

So, I see problems with the ideas outlined above. Is there a 'best' way to solve this problem?

Now that I've taken the time to verbalize the problem and my thoughts, it seems that the broad problem that I'm posing is "I'd like to store data of an arbitrary type without doing any coding..." Is this hopeless?

I am using MySQL so I don't have access to thing that other RDBMSes might.

+1  A: 

You want to create a QuestionType table that corresponds to a QuestionType class. Allow the persisted Answer filled in by your users to be free-form text, and leave it up to the QuestionType to determine what the answer means.

So- if it's true/false, the Answer could be 'T' or 'F'.

If it's multiple choice, the Answer could be the index of the selected choice.

If it's a text box the users fills in, save the text they enter.

Dave Swersky
+4  A: 

Use a column that specifies the type of answer, but store the answer as text. Your application or front end can use the answer_type column to determine what to display to the end user (a test box, radio buttons, a date picker) and how to validate it before sending it back to the database.

Larry Lustig
This is EXACTLY what the Questionnaire system in our software does and it works just fine. As Randolpho says, there isn't a perfect solution using normal relational databases, it's up to good code to sort it out.
_J_
A: 

that I'm posing is "I'd like to store data of an arbitrary type without doing any coding..." Is this hopeless?

Yes, it pretty much is. There is no "good" solution to the problem you're posing. The "best" is as Dave Swersky and Larry Lustig described it:

A Question table, which stores the question, possible answers (if it's multiple choice) and a question type

An Answer table, which stores the answer to a question (FK to Question table), serialized as text. Varchar(4000) or TEXT datatype, preferably the former unless absolutely necessary.

It's up to your application logic to determine what the value means based on the type specified for the question.

Randolpho