views:

147

answers:

4

Based on the StackOverflow data dump, it seems that S.O. represents questions and answers as a single table - Posts.

However, a question has a title, a body and tags associated with it, while an answer only has a body. To me, at least, this indicates that they are distinct enough that they should be separate tables.

Besides, I don't like having to write "and type='question'" in my SQL.

Are these valid reasons?

Or is there a good reason for putting questions and answers into the same table?
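
For concreteness, here is a minimal sketch of the single-table layout and the filter it requires, run against an in-memory SQLite database from Python. The column names (`type`, `title`, `tags`, `body`) and the `questions` view are assumptions made for illustration, not the actual layout of the data dump. A view is one way to avoid repeating the type filter in every query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical single-table layout: questions and answers share one table.
CREATE TABLE posts (
    id    INTEGER PRIMARY KEY,
    type  TEXT NOT NULL CHECK (type IN ('question', 'answer')),
    title TEXT,            -- questions only
    tags  TEXT,            -- questions only
    body  TEXT NOT NULL
);
-- A view that hides the type filter.
CREATE VIEW questions AS
    SELECT id, title, tags, body FROM posts WHERE type = 'question';
INSERT INTO posts VALUES (1, 'question', 'One table or two?', 'sql', 'Question body');
INSERT INTO posts VALUES (2, 'answer', NULL, NULL, 'Answer body');
""")

# Querying the table directly needs the explicit filter...
filtered = conn.execute(
    "SELECT id, title FROM posts WHERE body LIKE '%body%' AND type = 'question'"
).fetchall()
# ...while the view applies it implicitly.
via_view = conn.execute(
    "SELECT id, title FROM questions WHERE body LIKE '%body%'"
).fetchall()
```

Both queries return the same rows; the view only moves the filter out of the query text.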

+2  A: 

I would separate them on principle, just because they are different beasts. Questions have titles (as you said), tags, favorite markers and (presumably) are subject to searches for attempted duplicate detection.

That would seem to me to make them different enough to warrant a separate table.

However, we don't know how SO stores them in the database. You've only seen the export into the data dump, and it may be that the export functionality combines questions and answers into posts.

It may also be that the information common to questions and answers is stored in a single table and the question-specific extras are stored in another table. Short of asking the SO developers, I can't think of any way to confirm this.
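
A minimal sketch of that last layout, using Python's built-in sqlite3 module. The table and column names (`posts`, `question_details`, `parent_id` and so on) are made up for illustration; nothing here is known to match SO's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
-- Columns common to questions and answers (hypothetical names).
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES posts(id),  -- NULL for a question
    author    TEXT NOT NULL,
    body      TEXT NOT NULL
);
-- Question-specific extras, one row per question.
CREATE TABLE question_details (
    post_id INTEGER PRIMARY KEY REFERENCES posts(id),
    title   TEXT NOT NULL,
    tags    TEXT NOT NULL
);
""")
conn.execute("INSERT INTO posts VALUES (1, NULL, 'asker', 'One table or two?')")
conn.execute("INSERT INTO question_details VALUES (1, 'Q and A storage', 'sql schema')")
conn.execute("INSERT INTO posts VALUES (2, 1, 'answerer', 'I would separate them.')")

# Joining on the extras table yields questions only; no type filter is needed.
questions = conn.execute("""
    SELECT p.id, q.title, p.body
    FROM posts AS p
    JOIN question_details AS q ON q.post_id = p.id
""").fetchall()
```

The join returns only the question row, because answers have no entry in `question_details`.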

paxdiablo
+2  A: 

Questions and answers have a lot in common -- author, date, comments, &c. Separating the table (since SQL schemas typically don't support inheritance) means a lot of duplication (the comments table will likely also have to be split, or have a goofy design with two foreign keys, one to the Q table and one to the A table, of which exactly one is to be non-NULL).
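
To make the "two foreign keys" design concrete, here is a rough sketch in Python with sqlite3. All table and column names are assumptions for illustration. The CHECK constraint is one way to enforce that exactly one of the two keys is non-NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE questions (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body  TEXT NOT NULL
);
CREATE TABLE answers (
    id          INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES questions(id),
    body        TEXT NOT NULL
);
-- One comments table pointing at either parent table.
CREATE TABLE comments (
    id          INTEGER PRIMARY KEY,
    question_id INTEGER REFERENCES questions(id),
    answer_id   INTEGER REFERENCES answers(id),
    body        TEXT NOT NULL,
    -- Exactly one of the two foreign keys must be set.
    CHECK ((question_id IS NOT NULL) + (answer_id IS NOT NULL) = 1)
);
""")
conn.execute("INSERT INTO questions VALUES (1, 'Title', 'Body')")
conn.execute("INSERT INTO answers VALUES (1, 1, 'An answer')")
conn.execute("INSERT INTO comments (question_id, answer_id, body) VALUES (1, NULL, 'on the question')")
conn.execute("INSERT INTO comments (question_id, answer_id, body) VALUES (NULL, 1, 'on the answer')")

def rejected(question_id, answer_id):
    """Return True if the database refuses a comment with these two keys."""
    try:
        conn.execute(
            "INSERT INTO comments (question_id, answer_id, body) VALUES (?, ?, 'bad')",
            (question_id, answer_id),
        )
    except sqlite3.IntegrityError:
        return True
    return False

both_set_rejected = rejected(1, 1)           # violates the CHECK
neither_set_rejected = rejected(None, None)  # violates the CHECK
```

It works and keeps referential integrity, but every query on `comments` has to deal with the two nullable columns, which is the goofiness referred to above.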

Yes, there are distinctions too between Q & A, and advantages the other 'way round, too, as you point out. "You pays your money, you takes your choices"!-)

Alex Martelli
I would do comments with polymorphism, not with two foreign keys. In the future, tags, profiles, badges, users, etc. might all also have comments on them (there's no semantic restriction); mutually exclusive required foreign keys just make for kludges.
Sai Emrys
@Sai, re "in the future", the right time to design that in is, well, in the future (YAGNI). Building functionality into a product that has no current need for it (and possibly never will) is wasted effort.
paxdiablo
http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It Yagni sounds like a good excuse to avoid polymorphism for me - I find it confusing as all heck.
Charlie K
polymorphism in OO programming is fine (it's the heart of their operation) but in relational DBs there is currently no standard way to have a foreign key column point to several possible tables and yet keep integrity constraints... I guess eventually one gives up the latter (hey, MyISAM never had them, right?-), but not happily!-)
Alex Martelli
A: 

It is better to have questions and answers in separate tables. You can map them using the question ID (e.g. 985113 in this question).

SO has an option to close questions (>3k rep), and one of the reasons is a duplicate question. We have to enter the question ID or part of the question. Remember how that search would work if both answers and questions are in the same table.

Shoban
+2  A: 

Actually I think we've heard enough hints on the podcast to suggest that they are stored in the same table. It also looks like the ID numbers for questions and answers do not overlap. Maybe they did it for performance reasons? It would be possible, for example, to fill the data for a page like this one in a single scan of the posts table, rather than one scan of questions and one of answers.
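
A small sketch of what that single-query page load could look like, in Python with sqlite3. The schema (`posts` with a `parent_id` column) is a guess for illustration, not SO's actual schema, and whether the database really does it in one scan depends on the indexes and the query planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER,   -- NULL for a question, else the question's id
        title     TEXT,      -- only set for questions
        body      TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, None, "One table or two?", "Should Q and A share a table?"),
        (2, 1, None, "Separate them."),
        (3, 1, None, "Keep them together."),
        (4, None, "Unrelated question", "Something else."),
    ],
)

def load_page(question_id):
    """Fetch a question and all of its answers with one query on one table."""
    return conn.execute(
        """
        SELECT id, parent_id, title, body
        FROM posts
        WHERE id = ? OR parent_id = ?
        ORDER BY parent_id IS NOT NULL, id  -- question first, then answers by id
        """,
        (question_id, question_id),
    ).fetchall()

page = load_page(1)
```

Because questions and answers draw their IDs from the same primary key, their ID numbers cannot overlap, which matches what we see on the site.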

1800 INFORMATION