I'm sure this is a common query but I'm not certain how to phrase it best.

I have two tables:

Questions { Id, Text... }
Answers { Id, QuestionId, Text...}

I would like to retrieve a list of Questions and their Answers, so that the results can be displayed like this:

  • Question A
    • 1st answer to question A
    • 2nd answer to question A
    • 3rd answer to question A
  • Question B
    • 1st answer to question B ... etc

Assume that there is some selection criteria on the Question table.

A very inefficient method would be to select all the relevant questions, then for each select all the answers. Another inefficient method would be to use a LEFT OUTER JOIN.

What's the most efficient, simplest way of getting the questions and answers? Can it be done in one query?

+2  A: 

For MySQL (not MSSQL), something like this would work:

SELECT q.text, a.text
FROM questions q
LEFT JOIN answers a ON 
    q.quiestionid = q.id
GROUP BY q.id, a.id
ORDER BY q.id

Let me know if you have any questions.

Christian
Why don't you think that'd work on MSSQL? You just wrote ANSI SQL. The syntax will work fine in SQL Server.
Eric
AFAIK, an ANSI-conforming RDBMS requires that columns in the SELECT clause also appear in the GROUP BY clause, so q.text and a.text need to be included in the GROUP BY. Christian's example won't work on PostgreSQL or MSSQL as written; it will only work there with: GROUP BY q.id, a.id, q.text, a.text
Michael Buen
But as the .id fields are primary keys and aren't going to be repeated, why not just drop the useless GROUP BY?!
Alex Martelli
@Michael: You're right. I mentally skipped over them because they are absolutely useless in this case (as Alex points out).
Eric
+1  A: 

Why do you think a left join would be inefficient? You'll get some duplicate data on questions that were answered many times, but that's just a few extra bytes over the wire, nothing to worry about.

One answer that was given works fine (in any real SQL engine, including MSSQL, SQLite, etc., as well as MySQL, which it was proposed for) but is redundant: it has a GROUP BY on primary keys that don't get duplicated anyway. So the following simpler, typo-fixed version is fine and fast:

SELECT q.id, q.text, a.id, a.text
  FROM questions q
  LEFT JOIN answers a ON a.questionid = q.id
 ORDER BY q.id

Your client code simply has to notice when q.id changes in order to group and display things "hierarchically" as you desire. How to do that depends on the client-side language you use; in Python, for example, you'd use itertools.groupby to do it very simply. (You don't need the GROUP BY on the SQL server side, but you do need grouping client-side, whether with a language-supplied facility as in Python or by implementing it yourself, to get the hierarchy displayed as you desire.)
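As a rough illustration of that client-side grouping, here is a minimal sketch in Python. It assumes the table and column names from the query above; the in-memory SQLite database, the sample rows, and the helper names fetch_grouped and demo are made up for the example. I also added a.id to the ORDER BY so answers come back in a predictable order, which is my assumption rather than something the grouping itself needs.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter


def fetch_grouped(conn):
    """Return [(question_text, [answer_text, ...]), ...] using one LEFT JOIN query."""
    rows = conn.execute(
        "SELECT q.id, q.text, a.id, a.text "
        "FROM questions q "
        "LEFT JOIN answers a ON a.questionid = q.id "
        "ORDER BY q.id, a.id"
    ).fetchall()

    grouped = []
    # groupby only merges adjacent rows, which is why the ORDER BY q.id matters.
    for (_qid, qtext), group in groupby(rows, key=itemgetter(0, 1)):
        # A question with no answers yields a single row whose a.id is NULL (None).
        answers = [atext for _, _, aid, atext in group if aid is not None]
        grouped.append((qtext, answers))
    return grouped


def demo():
    # Hypothetical sample data, only here to make the sketch runnable.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT);
        CREATE TABLE answers (id INTEGER PRIMARY KEY, questionid INTEGER, text TEXT);
        INSERT INTO questions VALUES (1, 'Question A'), (2, 'Question B'), (3, 'Question C');
        INSERT INTO answers VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
    """)
    return fetch_grouped(conn)


if __name__ == "__main__":
    for question, answers in demo():
        print(question)
        for answer in answers:
            print("    " + answer)
```

Because it is a LEFT JOIN, a question without answers still shows up, just with an empty answer list.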

Alex Martelli
Yeah, I was thinking the inefficiency would be in the duplicate data, assuming there could be 20 to 100 answers in the worst case. Given the simplicity, I'll try it anyway and defer further optimisation until I've tested the performance.
cbp