Hello all,

I have an interesting conundrum which I believe can be solved in pure SQL. I have tables similar to the following:

responses:

user_id | question_id | body
----------------------------
1       | 1           | Yes
2       | 1           | Yes
1       | 2           | Yes
2       | 2           | No
1       | 3           | No
2       | 3           | No


questions:

id | body
-------------------------
1  | Do you like apples?
2  | Do you like oranges?
3  | Do you like carrots?

and I would like to get the following output

user_id | Do you like apples? | Do you like oranges? | Do you like carrots?
---------------------------------------------------------------------------
1       | Yes                 | Yes                  | No
2       | Yes                 | No                   | No

I don't know how many questions there will be, and they will be dynamic, so I can't just code for every question. I am using PostgreSQL and I believe this is called transposition, but I can't seem to find anything that describes the standard way of doing this in SQL. I remember doing this in my database class back in college, but it was in MySQL and I honestly don't remember how we did it.

I'm assuming it will be a combination of joins and a GROUP BY statement, but I can't even figure out how to start.

Anybody know how to do this? Thanks very much!

Edit 1: I found some information about using a crosstab which seems to be what I want, but I'm having trouble making sense of it. Links to better articles would be greatly appreciated!

A: 

There is an example of this in contrib/tablefunc/, the module that provides the crosstab function.

Peter Eisentraut
Umm, where is `contrib/tablefunc`? Are you talking about a directory on the doc server?
Topher Fangio
It's in that directory in the source tree, or you might find a `postgresql-contrib` binary package that you need to install that contains it.
Peter Eisentraut
+4  A: 

Use:

  SELECT r.user_id,
         MAX(CASE WHEN r.question_id = 1 THEN r.body ELSE NULL END) AS "Do you like apples?",
         MAX(CASE WHEN r.question_id = 2 THEN r.body ELSE NULL END) AS "Do you like oranges?",
         MAX(CASE WHEN r.question_id = 3 THEN r.body ELSE NULL END) AS "Do you like carrots?"
    FROM RESPONSES r
    JOIN QUESTIONS q ON q.id = r.question_id
GROUP BY r.user_id

This is a standard pivot query, because you are "pivoting" the data from rows to columnar data.

OMG Ponies
So you're saying I have to build a dynamic query based on the number of questions that I have? I guess I could do that, but I was hoping for a simpler solution.
Topher Fangio
@OMG Ponies - Thanks for the answer. Looks like this will be the easiest to implement even if I have to generate the query at runtime.
Topher Fangio
This query doesn't work in PostgreSQL. The single quotes have to be double quotes.
Peter Eisentraut
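
For anyone who ends up generating the pivot query at runtime, as discussed in the comments above, here is a minimal sketch of one way to do it. It is not from any of the answers. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere; the helper names `quote_ident` and `build_pivot_sql` are hypothetical, and the table and column names are assumed to match the question. The generated SQL has the same MAX(CASE ...) shape as the answer above and uses double-quoted column aliases, so it should carry over to PostgreSQL, but treat that as an assumption to verify.

```python
import sqlite3


def quote_ident(name):
    """Quote a string for use as a SQL identifier (double quotes, doubled inside)."""
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(questions):
    """Build a pivot query with one MAX(CASE ...) column per (id, body) question."""
    select_list = ["r.user_id"]
    for question_id, body in questions:
        # int() guards the id; the question text is only ever used as a quoted alias.
        select_list.append(
            f"MAX(CASE WHEN r.question_id = {int(question_id)} "
            f"THEN r.body END) AS {quote_ident(body)}"
        )
    return (
        "SELECT " + ",\n       ".join(select_list) + "\n"
        "  FROM responses r\n"
        " GROUP BY r.user_id\n"
        " ORDER BY r.user_id"
    )


# In-memory database seeded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE questions (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE responses (user_id INTEGER, question_id INTEGER, body TEXT);
    INSERT INTO questions VALUES
        (1, 'Do you like apples?'),
        (2, 'Do you like oranges?'),
        (3, 'Do you like carrots?');
    INSERT INTO responses VALUES
        (1, 1, 'Yes'), (2, 1, 'Yes'),
        (1, 2, 'Yes'), (2, 2, 'No'),
        (1, 3, 'No'),  (2, 3, 'No');
    """
)

# Step 1: read the current questions. Step 2: build and run the pivot query.
questions = conn.execute("SELECT id, body FROM questions ORDER BY id").fetchall()
pivot_sql = build_pivot_sql(questions)
cursor = conn.execute(pivot_sql)
header = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(pivot_sql)
print(header)
for row in rows:
    print(row)
```

The question text is only interpolated as a quoted identifier and the ids are forced through `int()`, which keeps the string-built SQL from being open to injection through the questions table. The trade-off is two round trips (one to read the questions, one to run the pivot), but the query always matches whatever questions currently exist.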