ansaurus

Question

SQL multiple join on many to many tables + comma separation

Answer 1

+4 A:

SQLite doesn't have the "pivot" functionality you'd need for starters, and the "comma separated values" part is definitely a presentation issue that it would be absurd (and possibly unfeasible) to try to push into any database layer, whatever dialect of SQL may be involved -- it's definitely a part of the job you'd do in the client, e.g. a reporting facility or programming language.

Use SQL for data access, and leave presentation to other layers.

How you get your data is

SELECT media.id, media.uri, people.name, people.role
FROM media
JOIN media_to_people ON (media.id = media_to_people.media_id)
JOIN people ON (media_to_people.people_id = people.id)
WHERE media.id = ?
ORDER BY people.role, people.name

(the ? is one way to indicate a parameter in SQLite, to be bound to the specific media id you're looking for in ways that depend on your client); the data will come from the DB to your client code in several rows, and your client code can easily put them into the single column form that you want.

It's hard for us to say how to code the client-side part w/o knowing anything about the environment or language you're using as the client. But in Python for example:

def showit(dataset):
  by_role = collections.defaultdict(list)
  for mediaid, mediauri, name, role in dataset:
    by_role[role].append(name)
  headers = ['mediaid', 'mediauri']
  result = [mediaid, mediauri]
  for role in sorted(by_role):
    headers.append('people(%s)' % role)
    result.append(','.join(by_role[role]))
  return ' '.join(headers) + '\n' + ' '.join(result)

even this won't quite match your spec -- you ask for headers such as 'people(artist)' while you also specify that the role's encoded as an int, and mention no way to go from the int to the string 'artist', so it's obviously impossible to match your spec exactly... but it's as close as my ingenuity can get;-).

Alex Martelli 2009-06-10 05:37:24

+1 for recommending post-processing of the rows in a client application language!

Bill Karwin 2009-06-10 05:43:16

Hmm, so there is no way to fetch the people.name columns specifically for different roles?I was thinking something like this:SELECT media.id, media.uri, artists.name, publishers.name FROM media JOIN media_to_people ON media_to_people.media_id = media.id JOIN people AS artists, people AS publishers ON (artists.id = media_to_people.people_id AND artists.role = 2) OR (publishers.role = 8 AND publishers.id = media_to_people.people_id)But this doesn't work as intended...

jetru 2009-06-10 05:49:15

Well the header name is not so important as getting it in a row. I want all the stuff done in one query.I'm writing C. You can imagine why I want to avoid post processing my results. ;)

jetru 2009-06-10 05:54:41

@Alex: I'm so sorry I wrote my answer. My intention was to show the OP how inappropriate it would have to be to write a single SQL query to do the whole job. I assumed if I marked my answer as 'community wiki' jetru wouldn't be able to accept it.

Bill Karwin 2009-06-10 06:37:23

@Bill, heh, NP -- your answer had excellent code and explanation, what mode can you do?-)

Alex Martelli 2009-06-10 14:57:04

Hmm. any possibility of accepting two answers? :D I needed that query because I need to get a module working fast(because there are other parts of the module that rely on this statement) and that query does the job exactly. Besides the database is on the local system for every user. Perhaps later down the line, when I have loads of time, I will replace it with post processing the data. :)

jetru 2009-06-10 20:00:46

Answer 2

+1 A:

I agree with Alex Martelli's answer, that you should get the data in multiple rows and do some processing in your application.

If you try to do this with just joins, you need to join to the people table for each role type, and if there are multiple people in each role, your query will have Cartesian products between these roles.

So you need to do this with GROUP_CONCAT() and produce a scalar subquery in your select-list for each role:

SELECT m.id, m.uri, 
 (SELECT GROUP_CONCAT(name) 
  FROM media_to_people JOIN people ON (people_id = id) 
  WHERE media_id = m.id AND role = 1) AS Actors,
 (SELECT GROUP_CONCAT(name) 
  FROM media_to_people JOIN people ON (people_id = id) 
  WHERE media_id = m.id AND role = 2) AS Artists,
 (SELECT GROUP_CONCAT(name) 
  FROM media_to_people JOIN people ON (people_id = id) 
  WHERE media_id = m.id AND role = 3) AS Publishers
FROM media m;

This is truly ugly! Don't try this at home!

Take our advice, and don't try to format the pivot table using only SQL.

Bill Karwin 2009-06-10 05:42:02

This works perfectly! This is exactly what I need! Does it hit performance hard?

jetru 2009-06-10 05:52:09

Yes! It hits performance hard!! That's why I told you not to do it! In addition to it being ugly, unmaintainable code.

Bill Karwin 2009-06-10 06:33:15

ansaurus

tags:

views:

answers:

SQL multiple join on many to many tables + comma separation

related questions