tags:

views:

241

answers:

5

I'm trying to look up two pieces of metadata (volume and issue) on items which may have either a volume, issue, or both. The metadata is stored in a table with item ID, key (metadata field ID) and value.

This does work, but it seems overly complex and repetitive:

select volume.text_value as volume_value, issue.text_value as issue_value
    from metadatavalue item
    left outer join (select item_id, text_value from metadatavalue
                     where metadata_field_id = 90) volume
                    on item.item_id = volume.item_id
    left outer join (select item_id, text_value from metadatavalue
                     where metadata_field_id = 91) issue
                    on item.item_id = issue.item_id
    where item.metadata_field_id in (90, 91)

Is there a simpler way to write this query?

Thanks.

+2  A: 
SELECT  DISTINCT ON (item_id)
        item_id,
        CASE metadata_field_id
        WHEN 90 THEN
                text_value
        ELSE    (
                SELECT  text_value
                FROM    metadatavalue m
                WHERE   m.metadata_field_id = 90
                        AND m.item_id = i.item_id
                )
        END AS volume,
        CASE metadata_field_id
        WHEN 91 THEN
                text_value
        ELSE    (
                SELECT  text_value
                FROM    metadatavalue m
                WHERE   m.metadata_field_id = 91
                        AND m.item_id = i.item_id
                )
        END AS issue
FROM    metadatavalue
WHERE   metadata_field_id IN (90, 91)
ORDER BY
        item_id

Having an index on (item_id, metadata_field) will improve this query.

This will work better if there are few items with metadata of 90 and 91, compared to overall number of items.

If almost all items have these metadata, just use:

SELECT  *
FROM    (
        SELECT  item_id,
                (
                SELECT  text_value
                FROM    metadatavalue m
                WHERE   m.metadata_field_id = 90
                        AND m.item_id = i.item_id
                ) volume,
                (
                SELECT  text_value
                FROM    metadatavalue m
                WHERE   m.metadata_field_id = 91
                        AND m.item_id = i.item_id
                ) issue
        FROM    items
        ) q
WHERE   issue IS NOT NULL OR volume IS NOT NULL
Quassnoi
Good point, thanks. The first query seems to just be a restatement of mine, but better formatted :-) There are indeed few items with volume/issue.
Nicholas Riley
@Nicholas: Ummm... I corrected the query a little to avoid duplicates.
Quassnoi
@Nicholad: and yet a little, to avoid extra SELECT's where they are not needed.
Quassnoi
I was referring to your pre-edit version. (The DISTINCT was a red herring, I rewrote the query and didn't test it fully before I posted. Sorry.) Your new query gives me 'ERROR: syntax error at or near "volume"' (line 12) though.
Nicholas Riley
@Nicholas: corrected, forgot an AS before the alias. Nevermind, the FULL OUTER JOIN proposed by @Paul Morgan will be more efficient anyway, provided there's an index on (metadata_field_id, item_id)
Quassnoi
+2  A: 

Try this

select  volume.text_value as volume_value, 
     issue.text_value as issue_value    
from    metadatavalue item    
     left outer join metadatavalue volume                    
      on item.item_id = volume.item_id    
     left outer join metadatavalue issue                    
      on item.item_id = issue.item_id    
where   volume.metadata_field_id = 90
and  issue.metadata_field_id = 91
John MacIntyre
This doesn't work (it is similar to my original attempt at this query). It behaves as if the joins were inner joins rather than outer joins.
Nicholas Riley
Sorry, I might have misinterpretted your question. "Is there a simpler way to write this query?" ... I thought it was just a simpler query you wanted.
John MacIntyre
@Nicholas: If this is not what you asked, then you need to word your question better. You asked for simplification of your query and this does that.
BenAlabaster
@balabaster: a query that doesn't work is not a simplification, it's just wrong. (No offense to the answer intended.) Judging by the correct answers I got, I don't think my question was terribly ambiguous.
Nicholas Riley
A: 

Using the case statement to segregate the values into buckets, the max function will allow the text_value data to float to the top.

select
   item_id
  ,max(case metadata_field_id when 90 then text_value else null end) as volume_value
  ,max(case metadata_field_id when 91 then text_value else null end) as issue_value
from
  metadatavalue
group by
   item_id

Here's what I used for testing purposes:

select
   item_id
  ,max(case metadata_field_id when 90 then text_value else null end) as volume_value
  ,max(case metadata_field_id when 91 then text_value else null end) as issue_value
from
  (
   select 1 as item_id, 90 as metadata_field_id, '90-I am here' as text_value
   union
   select 1 as item_id, 91 as metadata_field_id, '91-Me too' as text_value
   union
   select 2 as item_id, 90 as metadata_field_id, null as text_value
   union
   select 2 as item_id, 91 as metadata_field_id, '91-funky' as text_value
   union
   select 3 as item_id, 90 as metadata_field_id, '90-fresh' as text_value
  ) metadatavalue
group by
   item_id

Results:

item_id  volume_value  issue_value
1        90-I am here  91-Me too
2        NULL          91-funky
3        90-fresh      NULL

Note: I used SQL Server to model & test and then changed the syntax to match Postgres.

tom
A: 

You can take the predicate out of your inner table into the join condition:

select volume.text_value as volume_value, issue.text_value as issue_value
from metadatavalue item
left outer join metadatavalue volume
                on volume.metadata_field_id = 90 and volume.item_id = item.item_id
left outer join metadatavalue issue
                on issue.metadata_field_id = 91 and issue.item_id = item.item_id
where item.metadata_field_id in (90, 91)

Although I'm not sure I'm making sense of your schema-- it seems to me that item should be referring to a different table than volume and issue? sth more like:

select item.*, volume.text_value as volume_value, issue.text_value as issue_value
from item
left outer join metadatavalue volume
                on volume.metadata_field_id = 90 and volume.item_id = item.item_id
left outer join metadatavalue issue
                on issue.metadata_field_id = 91 and issue.item_id = item.item_id

This also makes the rather odd where condition go away. You could put the above query into a view ("item_volume_issue") and it would appear that volume and issue are just additional columns in item_volume_issue instead of separate tables (which is what makes more sense to me).

araqnid
The where condition is necessary to limit the query to those items which have a volume or issue. (Not all items do; I was pretty ambiguous about that in my question, sorry.)
Nicholas Riley
+2  A: 

PostgreSQL supports a full outer join which can simplify the query:

  select v.text_value as volume_value, i.text_value as issue_value
      from ( select item_id, text_value
               from metadatavalue
              where metadata_field_id = 90) v
           full join
           ( select item_id, text_value
               from metadatavalue
              where metadata_field_id = 91) i
           using (item_id)
Paul Morgan
Thanks! Exactly what I was looking for.
Nicholas Riley
Can't believe it didn't come to my mind instead of these ugly subqueries. This will use an efficient MERGE JOIN, provided there's an index on (metadata_field_id, item_id). Definitely a +1.
Quassnoi
Thanks for the corrections and +1. I'm not that familiar with PostgresSQL. I noticed you didn't include item_id in the results but if you wanted item_id you'd have to COALESCE the item_ids from the two tables.
Paul Morgan