views:

721

answers:

2

Here is the code to help you understand my question:

create table con ( content_id number);
create table mat ( material_id number, content_id number, resolution number, file_location varchar2(50), file_size number);
create table con_groups (content_group_id number, content_id number);

insert into con values (99);
insert into mat values (1, 99, 7, 'C:\foo.jpg', 1024);
insert into mat values (2, 99, 2, '\\server\xyz.mov', 350000);
insert into mat values (3, 99, 5, '\\server2\xyz.wav', 175000);
insert into con values (100);
insert into mat values (4, 100, 5, 'C:\bar.png', 2048);
insert into mat values (5, 100, 3, '\\server\xyz.mov', 27400);
insert into mat values (6, 100, 7, '\\server2\xyz.wav', 400);

insert into con_groups values (10, 99);
insert into con_groups values (10, 100);

SELECT m.material_id, 
       m.content_id,
       (SELECT max(file_location) keep (dense_rank first order by resolution desc)                  
          FROM mat
         WHERE mat.content_id = m.content_id
        /* AND ...
           AND ...
           AND ... */) special_mat_file_location,
       (SELECT max(file_size) keep (dense_rank first order by resolution desc)
          FROM mat
         WHERE mat.content_id = m.content_id
        /* AND ...
           AND ...
           AND ... */) special_mat_file_size
  FROM mat m
 WHERE m.material_id IN (select material_id
                           from mat
                     inner join con on con.content_id = mat.content_id
                     inner join con_groups on con_groups.content_id = con.content_id
                          where con_groups.content_group_id = 10);

I put the commented ANDs to highlight that this is a simplified example; the subquery in my real query is more complex with more criteria.

My problem is: I want to avoid repeating all of the criteria in the subquery for both columns (file_location and file_size) because the criteria is exactly the same. I would gladly use Common Table Expressions (i.e. subquery factoring using the WITH clause) but I can't because of the "WHERE mat.content_id = m.content_id" in the subquery, which makes it a correlated subquery. It is my understanding that one cannot factor correlated subqueries using the WITH clause. For that same reason I also cannot put this subquery as an inline view (aka derived table) in the FROM clause.

How can I include the criteria once and inject more than one column into the resultset with a correlated subquery?

+1  A: 

Use Subquery Factoring (that's what Oracle calls it - it's called a Common Table Expression in SQL Server). Supported 9i+:

WITH file AS (
  SELECT t.content_id,
         MAX(t.file_location) keep (DENSE_RANK t.first ORDER BY t.resolution DESC) 'fileLocation',
         MAX(t.file_size) keep (DENSE_RANK t.first ORDER BY t.resolution DESC) 'fileSize'
    FROM mat t
GROUP BY t.content_id)
SELECT m.material_id, 
       m.content_id,
       f.fileLocation,
       f.fileSize
  FROM mat m
  JOIN file f ON f.content_id = m.content_id

It's intended for inline view re-use. You generalize the view, and define specific filteration for differing instances in the JOIN clause. You need to expose columns for joining in the SELECT clause - see the content_id as an example.

A correlated subquery means it can be re-written as a JOIN - the correlation is the JOIN criteria.

You can define multiple views in Subquery Factoring - if you provided more detail I could tailor the answer better.

OMG Ponies
Can anyone explain why this is better that simple join in FROM clause? Or even putting subquery logic in a separate view?
jva
Re: "A correlated subquery means it can be re-written as a JOIN" --That's an interesting thought. I'm having a bit of trouble with your query though. It said file was an invalid table name, so I renamed it. Then it didn't like how the aggregate columns were named, so I changed that. Now I'm getting the message "ORA-00937: not a single-group group function" for content_id. Am I totally misunderstanding something?
Nate
@jva: Google "subquery factoring".
OMG Ponies
@Nate: I forgot the "GROUP BY t.content_id" in the subquery factoring.
OMG Ponies
@Nate: I updated the query to include the GROUP BY that was missing - try it now.
OMG Ponies
I had to rename the factored subquery and remove the single quotes again, but...yeah that seemed to do the trick; the results of the query are correct! Thanks!
Nate
Odd about the single quotes. Mind marking my answer as having answered the question?
OMG Ponies
A: 

I think I got it. I was able to just grab the primary key of the "special" material and wrap the main query in another layer, like so:

         SELECT x.material_id,
                x.content_id,
                sp_m.file_location,
                sp_m.file_size
           FROM (SELECT m.material_id, 
                        m.content_id,
                        (SELECT max(material_id) keep (dense_rank first order by resolution desc)                  
                           FROM mat
                          WHERE mat.content_id = m.content_id
                         /* AND ...
                            AND ...
                            AND ... */) special_mat_primary_key
                   FROM mat m
                  WHERE m.material_id IN (select material_id
                                            from mat
                                      inner join con on con.content_id = mat.content_id
                                      inner join con_groups on con_groups.content_id = con.content_id
                                           where con_groups.content_group_id = 10)) x
LEFT OUTER JOIN mat sp_m ON sp_m.material_id = x.special_mat_primary_key;

Let me know what you think.

Nate
A correlated subselect means it will execute for every row returned. The IN clause is the next worst performing way to query things. You're creating an inline view with more information than you need to.
OMG Ponies