I had trouble coming up with a title for this question because I'm not really sure how to do what I want, so I'm not sure if "joining" is the best terminology in this instance.

I have two tables in my database, Plans and Assumptions that look like the following:

Plans Table

Plankey  ass1 ass2 ass3 ass4
aplan    0    6    5    7
bplan    2    0    7    4

Assumptions Table

assType refKey assName
"ass1"  0      "gender factors a"
"ass1"  2      "gender factors b"
"ass2"  0      "age factors a"
"ass2"  6      "age factors b"
"ass3"  5      "inflation factors a"
"ass3"  7      "inflation factors b"
"ass4"  4      "tax factors a"
"ass4"  7      "tax factors b"

I need to make a query (or set of queries and sub-queries) that gives me the names of the assumptions in use for each plan:

Plankey  assName1           assName2         assName3               assName4
aplan    "gender factors a" "age factors b"  "inflation factors a"  "tax factors b"
bplan    "gender factors b" "age factors a"  "inflation factors b"  "tax factors a"

Yeah... I know. assName. Also, even if this isn't the best design, that is beyond my control. I'm just trying to query an existing set of data.

I should also mention there are over 500 assumption types (ass1,ass2,...,ass500,etc) and each type could have over 100 assumption refKey/Names per assumption type.

I'm trying to wrap my head around this and it seems easy, but I just can't figure out how to do it. Any ideas? Maybe there is a concept I'm missing because I just haven't encountered it yet. I'm okay with hardcoding the column names assName1, assName2 etc into my query, but even then I'm unsure how to "lookup" the assNames from the Assumptions table when it seems like I'm looking up from the same table for multiple columns in my result.

EDIT: I omitted something really important. refKey is re-used in the Assumptions table, so an assName is uniquely determined by the combination of assType and refKey. I apologize for not making that clear in my example! I forgot about that until I looked at the answers. I have changed my example to reflect this as well.

EDIT2: I am using MS SQL Server.

EDIT3: I expect to find a match in the Assumptions table for every plan. If not, I would have bigger problems - unrelated to this question though.

+1  A: 
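SELECT p.PlanKey, a1.assName, a2.assName, a3.assName, a4.assName
FROM Plans AS p
-- refKey is reused across assumption types, so each join also filters on assType
LEFT JOIN Assumptions AS a1 ON p.ass1 = a1.refKey AND a1.assType = 'ass1'
LEFT JOIN Assumptions AS a2 ON p.ass2 = a2.refKey AND a2.assType = 'ass2'
LEFT JOIN Assumptions AS a3 ON p.ass3 = a3.refKey AND a3.assType = 'ass3'
LEFT JOIN Assumptions AS a4 ON p.ass4 = a4.refKey AND a4.assType = 'ass4'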
Mchl
That assumes there's a supporting record for each column
OMG Ponies
+1 @OMG Ponies - you need a left join not an inner
JNK
yes, I made that assumption.
Mchl
@OMG Ponies. sorry. can you explain what you mean by "supporting record"?
oob
@sqlnoob: he means that the original query, using INNER joins, assumed there was always a matching record in both the Plans and Assumptions tables. If you use an INNER join, you will only get rows where there is a record in both. A LEFT join shows you all records in the left (first) table regardless of whether there is a matching record in the right table.
JNK
@sqlnoob: ...what JNK said. `PLANS` is the parent table in this relationship; `ASSUMPTIONS` is the child (multiple times over for this example). A query using an INNER JOIN will only return a record if the relationship exists - Mchl's query was correct based on the example data, but the safe assumption (pun intended) is that this isn't the case - so an OUTER join is safer until we know your data better.
OMG Ponies
ohhhhh ok. yeah i knew about the difference between inner and left, but i was just caught up on "supporting record". I will add this to my question. If there isn't a "supporting record", I would have a bigger, unrelated problem, so my assumption is that there will be a match for every record. In that case, would "INNER" be more efficient than "LEFT"?
oob
INNER generally performs faster, but more important is accuracy. It isn't just a case of there being a match for every row - there would have to be a match for every column in every row. Does every plan have exactly 500 assumptions?
Tom H.
@sqlnoob: Use LEFT. If you expect a matching record for every row, you will get the same result, but you will get a NULL if you are missing one. If you use an INNER it just won't return the unmatched records and you will be none the wiser.
JNK
@Tom H. yes every plan has exactly N assumptions, meaning every plan has the same number of assumptions (it is actually somewhere between 500 and 1000)
oob
@JNK. ok cool. I would have to check the results either way. I guess it would then come down to checking against the number of expected records or just looking for nulls.
oob
+6  A: 

You have to LEFT JOIN onto the ASSUMPTIONS table for every ass# column in the PLANS table:

   SELECT p.plankey,
          a1.assname,
          a2.assname,
          a3.assname,
          a4.assname
     FROM PLANS p
LEFT JOIN ASSUMPTIONS a1 ON a1.refkey = p.ass1
                        AND a1.asstype = 'ass1'
LEFT JOIN ASSUMPTIONS a2 ON a2.refkey = p.ass2
                        AND a2.asstype = 'ass2'
LEFT JOIN ASSUMPTIONS a3 ON a3.refkey = p.ass3
                        AND a3.asstype = 'ass3'
LEFT JOIN ASSUMPTIONS a4 ON a4.refkey = p.ass4
                        AND a4.asstype = 'ass4'

Without knowing the database, I can't provide the syntax for dynamic SQL to construct the query for a varying number of joins that need to be performed.
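
For what it's worth, the join list is mechanical, so one option is to generate the statement text outside the database. Below is a minimal Python sketch, assuming the columns really are named ass1 through assN and that Assumptions has the refKey, assType and assName columns shown in the question. The names build_plan_query and demo_rows are hypothetical, and SQLite (through Python's built-in sqlite3 module) stands in for SQL Server only so the example can run anywhere.

```python
import sqlite3


def build_plan_query(n_types: int) -> str:
    """Build a SELECT with one LEFT JOIN per ass# column (ass1..assN)."""
    select_cols = ["p.Plankey"]
    joins = []
    for i in range(1, n_types + 1):
        # Alias each looked-up name so the result columns are distinguishable.
        select_cols.append(f"a{i}.assName AS assName{i}")
        # refKey is reused across types, so the join also filters on assType.
        joins.append(
            f"LEFT JOIN Assumptions a{i} "
            f"ON a{i}.refKey = p.ass{i} AND a{i}.assType = 'ass{i}'"
        )
    return (
        "SELECT " + ", ".join(select_cols)
        + "\nFROM Plans p\n"
        + "\n".join(joins)
    )


def demo_rows():
    """Run the generated query against the sample data from the question."""
    conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in here
    conn.executescript("""
        CREATE TABLE Plans (Plankey TEXT, ass1 INT, ass2 INT, ass3 INT, ass4 INT);
        INSERT INTO Plans VALUES ('aplan', 0, 6, 5, 7), ('bplan', 2, 0, 7, 4);
        CREATE TABLE Assumptions (assType TEXT, refKey INT, assName TEXT);
        INSERT INTO Assumptions VALUES
            ('ass1', 0, 'gender factors a'), ('ass1', 2, 'gender factors b'),
            ('ass2', 0, 'age factors a'), ('ass2', 6, 'age factors b'),
            ('ass3', 5, 'inflation factors a'), ('ass3', 7, 'inflation factors b'),
            ('ass4', 4, 'tax factors a'), ('ass4', 7, 'tax factors b');
    """)
    rows = conn.execute(build_plan_query(4) + "\nORDER BY p.Plankey").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(build_plan_query(4))
    for row in demo_rows():
        print(row)
```

The generated text can be pasted into a query window or run through whatever driver you use. SQLite caps a single join at 64 tables, which is why the demo only uses four columns; whether one statement with several hundred joins is practical on your server is something to test.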

OMG Ponies
oob
@sqlnoob: Updated, did I understand your correction properly?
OMG Ponies
yup! thanks. that is probably much faster than the text concatenation i proposed. duh.
oob
@sqlnoob: Text *concatenation*?! *Ewwwww*... :)
OMG Ponies
@OMG Ponies - whoosh!
oob
@sqlnoob: Joining on text concatenation - bad, very bad. Avoid if at all possible cuz it can't use an index.
OMG Ponies
This solution worked for me. The only thing I added was naming the columns of the results using aliases: a1.assname as assname1, a2.assname as assname2, etc. (in the select portion of the query)
oob
@sqlnoob: Column aliases cost extra ;)
OMG Ponies
@OMG Ponies - oh okay. i thought maybe, just maybe you were making some pun with the word concatenation and i didn't get it. ;)
oob
+1  A: 

This would be pretty straightforward in a programming language. You would read the entire Assumptions table, then read the Plans table, substituting the assName for the four refKeys in each Plans row.
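
A minimal Python sketch of that read-then-substitute idea, assuming the rows have already been fetched from the two tables (the sample data from the question is hard-coded here in place of a real database read). The names build_lookup and resolve_plans are hypothetical. The lookup is keyed on the (assType, refKey) pair because refKey alone is not unique.

```python
from typing import Dict, List, Optional, Tuple

# Sample rows copied from the question; in practice these come from the database.
ASSUMPTIONS: List[Tuple[str, int, str]] = [
    ("ass1", 0, "gender factors a"), ("ass1", 2, "gender factors b"),
    ("ass2", 0, "age factors a"), ("ass2", 6, "age factors b"),
    ("ass3", 5, "inflation factors a"), ("ass3", 7, "inflation factors b"),
    ("ass4", 4, "tax factors a"), ("ass4", 7, "tax factors b"),
]
PLANS: List[Dict[str, object]] = [
    {"Plankey": "aplan", "ass1": 0, "ass2": 6, "ass3": 5, "ass4": 7},
    {"Plankey": "bplan", "ass1": 2, "ass2": 0, "ass3": 7, "ass4": 4},
]


def build_lookup(assumptions) -> Dict[Tuple[str, int], str]:
    """Index the Assumptions rows by (assType, refKey)."""
    return {(ass_type, ref_key): name for ass_type, ref_key, name in assumptions}


def resolve_plans(plans, lookup, ass_columns) -> List[Tuple[Optional[str], ...]]:
    """Replace each refKey in a plan row with the matching assName.

    A missing match yields None, which mirrors the NULL a LEFT JOIN would give.
    """
    resolved = []
    for plan in plans:
        names = [lookup.get((col, plan[col])) for col in ass_columns]
        resolved.append((plan["Plankey"], *names))
    return resolved


if __name__ == "__main__":
    columns = ["ass1", "ass2", "ass3", "ass4"]
    for row in resolve_plans(PLANS, build_lookup(ASSUMPTIONS), columns):
        print(row)
```

With 500+ types the column list would be built in a loop rather than typed out, and the whole Assumptions table has to fit in memory for this to be worthwhile.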

OMG Ponies has the most popular SQL answer at the time I'm editing this answer.

Gilbert Le Blanc
straightforward maybe, but using correlated subqueries which are a problem for MySQL for example
Mchl
It appears that Daniel Vassallo deleted his answer. I'm editing my answer to reflect this.
Gilbert Le Blanc
+1  A: 

The best way to deal with this would be to turn the ass* columns of your Plans table into rows, using a sub-query or a view on that table.

If you're using SQL Server, you can use UNPIVOT to do it. I'm not sure what other DBMSs support UNPIVOT though, so let us know which one you're using. If you are using SQL Server I'll provide an example.

http://msdn.microsoft.com/en-us/library/ms177410.aspx

UPDATE:

Here's a query that will transform the Plans table into a form that is easy to join to. However, I have overlooked the fact that the result you want is still 'pivoted', so it's not going to be much use unless you can accept the query results in a more 'normalized' form. For the 'pivoted' result, other suggestions are much better.

Here's the query anyway, for anyone that's interested:

SELECT [PVT].[PlanKey], [PVT].[RefKey], [PVT].[Ass]
FROM   [Plans] [P]
       UNPIVOT
       ([RefKey] FOR [Ass] IN ([ass1], [ass2], [ass3], [ass4])) [PVT]

Apologies if it's not of any use - my mistake.

Alex Humphrey
Thanks. I definitely did not know about unpivot. I didn't use it this time around, but I'm glad I know about it now!
oob
+1  A: 

You could try normalizing the Plans table as an inline query and then joining to that. I don't know which would be worse - 500 unions or 500 joins.

SELECT
    SQ.Plankey,
    A.assName
FROM
(

    SELECT Plankey, ass1 AS assRefKey, 'ass1' AS assType FROM Plans WHERE ass1 IS NOT NULL UNION ALL
    SELECT Plankey, ass2, 'ass2' AS assType FROM Plans WHERE ass2 IS NOT NULL UNION ALL
    ...
    SELECT Plankey, ass500, 'ass500' AS assType FROM Plans WHERE ass500 IS NOT NULL
) SQ
INNER JOIN Assumptions A ON
    A.refKey = SQ.assRefKey AND
    A.assType = SQ.assType

The resultset won't be in the format that you requested, but it might be better for some uses of the data. Either way, that's one ugly database design. You'd think that by the time they got to ass100 that they would have realized that they might be doing something wrong.

If you're going to use this approach, you might want to just create a view for that inner query and query against the view.
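
A rough Python sketch of that view idea, assuming the columns are named ass1 through assN as in the question. The view name PlanAssumptions and the functions build_unpivot_select and normalized_names are hypothetical, and SQLite (through Python's built-in sqlite3 module) is used only as a stand-in so the example runs without a SQL Server instance. The UNION ALL text is generated in a loop instead of being typed out 500 times.

```python
import sqlite3

SAMPLE_DATA = """
    CREATE TABLE Plans (Plankey TEXT, ass1 INT, ass2 INT, ass3 INT, ass4 INT);
    INSERT INTO Plans VALUES ('aplan', 0, 6, 5, 7), ('bplan', 2, 0, 7, 4);
    CREATE TABLE Assumptions (assType TEXT, refKey INT, assName TEXT);
    INSERT INTO Assumptions VALUES
        ('ass1', 0, 'gender factors a'), ('ass1', 2, 'gender factors b'),
        ('ass2', 0, 'age factors a'), ('ass2', 6, 'age factors b'),
        ('ass3', 5, 'inflation factors a'), ('ass3', 7, 'inflation factors b'),
        ('ass4', 4, 'tax factors a'), ('ass4', 7, 'tax factors b');
"""


def build_unpivot_select(n_types: int) -> str:
    """Build one SELECT per ass# column, glued together with UNION ALL."""
    parts = [
        f"SELECT Plankey, ass{i} AS assRefKey, 'ass{i}' AS assType "
        f"FROM Plans WHERE ass{i} IS NOT NULL"
        for i in range(1, n_types + 1)
    ]
    return "\nUNION ALL\n".join(parts)


def normalized_names(n_types: int = 4):
    """Create the view, join it to Assumptions, return one row per plan/type."""
    conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in here
    conn.executescript(SAMPLE_DATA)
    conn.execute("CREATE VIEW PlanAssumptions AS\n" + build_unpivot_select(n_types))
    rows = conn.execute("""
        SELECT v.Plankey, v.assType, a.assName
        FROM PlanAssumptions v
        INNER JOIN Assumptions a
            ON a.refKey = v.assRefKey AND a.assType = v.assType
        ORDER BY v.Plankey, v.assType
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in normalized_names():
        print(row)
```

This returns one row per plan and assumption type rather than one wide row per plan, so it only helps if the consumer can work with the normalized shape.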

Tom H.