views:

130

answers:

3

There are some constraints to this question; I don't have the ability to fundamentally alter any database structure.

The challenge here is that I have rows in a database that contain information that really should be on its own row. A somewhat simplified example of the column structure:

[PersonID] [FirstName] [LastName] [FirstNameGuest1] [LastNameGuest1]
 1          Ringo       Starr      John              Lennon
 2          George      Harrison   Paul              McCartney

I need to split these out like this in order to be able to run the reports I need:

[PersonID] [FirstName] [LastName]
 1          Ringo       Starr
 1          John        Lennon
 2          George      Harrison
 2          Paul        McCartney

Since I'm using this to produce a view, I've had to just reference the same table for each group of guest columns and use UNION ALLs to tie them all together.

However, since then I've had to build queries of ever-increasing complexity on top of the derived view. Each layer of complexity causes the results to return more and more slowly.

Am I taking a fundamentally bad approach? Are there other, more correct ways to model the data in the way that I need?

Here's part of the actual query so you can see what I'm dealing with:

--Primary Record
SELECT
 'Franchisee' AS 'Type', 
 Confirmation AS 'BelongingTo', 
 0 AS 'GuestNo', 
 FirstName, 
 LastName, 
 FF_27557_152972 AS 'HotelChoice', 
 HotelCheckIn, 
 HotelCheckOut, 
 HotelSmoking AS 'Smoking', 
 (CASE FF_27554_1 WHEN 'Yes' THEN 1 ELSE 0 END) AS 'PrimaryRoomHolder', 
 '' AS 'SharingWith', 
 'None' AS 'SharingWithName'
FROM dbo.[Table]
WHERE Type = 'Production' AND Submitted = 1 AND Cancelled = 0 AND Label = 'Primary'

UNION ALL
-- First Guest
SELECT
'Guest' AS 'Type',
Confirmation AS 'BelongingTo',
1 AS 'GuestNo', 
FF_27637_1 AS 'FirstName', 
FF_27637_152806 AS 'LastName', 
FF_27637_152822 AS 'HotelChoice', 
FF_27637_152813 AS 'HotelCheckIn', 
FF_27637_152821 AS 'HotelCheckOut', 
FF_27637_152824 AS 'Smoking', 
(CASE WHEN FF_27637_152822 IS NOT NULL THEN 1 ELSE 0 END) AS 'PrimaryRoomHolder', 
FF_27637_154245 AS 'SharingWith', 
(CASE CAST(FF_27637_154245 AS integer) 
  WHEN 0 THEN FirstName + ' ' + LastName 
  WHEN 1 THEN FF_27637_1 + ' ' + FF_27637_152806 
  WHEN 2 THEN FF_27742_1 + ' ' + FF_27742_153577 
  WHEN 3 THEN FF_27638_1 + ' ' + FF_27638_152814 
  WHEN 4 THEN FF_27639_1 + ' ' + FF_27639_152817 
  WHEN 5 THEN FF_27640_1 + ' ' + FF_27640_152852
  WHEN 6 THEN FF_27641_1 + ' ' + FF_27641_152860 
  WHEN 7 THEN FF_27642_1 + ' ' + FF_27642_152868 
  WHEN 8 THEN FF_27643_1 + ' ' + FF_27643_152877
  WHEN 9 THEN FF_27644_1 + ' ' + FF_27644_152885 
  WHEN 10 THEN FF_27645_1 + ' ' + FF_27645_152893 
  ELSE 'None' END) AS 'SharingWithName'
FROM dbo.Event_213_1546 AS Event_213_1546_10
WHERE Type = 'Production' AND Submitted = 1 AND Cancelled = 0 AND Label = 'Primary' AND FF_27637_1 IS NOT NULL
.
.
.
(Iterates through 9 more guests exactly like "First Guest")
+2  A: 

Depends on what you are trying to do with the view.
The view query has to execute before whatever operations you perform on it. So if you are doing things that only deal with 10% of the results returned by the view, there's a waste of processing. It's likely the operations performed could be done at the same time as the logic already in the view.

Can you add indexes if necessary? That would help...

A materialized view, called an indexed view in SQL Server terminology, is a viable possibility but notoriously unaccommodating.

OMG Ponies
Technically, views do not necessarily "execute before" other operations. In fact, I believe views are expanded in the context of an enclosing query and then the whole business gets optimized as one plan. Exceptions would be when using the "noexpand" hint.
onupdatecascade
@onupdatecascade: In order for operations to occur on the resultset provided by a view, the view has to execute first. Effectively, a subquery/derived table. `SELECT LEFT(...) FROM v_view` has to run the view query first - the view query is not interrupted to add the additional criteria.
OMG Ponies
Curious, I've run select statements on views that run faster if I omit a choice column from the view (This column is based on a poorly designed UDF.). If the view is required to run first, they would perform the same?
Jeff O
It's plausible that the query runs faster because you omitted the column afterwards - the query plan would already exist. The first time would have been the hard parse, subsequent ones are soft parses.
OMG Ponies
@OMGPonies - it depends on the content of the view, and the rules for optimization. If there's a function in there (like your left() example) then it might be necessary to process all the rows in the view, but that's because of the function, not the presence of the view. The rules for optimization would be the same if you copied and pasted the view definition into the outer query as a derived table.
onupdatecascade
See: http://technet.microsoft.com/en-us/library/cc917715.aspx "Using the NOEXPAND view hint" This can be proven with some query plan examination
onupdatecascade
+1  A: 

Have you considered the UNPIVOT operator? It does the same thing, but might be less painful (maybe! :-). Requires 2005 or later.

http://blogs.msdn.com/craigfr/archive/2007/07/17/the-unpivot-operator.aspx

Example:

declare @names table ( personid int, 
    firstname1 varchar(50), 
    lastname1 varchar(50),
    firstname2 varchar(50), 
    lastname2 varchar(50),
    firstname3 varchar(50), 
    lastname3 varchar(50)

    -- <etc.>
    )

Insert @names values ( 1, 'Fred', 'Flintstone', 'Barney', 'Rubble', 'Wilma', 'Flintstone' )
Insert @names values ( 2, 'Super', 'Man', 'Aqua', 'Man', 'Wonder', 'Woman' )

select * from @names

select personid, firstnamecol, firstname, lastnamecol, lastname
from @names
unpivot( firstname for firstnamecol in ( firstname1, firstname2, firstname3 ) ) firstnames
unpivot( lastname for lastnamecol in ( lastname1, lastname2, lastname3 ) ) lastnames
where right(firstnamecol, 1) = right( lastnamecol, 1 ) -- This is the tricky bit

Using multiple unpivots in one select is tricky; I got the above notion from these guys:

http://weblogs.sqlteam.com/jeffs/archive/2008/04/23/unpivot.aspx http://mangalpardeshi.blogspot.com/2009/04/unpivot-multiple-columns.html

It's important to note that the part marked "tricky" will break with varying numbers of digits in the names of the repeating columns (e.g. FirstName11 and FirstName1 will be a problem). You could work around that with some substring() tricks. Might not be the best idea, but...

Addition, Feb 22: This is a PHENOMENAL article on unpivot: http://bradsruminations.blogspot.com/2010/02/spotlight-on-unpivot-part-1.html

onupdatecascade
Wow, I didn't know about the unpivot operator. Conceptually, this is exactly what I need.
DanielMason
Just be careful - I don't know what sort of performance impact you can expect. Could be terrible :-)
onupdatecascade
A: 
DanielMason