views:

942

answers:

4

I'm working with a medical-record system that stores data in a construct that resembles a spreadsheet--date/time in column headers, measurements (e.g. physician name, Rh, blood type) in first column of each row, and a value in the intersecting cell. Reports that are based on this construct often require 10 or more of these measures to be displayed.

For reporting purposes, the dataset needs to have one row for each patient, the date/time the measurement was taken, and a column for each measurement. In essence, one needs to pivot the construct by 90 degrees.

At one point, I actually used SQL Server's PIVOT functionality to do just that. For a variety of reasons, it became apparent that this approach wouldn't work. I decided that I would use an inline view (IV) to massage the data into the desired format. The simplified query resembles:

SELECT patient_id, 
       datetime, 
       m1.value AS physician_name, 
       m2.value AS blood_type, 
       m3.value AS rh
  FROM patient_table
INNER JOIN ( complex query here
              WHERE measure_id=1) m1...
INNER JOIN (complex query here
              WHERE measure_id=2) m2...
LEFT OUTER JOIN (complex query here
                 WHERE measure_id=3) m3...

As you can see, in some cases these IVs are used to restrict the resulting dataset (INNER JOIN), in other cases they do not restrict the dataset (LEFT OUTER JOIN). However, the 'complex query' part is essentially the same for each of these measure, except for the difference in measure_id. While this approach works, it leads to fairly large SQL statements, limits reuse, and exposes the query to errors.

My thought was to replace the 'complex query' and WHERE clause with a Inline Table-Value UDF. This would simplify the queries quite a bit, reduce errors, and increase code reuse. The only question on my mind is performance. Will the UDF approach lead to significant decreases in performance? Might it improve matters?

Thanks for your time and consideration.

+1  A: 

You also have a third option; a traditional VIEW (assuming that you have a key to join to). In theory, there shouldn't be a performance difference between the three options because SQL Server should evaluate and optimize the plans accordingly. The reality is that sometimes that doesn't happen as well as we'd like.

The benefit of a traditional view is that you could make it an indexed view, and give SQL Server another performance aid; however, you'll just have to test and see.

Stuart Ainsworth
Traditional views won't work at this client as the DBA is resistant to having too may non-vendor-defined objects in the DB. As a result, most of the queries are embedded in each report (we are using CR).The reason I thought about UDF is their modularity. The hard part is selling the DBA on the idea...Is an indexed view similar to Oracle's materialized view?
Craig
I haven't worked with Oracle in a long time, so I don't know. However, I would point out that a UDF is a non-vendor-defined object, so you may run into the same pushback there. I believe astander's answer with a temp table or variable may be your best bet if you're limited to modifying objects.Just as an aside, I tend to create a secondary db with cross-linked tables, etc, when dealing with vendor dbs. This way I can have custom code outside of their database; you may want to explore that as an option in the future.
Stuart Ainsworth
Indexed views are the same as Oracles materialized view. The view must conform to certain restrictions: http://msdn.microsoft.com/en-us/library/ms191432.aspx and only Enterprise Edition will consider using the indexed view for data access. Standard Edition and lower editions must explicitly force the use of the iindexed view by adding the NOEXPAND hint, see http://msdn.microsoft.com/en-us/library/ms181151.aspx
Remus Rusanu
You are right, it is also a NVDO. I figured that I might be able to get approval for a few of these rather than a larger number of view. We'll see...Can I use temp table or variables in 'straight' SQL? If so, that is news to me. I've always used them in SPs and UDFs.A secondary DB is also a good idea. Obviously, it would contain the custom objects, but how do you access the NVD objects?
Craig
@Remus: Thanks, I'll keep that in mind.
Craig
Do you mean the vendor objects? Connect to the custom database like you normally do, and then do something like SELECT * FROM VendorDB.dbo.Table?Did I misunderstand your question?
Stuart Ainsworth
+1  A: 

Sql Server 2005 answer: You can reduce the inline view by using temp/var tables. Performace issues on these are the temp inserts you require per hit on the query, but if the result sets are small enough, they can help. you can use primary keys on var tables, and primary keys/ indexes on temp tables. Other than normal belive, i have found a couple of articles indicating that both temp/var tables are stored in the temp db.

UDF functions, we have found to be less performant, when you have multi layer udfs in complex queries, but will maintain usability. Be sure to create the function correctly for the various conditions specified. Those that WILL be used for inner joins, and those that will be used for left joins.

So, in general. We do use UDFs, but when we find that the performance degrade, we move the query to insert UDF selections into temp/var tables and join on those.

Create functionality for ease of use/maintinance, and apply performance inhancements where and when required.

EDIT:

If you are required to run this for crystal, and you plan to use Stored procedures, Yes, you can execute sql statements inside the SP to temp/var tables.

Let me know if you are going to use SPs. Sql will then also cache the sp plans with given params as requied.

Also from previous experiance with crystal, things to avoid, is grouping in Crystal that can be done in the SP, page numbers if not required. and function calls, if this can be handled on the server.

astander
Pardon my ignorance, but can I use a temp/var table in a standard SQL statement? It needs run from within a reporting tool (Crystal Reports). I'm guess your solution requires a SP, right? If not, could you expand on this approach a bit more?We are using SQL Server 2005.
Craig
table vars can be added to sql executed against the DB. Are you using the sql from the app, or in Crystal?
astander
From within Crystal Reports' Command object--essentially a raw-SQL statement. I've noticed that the database drive (i.e. ODBC or OLEDB) has an effect on the SQL's functionality. For example, I've had issues in the past using the WITH statement with OLEDB, but not with ODBC. This may affect my ability to use more advanced techniques.
Craig
Stored procedures are going to be a tough sell to the DBA.
Craig
Remote DB Calls can be expesive. Raw Crystal calls can be costly. Your DBA is leaving you with very little, how about them maning the phones for a day X-) joke. K, is the database a Datawarehouse? Does the data get updated regularly, or maybe once a while, Hourly, Daily?
astander
It's a nightly extract from a Cache system. It's a database and a poorly-designed on at that. I've been pushing for additional ETL logic to be generated after the Cache extract, but no decisions have been made so far. As a result, most report use raw, complex, and costly SQL (minimal indexing).
Craig
Then i might say, until you get the approval, see if you can design your own db, import their data, after they import theirs, and design it the way you require. Create the DB with their base BAD tables. import the data straight, and manipulate into your structure and then work from there. we had to do this lately, with aministrators of third party vendors.
astander
+3  A: 

A correctly defined TVF will not introduce any problem. You'll find many claims on the interned blasting TVFs for performance problems as compared to views or temp tables and variables. What is usualy not understood is that a TVF behaves differently from a view. A View definition is placed into the original query and then the optimizer wil rearrange the query tree as it sees fit (unless the NOEXPAND clause is used on indexed views). A TVF has different semantics and sometimes, specially when updating data, this results in the TVF output being spooled for haloween protection. It helps to mark the function WITH SCHEMABINDING, see Improving query plans with the SCHEMABINDING option on T-SQL UDFs.

Alsois important to understant the concepts of deterministic and precise function. Although they apply mostly to scalar value funcitons, TVFs can be also affected. See User-Defined Function Design Guidelines.

Remus Rusanu
Wow, quite a few subtle nuances. Thanks for your insight.
Craig
+1  A: 

Since you need a SQL String and may not have the ability to add a view or UDF to the system, you may want to use WITH ... AS to limit the complex query to one place (At least for this statement.).

WITH complex(patientid, datetime, measure_id, value) AS
(Select... Complex Query)
SELECT patient_id
,        datetime
,        m1.value AS physician_name
,        m2.value AS blood_type
,        m3.value AS rh  
FROM patient_table
INNER JOIN (Select ,,,, From complex WHERE measure_id=1) m1...
INNER JOIN (Select ,,,, From complex WHERE measure_id=2) m2...
LEFT OUTER JOIN (Select ,,,, From complex WHERE measure_id=3) m3...
Jeff O
Assuming that the 'complex query' returns 1,000,000 rows without a WHERE clause (it doesn't, but it helps to illustrate my question). Will the DB generate the 1,000,000 rows once, the select from it for each join (three in my example). Or does it do a better job of optimization than this?I don't suppose that I can add INDEX hints to it? I know that the underlying tables are not indexed well and our DB will not add any non-vendor-defined indicies.
Craig
The common table expression (complex) will be created once and in this case called multiple times. SQL server probably would optimize calling he same query multiple times, but a very complex query may not. Your DBA should provide some assistance in testing your query for performance.
Jeff O