SQL Server 2008 + expensive union all

views:

answers:

SQL Server 2008 + expensive union all

Hi al,

we have 5 tables over which we should query with user search input throughout a stored procedure. We do a union all of the similar data inside a view. Because of this the view can not be materialized. We are not able to change these 5 tables drastically (like creating a 6th table that contains the similar data of the 5 tables and reference that new one from the 5 tables).

The query is rather expensive / slow

what are our other options? It's allowed to think outside the box. Unfortunately I cannot give more information like the table/view/SP definition because of customer confidentiality...

greetings, Tim

Try changing the UNION to UNION ALL

UNION is more expensive than UNION ALL because it will remove duplicates

Also do you have indexes on those columns and are the queries that you are writing SARGable, if you for example have to look in the middle of a column then indexes won't be used

SQLMenace 2010-06-03 15:51:17

sorry, we already do that. I will change my question.

Tim Mahy 2010-06-03 15:59:21

You need to classify your tables in order of importance and level. You start with your basic table and you build on top of that.

Depending on the size of the other tables, and they information they pertain you join them differently. The order in which you join also matters.

If you could please post a little more about your tables, what they contain, how they are used etc, I would be able to give u an example.

More specifically, do all 5 tables contain the same information, and u join them, to query a 'super set' of data? or are they 5 tables with various attributes that together make a resultset that you need?

Theofanis Pantelides 2010-06-03 16:13:33

It all depends a bit on the way you query the view, and what kind of data is in there => do the 5 tables contain 'separate' data, or is there overlap that needs to be removed again from the end-result ? (if there is no overlap, make sure to use UNION ALL)

Assuming you query the view using WHERE statements :

Assuming the queries are 'predictable', make sure to have proper indexes
If there is no overlap between the tables, are there fields that clearly distinguish what data comes from which table ? You might want to add CHECK constraints on said tables and try to make the view act like an "old-school" partitioned view. (see here)

And as always, make sure statistics are up-to-date, run a manual update after big (or many) changes to a table (simply running sp_updatestats now end then never hurts IMHO) !

deroby 2010-06-03 20:10:49

UNION ALL is literally one of the cheapest operations that exists in SQL Server. It simply takes the result sets from each candidate query and concatenates the results. UNION is different, it needs to perform a DISTINCT sort, but if you are using UNION ALL then I can guarantee you that the concatenation itself is not the problem.

There are a few reasons why your query might be performing poorly:

The underlying queries being concatenated are inefficient, due to poor indexing or other factors. Post the execution plan.
You're applying the UNION ALL before the predicates, i.e. assuming that UNION ALL obeys the distributive law (it doesn't). This turns a simple post-concatenation into a table concatenation and very expensive table scan.
The plan itself is not slow, but you are selecting a huge number of results. If so, then there's not much you can do other than to try to limit the result size (normalize?).

Edit - I just reread the question and noticed this line:

We do a union all of the similar data inside a view

The problem is definitely #2 - what you're essentially doing is concatenating the entire tables and then applying predicates, which destroys any chance the optimizer might have had of effectively using any indexes on the underlying tables. Every query will end up as 5 expensive table scans.

The only way you're going to solve this is by either materializing the view (which you say you can't do) or getting rid of it and substituting a stored procedure or UDF. You need to treat individual tables individually for best performance; concatenating them all together and trying to query against the concatenated results will always be (very) slow.

Aaronaught 2010-06-03 20:17:33

yes I want to materialize the view, but Sql Server does not allow it because of the union all :(

Tim Mahy 2010-06-03 21:12:04

@Tim: Not every problem can be solved by views. This is one of those cases where you'll need to query the underlying tables and possibly duplicate the `UNION ALL` logic in various queries, instead of just querying the view. When SQL Server doesn't let you materialize/index a view, that's a sure sign that you're headed for performance problems.

Aaronaught 2010-06-03 23:40:18

ansaurus

tags:

views:

answers:

SQL Server 2008 + expensive union all

related questions