views: 176
answers: 8

+1  Q: 

Optimizing unions

Hi, I'm having trouble trying to optimize the following query for SQL Server 2005. Does anyone know how I could improve it? Each of the tables used here has about 40 million rows. I've tried my best to optimize it, but I've managed to do the exact opposite.

Thanks

SELECT
        cos
      , SIN
    FROM
        ConSisHis2005
    union all
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2006
    union all
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2007
    UNION ALL
    SELECT
        cos
      , SIN
    FROM
        ConSisHis2008

Maybe I should have said something else about the schema: all the tables used here are historical tables; they are not referenced by any other table. And there's already an index on cos and SIN. I was just wondering if there was any other way to optimize the query... as you can imagine, 160 million records are hard to get :s

+1  A: 

Put a composite index on cos and sin on each of the tables. That's as good as you're going to get without restructuring the table design (in this example, it looks like you should have had just one table to begin with).
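A minimal sketch of that suggestion, using the table and column names from the question (the index names are made up). Since the query selects only these two columns, such an index is covering, and SQL Server can scan it instead of the wider data pages:

```sql
-- One covering composite index per yearly table (index names are illustrative).
CREATE INDEX IX_ConSisHis2005_cos_SIN ON ConSisHis2005 (cos, [SIN]);
CREATE INDEX IX_ConSisHis2006_cos_SIN ON ConSisHis2006 (cos, [SIN]);
CREATE INDEX IX_ConSisHis2007_cos_SIN ON ConSisHis2007 (cos, [SIN]);
CREATE INDEX IX_ConSisHis2008_cos_SIN ON ConSisHis2008 (cos, [SIN]);
```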

Dave Markle
How would the indexes help, since he's not filtering by anything?
Rowan
Since you're only selecting those two columns, SQL Server can get the data directly from the composite index instead of having to hit the actual data rows. It will only be an improvement if you have other columns in your tables.
Joseph Kingry
Oh yeah, didn't think of that.
Rowan
Thanks for the answer; unfortunately the tables already have a composite index on the columns I'm fetching.
Alan FL
Then you really need to back up and ask yourself if the architecture you have selected for this table design/report logic is what you really want...
Dave Markle
+2  A: 

It seems that the query is just combining the separated history tables into a single result set containing all the data. In that case the query is already optimal.

Rowan
I was afraid someone would say that... thank you.
Alan FL
+1  A: 

Since there is no WHERE clause, I don't believe there's anything you can do to improve the performance from this PoV.

You've correctly used UNION ALL so there's no help there.
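For context, the distinction matters here: a plain UNION would force a deduplication pass (a sort or hash) over all 160 million rows, while UNION ALL simply concatenates the result sets. A two-table sketch, reusing the tables from the question:

```sql
-- UNION deduplicates: SQL Server must sort or hash every row (expensive at 160M rows).
SELECT cos, [SIN] FROM ConSisHis2005
UNION
SELECT cos, [SIN] FROM ConSisHis2006;

-- UNION ALL just appends one result set to the other: no dedup pass.
SELECT cos, [SIN] FROM ConSisHis2005
UNION ALL
SELECT cos, [SIN] FROM ConSisHis2006;
```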

The only other thing I can think of is whether there are more columns on the tables? If so, you might be fetching more data from disk than you need, thus slowing the query down.

cagcowboy
The tables have about 10 more columns, and each table has a composite index on those two columns... it looks like it won't get any better than this. Thanks for the answer.
Alan FL
+2  A: 

Another approach would be to tackle the question of why you need all 160 million rows. If you are doing some kind of reporting, can you create separate reporting tables that already have some of the data aggregated? Or do you actually need a data warehouse to support your reporting needs?
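As an illustration of the pre-aggregation idea (the summary table name, the grouping key, and the aggregate are hypothetical; the right rollup depends on what the reports actually need):

```sql
-- Hypothetical rollup: aggregate each yearly table once, then report
-- from the small summary table instead of scanning 40M raw rows.
SELECT cos, [SIN], COUNT(*) AS row_count
INTO ConSisHisSummary2005
FROM ConSisHis2005
GROUP BY cos, [SIN];
```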

Rowan
I need those 160 million rows for reporting purposes, but it's quite clear I'll have to come up with a different approach. I was trying to avoid having to do this, but I guess I'll have to... getting those 160 million rows fast is hard :p Thanks.
Alan FL
+1  A: 

It might be worth experimenting with indexed views. You could put the above statement into a view with the indexes Dave suggested. This would take a little time to build initially but would return your results a little quicker (this is on the assumption that the data set does not change much and therefore you can live with the extra transactional overhead).

Chris Simpson
I'm going to try different indexes, but I don't think I'll see any big improvements. Thank you.
Alan FL
A: 

Do you have/need duplicate entries?

Maybe you could filter those out to get fewer rows?

Fredou
No, there are no duplicate records in the tables. And even if there were duplicates, getting rid of them would be even more expensive.
Alan FL
A: 

You might consider using a single partitioned table with a year indicator.
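A rough sketch of that idea on SQL Server 2005 (the partition function/scheme names, the YearId column, and the data types are all made up for illustration; the real DDL would depend on the existing schema and filegroups):

```sql
-- Hypothetical names throughout: one table partitioned by year
-- instead of four separate yearly tables.
CREATE PARTITION FUNCTION pfConSisYear (int)
    AS RANGE RIGHT FOR VALUES (2006, 2007, 2008);

CREATE PARTITION SCHEME psConSisYear
    AS PARTITION pfConSisYear ALL TO ([PRIMARY]);

CREATE TABLE ConSisHis (
    YearId int NOT NULL,   -- year indicator used as the partitioning key
    cos    float,          -- data types assumed; not given in the question
    [SIN]  float
) ON psConSisYear (YearId);
```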

I'm still curious - is this code in a view or SP which operates on 160m rows, or is it actually going to return 160m rows down the wire? If the latter, that's an awful lot of data to return; it's effectively an extract, and it's going to take a while just to come down the wire.

Cade Roux
A: 

There's no optimization to be done. Since you're selecting all the records from all the tables, by definition you get all the records from all the tables in one result set.

What's the reason for doing this?

le dorfier