views:

316

answers:

7

I have two tables

(1) MonthlyTarget {SalesManCode, TargetMonthYear, TargetValue}; this table has 1966177 rows.

(2) MonthlySales  {SalesManCode, SaleDate, AchievedValue};

this table has 400310 rows.

I have to make a query that produces a result like the following table:

{SalesManCode, JanTar, JanAch, FebTar, FebAch,....., DecTar, DecAch}

The problem is, joining these two tables taking a long time.

What should be the query?

How can the query be optimized?

I don't want to consider indexing.

+2  A: 

Check for proper indexing on the tables. It is almost impossible to say without looking at the database itself but 99% of the time slow joins are due to improper or missing table indices.

Andrew Hare
A: 

Do you have any filters? Could you store some partial results in a temp-table, then join to the rest of your data after you've reduced the size of your data?

Jay Mooney
+3  A: 

Well, if you don't want to consider indexing, then you will always be preforming full table scans and performance will not be improved.

Lost in Alabama
A: 

If indexing isn't an option, the only other way to speed it up is putting it on a faster server. Something tells me indexing would be easier though.

Justin Balvanz
+1  A: 

I don't want to consider indexing.

You have to consider indexing. It doesn't matter how you write the query, the db engine has no choice but to scan each table looking for the joins and it will likely do that over and over again. You have no choice.

If you don't want to mess with indexing because you don't have control over the database, consider exporting the data to a local SQL Express instance.

Heck, even exporting the data to flat files, sorting the files by SalesManCode and writing a simple program to read and match them would be faster.

ongle
+1  A: 

Short of firing a bunch of sales people, consider the following options:

  • create a batch process to run this nightly and populate some reporting tables with the output of your query
  • create an indexed/materialized view of the query you want (although you will need to create an index on the indexed view, so maybe this breaks your no-indexes rule)
RedFilter
+3  A: 

It looks like you're missing some columns in your MonthlyTarget table, namely a "TargetDate" column.

In addition to what everyone has already said about indexing, sometimes a divide-and-conquer approach can really help. Rather than joining a 1966177 row table to a 400310 row table, create to tiny temp tables and join them together instead:

CREATE TABLE #MonthlySalesAgg
(
    SalesManCode int,
    JanTar money,
    FebTar money,
    MarTar money,
    AprTar money,
    MayTar money,
    JunTar money,
    JulTar money,
    AugTar money,
    SepTar money,
    OctTar money,
    NovTar money,
    DecTar money

    PRIMARY KEY CLUSTERED (SalesManCode)
)

INSERT INTO #MonthlySalesAgg
SELECT *
FROM
(SELECT SalesManCode, TargetValue, SaleMonth = Month(TargetDate) FROM MonthlyTarget) as temp
PIVOT
(
    Max(TargetValue)
    FOR [SaleMonth] IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
) as p

CREATE TABLE #MonthlyTargetAgg
(
    SalesManCode int,
    JanAch money,
    FebAch money,
    MarAch money,
    AprAch money,
    MayAch money,
    JunAch money,
    JulAch money,
    AugAch money,
    SepAch money,
    OctAch money,
    NovAch money,
    DecAch money

    PRIMARY KEY CLUSTERED (SalesManCode)
)

INSERT INTO #MonthlyTargetAgg
SELECT * FROM
(SELECT SalesManCode, AchievedValue, SaleMonth = Month(SaleDate) FROM MonthlySales) as temp
PIVOT
(
    Sum(AchievedValue)
    FOR [SaleMonth] IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
) as p

The queries above create two intermediate tables which should contain the same number of records as your SalesMan table. Joining them is straightforward:

SELECT *
FROM #MonthlyTargetAgg target
INNER JOIN #MonthlySalesAgg sales ON target.SalesManCode = sales.SalesManCode

If you find yourself needing to pull out data by month all the time, move the code into a view instead.

PIVOT requires SQL Server 2005 or higher, and its often a very userful operator. Hopefully SQL Server 2009 will allow users to pivot on more than one column at a time, which will result in an even simpler query than shown above.


Using SQL Server 2000:

PIVOT is syntax sugar. For example,

SELECT * FROM
(SELECT SalesManCode, AchievedValue, SaleMonth = Month(SaleDate) FROM MonthlySales) as temp
PIVOT
(
    Sum(AchievedValue)
    FOR [SaleMonth] IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])
) as p

Becomes

SELECT
    SalesManCode,
    [1] = Sum(case SaleMonth when 1 then AchievedValue else 0 end),
    [2] = Sum(case SaleMonth when 2 then AchievedValue else 0 end),
    [3] = Sum(case SaleMonth when 3 then AchievedValue else 0 end),
    [4] = Sum(case SaleMonth when 4 then AchievedValue else 0 end),
    [5] = Sum(case SaleMonth when 5 then AchievedValue else 0 end),
    [6] = Sum(case SaleMonth when 6 then AchievedValue else 0 end),
    [7] = Sum(case SaleMonth when 7 then AchievedValue else 0 end),
    [8] = Sum(case SaleMonth when 8 then AchievedValue else 0 end),
    [9] = Sum(case SaleMonth when 9 then AchievedValue else 0 end),
    [10] = Sum(case SaleMonth when 10 then AchievedValue else 0 end),
    [11] = Sum(case SaleMonth when 11 then AchievedValue else 0 end),
    [12] = Sum(case SaleMonth when 12 then AchievedValue else 0 end)
FROM
    (SELECT SalesManCode, AchievedValue, SaleMonth = Month(SaleDate) FROM MonthlySales) as temp
GROUP BY SalesManCode
Juliet
How can I achieve this in SQL Server 2000?
I've updated my code with a translation of PIVOT to something that works in SQL Server 2000.
Juliet
+1 - good explanation of implementing a PIVOT operation and the real world relevance. Nicely done.
KG