I have data in a MSSQL table (TableB) where [dbo].tableB.myColumn changes format after a certain date...

I'm doing a simple Join to that table..

Select [dbo].tableB.theColumnINeed from [dbo].tableA 
left outer join [dbo].tableB on [dbo].tableA.myColumn = [dbo].tableB.myColumn

However, I need to join, using different formatting, based on a date column in Table A ([dbo].tableA.myDateColumn).

Something like...

Select [dbo].tableB.theColumnINeed from [dbo].tableA 
left outer join [dbo].tableB on [dbo].tableA.myColumn = 
    IF [dbo].tableA.myDateColumn > '1/1/2009'
        BEGIN
            FormatColumnOneWay([dbo].tableB.myColumn)
        END
    ELSE
        BEGIN
            FormatColumnAnotherWay([dbo].tableB.myColumn)
        END

I'm wondering if there's a way to do this.. or a better way I'm not thinking of to approach this..

A: 

From the [dbo] prefix, I believe you're using SQL Server. While I don't have much experience with it, you can convert both fields to a specific date format:

select * from tableA
  Left Outer join tableB
       On CONVERT(CHAR(8), tableA.myColumn, 112) = CONVERT(CHAR(8), tableB.myColumn, 112)

The same should work on any DBMS, using the appropriate date formatting functions.

I don't know about SQL Server, but in Oracle you can create an index for the join expression.

Hosam Aly
+4  A: 
SELECT [dbo].tableB.theColumnINeed
FROM   [dbo].tableA 
LEFT OUTER JOIN [dbo].tableB
ON [dbo].tableA.myColumn = 
   CASE
    WHEN [dbo].tableA.myDateColumn > '1/1/2009' THEN FormatColumnOneWay([dbo].tableB.myColumn)
    ELSE FormatColumnAnotherWay([dbo].tableB.myColumn)
   END
Quassnoi
A: 

In SQL Server you'd use a CASE such as:

SELECT * 
FROM TableA
INNER JOIN TableB on TableA.Column=
CASE WHEN TableA.RecordDate>'1/2/08'
       THEN FormatColumn(TableB.Column) 
     ELSE FormatColumnOtherWay(TableB.Column)
END
JoshBerke
My suggestion would be to fix the data, because the optimizer will disregard indexes when those functions appear in the JOIN condition.
SQLMenace
Yes, but sometimes you can't fix the data ;-)
JoshBerke
It is the same column, so I would fix it and put a CHECK CONSTRAINT on it so that it won't happen again. Sooner or later someone will scream that performance is unacceptable, and then what?
SQLMenace
A: 

You know that this is bad for performance since you won't be able to use indexes, right?

You can use a CASE statement kludge, or you can go and fix the data so that you CAN use the index, and it will be many times faster.

SQLMenace
A: 

Well, you could use a subquery to properly format the data in either table before the join.

SELECT
  newB.columnINeed
FROM
  tableA AS A
LEFT OUTER JOIN (
  SELECT
    columnINeed
  , CASE WHEN myColumn > '1/1/2009' THEN FormatColumnOneWay(myColumn)
    ELSE FormatColumnAnotherWay(myColumn)
    END AS myColumn
  FROM
    tableB
) AS NewB ON A.myColumn = NewB.myColumn

If performance matters, you could maybe use an indexed view (based on the subquery) instead of hard-coding the subquery into the overall query.

alphadogg
You may not be able to do this. I notice you are formatting B on the basis of A. My guess is you can probably format B without involving A, then do the join?
alphadogg
A: 

I agree that a CASE syntax would be more appropriate for reading purposes, although I don't know whether there's any significant difference in running time.

The "right" thing to do, really, is to re-do it and do it right to start with. Your dates should be stored in datetime columns, and you probably have quite a lot to gain on migrating all your dates in tableB to a datetime column. You could do it this way, among others:

  1. Add a dummy column to TableB with type datetime.
  2. Run a query that takes the date value from the current column and puts it in the datetime column.
  3. Rename and delete columns to match the previous data structure.
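
The three steps above might look roughly like this in T-SQL. This is only a sketch under assumptions: myDateValue is a made-up name for the new column, and style 112 (yyyymmdd) stands in for whatever format the old values actually use. If the format changes partway through the data, one UPDATE per format (each with its own WHERE clause) would be needed.

```sql
-- Step 1: add a new datetime column (myDateValue is a hypothetical name).
ALTER TABLE dbo.tableB ADD myDateValue datetime NULL;
GO

-- Step 2: copy the old values across.
-- Style 112 (yyyymmdd) is an assumption; use the style that matches the stored format.
UPDATE dbo.tableB
SET    myDateValue = CONVERT(datetime, myColumn, 112);
GO

-- Step 3: drop the old column and give the new one its name.
ALTER TABLE dbo.tableB DROP COLUMN myColumn;
EXEC sp_rename 'dbo.tableB.myDateValue', 'myColumn', 'COLUMN';
```

GO is the batch separator used by SSMS and sqlcmd; the UPDATE has to run in a later batch than the ALTER TABLE so that the new column is visible when the statement is compiled.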
Tomas Lycken
You forgot step 4: Spend weeks or months hunting down all the errors caused in other code/reports from deleting a column
Telos
Do it in a view. Why don't people use views more? Direct table access is evil...
alphadogg
Well, datetime values not stored in datetime columns are also evil. Depending on how large the application using the database is, there might be a lot of problems - yes, but if you've used a good separation of concerns etc you won't have a lot of places to change. Why spend time hacking smelly code?
Tomas Lycken
A: 

Ok, hold up. What is the actual data type of the column? I'm guessing it isn't DateTime, because you don't really control the formatting... it just stores a date. Can it be CAST or CONVERTed to a DateTime though?

So you might want

left outer join tableb on tableA.myColumn = CAST(tableb.MyColumn as DateTime)

That way you're not matching up a string, but the actual date, which should be more reliable. It's also simpler and easier to read. The real question is why the date isn't stored as a DateTime in the first place...

Telos
+1  A: 

Rather than having a CASE statement in the JOIN, which will prevent the query using indexes, you could consider using a UNION

SELECT [dbo].tableB.theColumnINeed 
FROM   [dbo].tableA 
    LEFT OUTER JOIN [dbo].tableB 
         ON [dbo].tableA.myColumn = FormatColumnOneWay([dbo].tableB.myColumn)
-- Date filter goes in WHERE: in the ON clause, the LEFT JOIN would still return every tableA row
WHERE  [dbo].tableA.myDateColumn > '1/1/2009'
UNION ALL
SELECT [dbo].tableB.theColumnINeed 
FROM   [dbo].tableA 
    LEFT OUTER JOIN [dbo].tableB 
         ON [dbo].tableA.myColumn = FormatColumnAnotherWay([dbo].tableB.myColumn)
WHERE  [dbo].tableA.myDateColumn <= '1/1/2009'

But if FormatColumnOneWay / FormatColumnAnotherWay are functions or field expressions, that is probably going to exclude the use of indexes on [myColumn], although any index on myDateColumn should still be used.

However, it might help to understand what the FormatColumnOneWay / FormatColumnAnotherWay logic is, as knowing that may enable a better optimisation.

Couple of things to note:

UNION ALL will not remove any duplicates (unlike UNION). Because the two sub-queries are mutually exclusive this is OK and saves the SORT step which UNION would make to enable it to remove duplicates.

You should not use the '1/1/2009' style for string dates; you should use the 'yyyymmdd' style without any slashes or hyphens (you can also use CONVERT with a style parameter to explicitly indicate that the string is in d/m/y or m/d/y style).
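
To make the point about date strings concrete, here is a small sketch. The literal values are just examples; 101 and 103 are the CONVERT style numbers for m/d/y and d/m/y with a four-digit year.

```sql
-- Unseparated yyyymmdd is read the same way whatever the language or DATEFORMAT setting.
SELECT CAST('20090101' AS datetime);

-- With separators, state the style explicitly: 101 = mm/dd/yyyy, 103 = dd/mm/yyyy.
SELECT CONVERT(datetime, '01/02/2009', 101);  -- 2 January 2009
SELECT CONVERT(datetime, '01/02/2009', 103);  -- 1 February 2009
```

The same string gives two different dates depending on the style, which is why a bare '1/1/2009' style literal is best avoided in a comparison.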

Kristen
good feedback.. thanks.
madcolor