I have two "date" fields that I need to join on.

The first is a normal datetime in the format yyyy-mm-dd hh:mm:ss

The second is a varchar(8) in the red-headed step child format mmddyyyy

Now this gets painful because there is no easy way to convert to the corresponding type. There is a built-in format that is yyyymmdd but that doesn't match the varchar format.

There are two paths I can see:

declare @normal_date as datetime;
declare @hated_date as varchar(8);

set @normal_date='1974-11-01 00:00:00.000'
set @hated_date='11011974'

--cast to date time with string splits
select @normal_date
where CONVERT(datetime, RIGHT(@hated_date,4)+LEFT(@hated_date,2)+SUBSTRING(@hated_date,3,2))=@normal_date

--convert normal date to awkward format
select @normal_date
where REPLACE(CONVERT(varchar(10),@normal_date,101), '/','')=@hated_date

Which is better? Or is there a better way?

Edited to show costs

--Operator cost (39%)
CONVERT(datetime, RIGHT(@hated_date,4)+LEFT(@hated_date,2)+SUBSTRING(@hated_date,3,2))=@normal_date

--Operator cost (57%)
REPLACE(CONVERT(varchar(10),@normal_date,101), '/','')=@hated_date

--Operator cost (46%)
cast(stuff(stuff(@hated_date, 3,0, '/'),6,0,'/') as datetime)=@normal_date

--Operator cost (47%)
RIGHT(@hated_date, 4) + LEFT(@hated_date, 4)=@normal_date
+2  A: 

Try this:

select cast(stuff(stuff('11011974', 3,0, '/'),6,0,'/') as datetime)

Update

[image not preserved]

Denis Valeev
You might want to add `set dateformat mdy` before you run this query.
Denis Valeev
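The reason for the `set dateformat mdy` suggestion is that casting a string like '11/01/1974' depends on the session's date format. A minimal sketch of one alternative, assuming it is acceptable to swap `cast` for `CONVERT` with an explicit style (101 is mm/dd/yyyy), so the result does not depend on the session setting:

```sql
declare @hated_date as varchar(8);
set @hated_date = '11011974';

-- STUFF inserts the slashes: '11011974' -> '11/011974' -> '11/01/1974'.
-- Style 101 tells CONVERT to read the string as mm/dd/yyyy,
-- regardless of the session's DATEFORMAT.
select CONVERT(datetime, STUFF(STUFF(@hated_date, 3, 0, '/'), 6, 0, '/'), 101) as converted_date;
```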
That works, but it's the same performance as the others. I will say that the "hated" format is the one that will have fewer records.
Nix
@Nix How do you measure performance of different methods? Run on production?
Denis Valeev
I am using explain plans, and I am looking at both my script, similar to the above, and the actual "production" query plans. I have no doubt that there are some much-needed indices on these tables. I just want to minimize operational costs.
Nix
@Nix SQL Server doesn't have Explain plans. It has Execution plans.
HLGEM
The execution plan is a very poor way to compare performance of this kind of thing. You need to `SET STATISTICS IO ON` or use SQL Server Profiler to see actual CPU usage over a million rows each.
Emtucifor
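A rough sketch of what such a measurement could look like. The temp tables `#normal` and `#hated`, their column `d`, and the row count are all made up for illustration, and `SET STATISTICS TIME ON` is added here because it is the setting that reports CPU and elapsed time. Treat it as a starting point rather than a benchmark:

```sql
-- Hypothetical test tables; names and sizes are assumptions.
CREATE TABLE #normal (d datetime NOT NULL);
CREATE TABLE #hated (d varchar(8) NOT NULL);

-- 100,000 consecutive days starting at 1974-11-01.
INSERT INTO #normal (d)
SELECT TOP (100000)
       DATEADD(day, CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int), '19741101')
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

-- The same dates stored as mmddyyyy strings.
INSERT INTO #hated (d)
SELECT REPLACE(CONVERT(varchar(10), d, 101), '/', '') FROM #normal;

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: rearrange the varchar to yyyymmdd and convert it to datetime.
SELECT COUNT(*) FROM #normal n
JOIN #hated h ON n.d = CONVERT(datetime, RIGHT(h.d, 4) + LEFT(h.d, 4), 112);

-- Variant 2: format the datetime as mmddyyyy and compare strings.
SELECT COUNT(*) FROM #normal n
JOIN #hated h ON REPLACE(CONVERT(varchar(10), n.d, 101), '/', '') = h.d;

-- Variant 3: compare both sides as integers.
SELECT COUNT(*) FROM #normal n
JOIN #hated h ON MONTH(n.d) * 1000000 + DAY(n.d) * 10000 + YEAR(n.d) = CAST(h.d AS int);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

DROP TABLE #normal;
DROP TABLE #hated;
```

The Messages tab then shows logical reads and CPU time per statement, which can be compared across the variants.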
+5  A: 

This is yyyymmdd, no?

RIGHT(@hated_date, 4) + LEFT(@hated_date, 4)

So, your script becomes

declare @normal_date as datetime;
declare @hated_date as varchar(8);

set @normal_date='1974-11-01 00:00:00.000'
set @hated_date='11011974'

--SELECT @hated_date = RIGHT(@hated_date, 4) + LEFT(@hated_date, 4)

select 'hurrah' WHERE @normal_date = RIGHT(@hated_date, 4) + LEFT(@hated_date, 4)
gbn
@gbn I have tested the performance of your method.
Denis Valeev
@Denis Valeev: Did you test it over a table or just one line as above?
gbn
@gbn See my answer; you were supposed to read it. :)
Denis Valeev
@Denis Valeev: I had, before your update... he he
gbn
+1  A: 

Another approach is this:

MONTH(@normal_date)*1000000 + DAY(@normal_date)*10000 + YEAR(@normal_date)
=
CAST(@hated_date AS INT)

One more thing: it is more precise to compare real execution costs than to rely on the optimizer's estimates.
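As a quick sanity check of the arithmetic, this sketch evaluates both sides of the comparison for the sample values from the question. The column aliases are made up for illustration:

```sql
declare @normal_date as datetime;
declare @hated_date as varchar(8);

set @normal_date = '19741101';
set @hated_date = '11011974';

-- 11 * 1000000 + 1 * 10000 + 1974 = 11011974, the same digits as mmddyyyy.
select MONTH(@normal_date) * 1000000 + DAY(@normal_date) * 10000 + YEAR(@normal_date) as normal_as_int,
       CAST(@hated_date AS INT) as hated_as_int;
```

Both columns come back as 11011974. Note that the cast to INT drops a leading zero (for example '01051974' becomes 1051974), which still matches because the left-hand side is built from integers as well.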

AlexKuznetsov
@AlexKuznetsov Seems like your approach is the fastest according to my results. They are somewhat biased, of course, and this needs to be proven on Nix's production server to have the final say.
Denis Valeev
I wouldn't say "more precise" since the execution plan costs are nearly meaningless in a case like this. I'd say "the only way".
Emtucifor
+2  A: 

Suggest you either fix the column to be datetime or add a datetime column to the table and convert the data, so that you only have to do this conversion once when the data is entered (and once, of course, for existing data). This could probably even be a calculated column. This is NOT something you want to be doing in select statements. If the table can't be changed, create a date conversion table with every possible date in both formats and join to it.

You might also want to check to make sure there are no invalid dates in there which is always a possibility with storing dates in a data type other than a datetime one.
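A minimal sketch of the calculated-column idea, assuming a hypothetical table `#legacy` with columns `id` and `hated_date` (none of these names come from the question). Style 112 (yyyymmdd) is used so the conversion is deterministic, which a persisted computed column requires:

```sql
-- Hypothetical stand-in for the real table; all names are assumptions.
CREATE TABLE #legacy
(
    id         int        NOT NULL,
    hated_date varchar(8) NOT NULL,
    -- Rearranged to yyyymmdd and converted once, when the row is written.
    real_date  AS CONVERT(datetime, RIGHT(hated_date, 4) + LEFT(hated_date, 4), 112) PERSISTED
);

INSERT INTO #legacy (id, hated_date) VALUES (1, '11011974');

declare @normal_date as datetime;
set @normal_date = '19741101';

-- The comparison is now datetime to datetime, with no per-query string work.
SELECT id FROM #legacy WHERE real_date = @normal_date;

-- To audit existing data for values that would not convert (e.g. '02301974'):
-- SELECT id, hated_date FROM #legacy
-- WHERE ISDATE(RIGHT(hated_date, 4) + LEFT(hated_date, 4)) = 0;

DROP TABLE #legacy;
```

With a persisted column, inserting an invalid value fails with a conversion error, so existing rows would need the audit query first. The column can also be indexed like an ordinary datetime column.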

HLGEM
`If necessary create a dateconversion table with every possible date in both formats and join to it if the table can't be changed.` Are you sure that would be faster than a simple string manipulation thingy?
Denis Valeev
If it is indexed I would expect it to be faster, but you would have to test. It could depend on the number of records involved. Conversions are generally slow. But fixing the database structure is the best choice of all. It makes it easy to do the comparisons, it makes it impossible to put in data that is not a date, and it will make it easier to do other queries where you need to do date math as well.
HLGEM
Can't change the tables. I would love to... not sure why you would go with a "varchar" date that is in a bad format.
Nix