views:

167

answers:

9

I'm supporting an existing application written by another developer and I have a question as to whether the choices the data type the developer chose to store dates is affecting the performance of certain queries.

Relevant information: The application makes heavy use of a "Business Date" field in one of our tables. The data type for this business date is nvarchar(10) rather than a datetime data type. The format of the dates is "MM/DD/YYYY", so Christmas 2007 is stored as "12/25/2007".

Long story short, we have some heavy duty queries that run once a week and are taking a very long time to execute.

I'm re-writing this application from the ground up, but since I'm looking at this, I want to know if there is a performance difference between using the datetime data type compared to storing dates as they are in the current database.

+6  A: 

You will both save disk-space and increase performance if you use datetime instead of nvarchar(10).

If you use the date-fields to do date-calculation (DATEADD etc) you will see a massive increase in query-execution-speed, because the fields do not need to be converted to datetime at runtime.

Espo
+3  A: 

Operations over DATETIMEs are faster than over VARCHARs converted to DATETIMEs.

If your dates appear anywhere but in SELECT clause (like, you add them, DATEDIFF them, search for them in WHERE clause etc), then you should keep them in internal format.

Quassnoi
A: 

There are a lot of reasons you should actually use DateTime rather than a varchar to store a date. Performance is one... but i would be concerned about queries like this:

SELECT *
FROM Table
WHERE DateField > '12/25/2007'

giving you the wrong results.

bobwienholt
A: 

I cannot back this up with numbers, but the datetime-type should be a lot faster, since it can easily be compared, unlike the varchar. In my opinion, it is also worth a shot to look into UNIX timestamps as your data type.

Ulf
Huh? Unix Timestamps? The database supports what the database supports. Is there a specific database that has something called "UNIX timestamps"?
A: 

Yes. datetime will be far more efficient for date calculations than varchar or nvarchar (why nvarchar - there's no way you've got real unicode in there, right?). Plus strings can be invalid and misinterpreted.

If you are only using the date part, your system may have a smaller date-only version of datetime.

In addition, if you are just doing joins and certain types of operations (>/</= comparisions but not datediff), a date "id" column which is actually an int of the form yyyymmdd is commonly used in datawarehouses. This does allow "invalid" dates, unfortunately, but it also allows more obvious reserved, "special", dates, whereas in datetime, you might use NULL of 1/1/1900 or something. Integrity is usually enforced through a foerign key constraint to a date "dimension."

Seeing that you tagged the question as "sql server", I'm assuming you are using some version of SQL Server, so I recommend that you look at either using datetime or smalldatetime. In addition, in SQL Server 2008, you have a date type as well as a datetime2 with a much larger range. Check out this link which gives some details

Cade Roux
A: 

Hi,

I believe from an architectural perspective a Datetime would be a more efficient data type as it would be stored as a two 4-byte integers, whereas your nvarchar(10) will be stored as up to 22 bytes (two times the number of characters entered + 2 bytes.). Therefore potentially more than double the amount of storage space is required now in comparison to using a Datetime.

This of course has possible implications for indexing, as the smaller the data item, the more records you can fit on an index data page. This in turn produces a smaller index which is of course quicker to traverse and therefore will return results faster.

In summary, Datetime is the way to go.

John Sansom
A: 

Chances are the datetime type is both more compact and faster, but more importantly using DATETIMES to store a date and time is a better architecture choice. You're less likely to run into weird problems looking for records between a certain date range and most database libraries will map them to your languages Date type, so the code is much cleaner, which is really much more important in the long run.

Even if it were slower, you'd spend more time debugging the strings-as-dates than all your users will ever see in savings combined.

Cameron Pope
really? you think the choice of MSSS or MySql or Oracle would have the biggest impact on the choice? IMNSHO, there's not a practical RDBMS alive today that would be better off with dates as chars.
I don't disagree with you at all - I've edited my answer to get rid of the confusing first sentence which is what I think was tripping you up.
Cameron Pope
A: 

The date filtering in the nvarchar field is not easy possible, as the data in the index is sorted lexicographically which doesn't match the sorting you would expect for the date. It's the problem with the date format "mm/dd/yyyy". That means "12/25/2007" will be after "12/01/2008" in a nvarchar index, but that's not what you want. "yyyy/mm/dd" would have been fine.

So, you should use a date field and convert the string values to date. You will surely get a big performance boost. That's if you can change the table schema.

MicSim
A: 

One other problem with using varchar (or any other string datatype) is that the data likely contains invalid dates as they are not automatically validated on entry. If you try to chang e the filed to a datetime field, you amay have conversion problems wher people have added dates such as ASAP, Unknown, 1/32/2009, etc. You willneed to check for dates that won't convert using the handy isdate function and either fix or null them out before you try to chnge the data type.

Likely you also have a lot of code that converts the varchar type to date datatype on the fly so that you can do date math as well. All that code will also need to be fixed.

HLGEM