tags:

views:

351

answers:

6

I am currently writing a Delphi application that runs queries over a DB2 database using ADO.

One of the requirements is that the user should be able to define queries using dates, for example "show me all data from the last 60 days" or "show me all data between November 20th 2009 and January 18th 2010". This wouldn't be an issue, except for two facts:

  1. Dates are stored in the database using different fields for day, month and year.
  2. The databases have been in use at several customer sites for a number of years and cannot be changed, and the application therefore has to run over the existing databases (so no changing the database to store dates in one field, which would make the task much simpler).

What I need to know is whether there is an efficient algorithm for constructing the SQL required to extract the specified information from the database. For example, today is January 18th, so to extract all information from November 20th to today, I would need an SQL statement that goes something like this:

SELECT data WHERE 
((day >= 20) AND (month = 11) AND (year = 2009)) OR 
((month = 12) AND (year = 2009)) OR 
((day <= 18) AND (month = 1) AND (year = 2010))

Obviously this is a trivial and relatively straightforward example, but if the user wanted to retrieve data from November 2008 instead of 2009, the query would get a lot bigger.

Is this the only way that I can construct the SQL statement, or is there a more efficient way to do this?

+1  A: 

You could construct it to covert the year/month/day into a date data type, but that would preclude the use of indexes, so performance might be a problem.

Does db2 have calculated columns on which you could index and would you be able to add that? Or maybe a view which can have indexed/persisted calculated columns?

If you have indexes on the year/month/day though, the code that you have would probably be best for performance.

Tom H.
+6  A: 

In DB2, the following should convert your separate columns into a date type: date(year||month||day)

If you have two dates, you can use subtraction to receive the number of days between the two, so your where clause can include

date(year||month||day) > CURRENT_DATE - 60

Galghamon
How would I go about including this in a query though? Say for example that I wanted to retrieve all data from date = 2009-12-31 to today. How does date() fit into the query string? Am I right in thinking that date() takes three integer values as parameters (year, month, day)? Where do I reference the fields associated with the date? Some of the database tables contain more than one piece of date information so I obviously need to be querying the correct fields.
Jeedee
Date() takes a string - I am concatenating your existing three fields using the double-pipe || operator. You should be able to literally copy and paste my statement into your where clause to get the last 60 days. if you want everything from the end of last year (including year end) then you could use "WHERE date(year||month|date) >= date('20091231')". I am assuming your single-digit months and days are stored with a leading zero, so you may have to format those columns and left pad each with '0' to get to two digits. For comparisons, you could drop the date(), since alpha sort will do.
Galghamon
I tried using the DATE() function and it doesn't work. I get an error ("Argument *N of function CONCAT not valid"), so obviously specifying the fields in the DATE() function, concatenated together, doesn't work (unless I've totally misunderstood where it is that I'm supposed to be referencing the fields that are to be queried). Not to worry - I'll just go with the accepted answer. I thought your response was worth a try though.
Jeedee
That's because your values are numbers, not strings and concatenation doesn't work on numbers in DB2, it seems (i normally work with Oracle). You could use the DIGITS() function, provided the precision of your numbers is 4 for the year, and 2 for the month and date fields. DIGITS will left-pad the number with 0 as required up to the precision of the field. So the complete example from my original answer would be DATE(DIGITS(year)||DIGITS(month)||DIGITS(day)) > CURRENT_DATE -60. If the precision is larger, you'll have to do SUBSTR(DIGITS(year),1,4) to trim it down (for each field).
Galghamon
A: 

In MSSQL, I would try to write a function to return the number of days since a your data's date. In Firebird, I would do the same thing with a Delphi user defined function or at least a selectable stored procedure. I don't know if it is possible to extend the DB2 SQL with either of those two options, but if you could wrap the logic up in something like that, it would make your job much easier. I see mention of a DB2 SQL UDF in the Related posts on the right side of the page. I hope this is helpful.

This has to be the most convoluted option. You want to write a function that will have to take into account the varying number of days per month, leap vs. non-leap years, and the exceptions to the leap-year rule?
Craig Young
I agree with you Craig. I was not suggesting going down that path either. I am not familiar with DB2 SQL so I did not know that there was a function that would make it as easy as what Galghamon suggested. I was afraid it might be more difficult to construct a date from a day, month and year and putting a bunch of code inside a function or udf would make it much easier to use.
A: 

since you want to compare to a date you should convert the non-date values to date first. That enables you to compare date against date and from there it's easy

Joe Meyer
Same as Galghamon's answer; but the problem is that you lose the benefit of indexes.
Craig Young
+1  A: 

The answer provided by Galghamon http://stackoverflow.com/questions/2087589/sql-date-interval-algorithm/2087712#2087712 is efficient in terms of code, but won't be able to leverage indexes.

I suggest you perform all necessary calculations in Delphi so that your logic can be conceptually reduced to one of the following:

(DateCol > @DateVal)
(DateCol >= @DateVal)
(DateCol < @DateVal)
(DateCol <= @DateVal)
(DateCol = @DateVal)

These can then be expanded for your rather unfortunate schema as follows (using 1st 2 examples):

(YearCol > @YearVal OR (YearCol = @YearVal AND MonthCol > @MonthVal) OR (YearCol = @YearVal AND MonthCol = @MonthVal AND DayCol > @DayVal))
(YearCol > @YearVal OR (YearCol = @YearVal AND MonthCol > @MonthVal) OR (YearCol = @YearVal AND MonthCol = @MonthVal AND DayCol >= @DayVal))

NOTE: You need to be precise about where to use inclusive and where to use exclusive inequality operators.

Craig Young
A: 

Regarding your remarks of the database changes. Consider adding a single column for the date. This column will only be written by a trigger and when at least one of the columns day, month, and year are updated. You have then the freedom of indexing this column and using it.

You won't impact any existing software and you have the convenience of the date column. Remember, the code you write today, need to be readable in 6 months by another developer.

So if you still can not make the db change consider Galghamon's solution, since it is easy to understand (low maintenance costs). Only go with Craig Young's solution if the performance impacts force you to do it.

Peter Schuetze
The database is on the server and all of the server side software (as well as database maintenance) is carried out by another team. The remit of my project is to be able to run a query from a PC over the existing database/program structure on the server, so unfortunately I can't even add triggers etc.My first thought was of course to add a single date column but my hands are tied on this one, hence my question.
Jeedee