views:

549

answers:

3

Problem: I have a large database table (~500k records) which has a list of dates stored in a varchar2(15) column. These dates are stored in varying formats, ie. some are yyyy-mm-dd, some are mm/dd/yyyy, some are dd/mm/yy, some are mm/dd/yy, etc. Ie:

1994-01-13
01/13/1994
01/13/94
13/01/94
13/01/1994
etc

I need to be able to shift these dates slightly, for example to add 30 days to each date. (This is an oversimplification of my objective but it's easier to explain this way).

If all the dates were formatted consistently, I would achieve this as follows:

UPDATE history_table
    SET some_date_col = 
        to_char(to_date(some_date_col, 'mm/dd/yyyy')+30, 'mm/dd/yyyy') 
    WHERE some_date_col IS NOT NULL;

Due to the size of the database, I cannot afford to loop through the values one by one and parse the date value. Can anyone suggest a means to accomplish this without loops, ie with a mass UPDATE statement?

+2  A: 

IMHO, you have a bigger problem:

If some dates are dd/mm/yyyy and some are mm/dd/yyyy how can you difference which format applies for certain date?

for example, how can I know if a value "12/09/2008" means December or September?

tekBlues
Use your nose, good devs can smell the difference.
tuinstoel
+4  A: 

Are the formats of these dates really that important? They should be datetime columns. Then you could just use date math functions on that field.

Mark Canlas
I agree with you however I didn't create the database and I don't get to change it! I have to live with the structure as-is. :(
JJ
@JJ Why don't you get to change it?
tuinstoel
Well that's not relevant to my question, however I am supporting the backend db for an enterprise application. Since I can't change the application to accomodate any db schema changes, I have to leave the db structure as-is. I can only manipulate the data, not the structure.
JJ
You can add an extra column or you can create a view with the same name as the table and build some instead of triggers. The application doesn't have to know.
tuinstoel
This is a mess plain and simple, and if you can have DD/MM/YYYY and MM/DD/YYYY your data is effectively corrupted. The only answer I can offer for this is to use dates for dates - never store a date as a string. Apart from your problem, doing efficient date range scans etc become impossible if you don't have dates as dates.If I had this mess to deal with, I would to_date each string, use the date arithmetic functions on it, and then to_char them back again, catching exceptions as you go.
Stephen ODonnell
+3  A: 

well, you've got a real problem here. 07/07/1994 is valid for 'MM/DD/YYYY' and 'DD/MM/YYYY' However, outside of that issue, you can try nesting decodes. I entered the following dates into a varchar field:

01/12/2009, 01-12-2009, 2009-01-12, 01/12/09

and using the below, I was consistently returned 1/12/2009. You'll have to figure out all the patterns possible and keep nesting decodes. The other thing you could do is create a function to handle this. Within the function, you can check with a little more detail as to the format of the date. It will also be easier to read. You can use the function in your update statement so that should be faster than looping through, as you mentioned. (for what its worth, looping through 500k rows like this shouldn't take very long. I regularly have to update row by row tables of 12 million records)

select mydate, decode(instr(mydate,'-'),5,to_date(mydate,'YYYY-MM-DD'),3,to_date(mydate,'MM-DD-YYYY'), decode (length(mydate),8,to_date(mydate,'MM/DD/YY'),10,to_date(mydate,'MM/DD/YYYY'))) from mydates;

and here is the update statement:

update mydates set revdate = decode(instr(mydate,'-'),5,to_date(mydate,'YYYY-MM-DD'),3,to_date(mydate,'MM-DD-YYYY'), decode (length(mydate),8,to_date(mydate,'MM/DD/YY'),10,to_date(mydate,'MM/DD/YYYY')))

moleboy
This looks like the best option for me, I'll give it a try, thanks!
JJ
good luck. Let me know if you need anything else. I spent years doing data conversions on crap just like this.
moleboy