I'm on SQL Server 2005 and need a migration script that we can run when we are ready to deploy the new app. We have a lot of old data that must be moved to new tables in the new app.
One such set of data is sampling hours the techs have entered, which must be converted to seconds in the new app.
Sounds easy enough, but brace yourself... the old app had no validation. The new column will be bigint, but the old column was varchar. Check out the different kinds of data I've encountered for hours:
24
24:00
22:57
24 HR
24hrs
24 hours
22.3 Hrs
24 hr's
n/a
3.09
19
86394
86400 Sec
24:00 valid:19:07
24 hrs / valid=13:44
15:8 (valid=15:07)
OK, so take a deep breath; it's actually not too bad. I've already done most of the hard work (identifying the various patterns the users have been using). This is what I have so far:
I created a function for the repetitive parsing of the HH, SS, or HH:MM formats of sampling duration.
CREATE FUNCTION HoursToSecond(@input nvarchar(5))
RETURNS bigint
AS
BEGIN
    DECLARE @return bigint
    SELECT @return = CASE
        WHEN ISNUMERIC(@input)=1
            THEN CASE
                WHEN CAST(@input AS decimal(9,3))<100
                    THEN CAST(@input AS decimal(9,3))*3600 --convert hrs to secs
                ELSE CAST(@input AS bigint) --already in seconds
            END
        WHEN CHARINDEX(':',@input)>0
            THEN CAST(LEFT(@input,CHARINDEX(':',@input)-1) AS int)*3600 +
                 CAST(SUBSTRING(@input,CHARINDEX(':',@input)+1,2) AS int)*60
        ELSE NULL
    END
    RETURN @return
END
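As a quick sanity check, the function above can be called directly with a few of the sample values. This is only a sketch: it assumes the function was created in the dbo schema, and the column aliases are made up for illustration. The expected results in the comments follow from the function's own rules (numeric values under 100 are treated as hours, larger ones as seconds).

```sql
-- Assumes dbo.HoursToSecond exists as defined above.
SELECT dbo.HoursToSecond('24')    AS plain_hours,     -- 24 * 3600           = 86400
       dbo.HoursToSecond('22:57') AS hours_minutes,   -- 22 * 3600 + 57 * 60 = 82620
       dbo.HoursToSecond('22.3')  AS decimal_hours,   -- 22.3 * 3600         = 80280
       dbo.HoursToSecond('86394') AS already_seconds, -- >= 100, kept as is  = 86394
       dbo.HoursToSecond('n/a')   AS unparseable      -- no pattern matches  = NULL
```

Note that the parameter is nvarchar(5), so any longer input is silently truncated to its first five characters before parsing.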
Then I switch based on the patterns I see in the data.
INSERT INTO NewDatasheets (sample_time)
SELECT
    CASE
        WHEN ISNUMERIC(samplingtime)=1 THEN dbo.HoursToSecond(samplingtime)
        WHEN samplingtime LIKE '% %' THEN dbo.HoursToSecond(LEFT(samplingtime, CHARINDEX(' ',samplingtime)-1))
        WHEN samplingtime LIKE '%h%' THEN dbo.HoursToSecond(LEFT(samplingtime, CHARINDEX('h',samplingtime)-1))
        WHEN samplingtime LIKE '%:%' THEN dbo.HoursToSecond(samplingtime)
        ELSE NULL
    END
FROM OldDatasheets
It's an ugly script, yes, and I didn't even try to parse the hours after "valid". But it'll do 90% of the work, and I can query for the edge cases and clean those up by hand... though I'd like to avoid any manual work.
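For reference, the edge-case query I have in mind would look roughly like this. It is a sketch that reuses the same CASE expression as the INSERT and lists the distinct raw values that still come out as NULL; the row_count alias is just for illustration.

```sql
-- Sketch: distinct raw values the migration CASE maps to NULL, most frequent first.
-- Assumes OldDatasheets.samplingtime and dbo.HoursToSecond as shown above.
SELECT samplingtime, COUNT(*) AS row_count
FROM OldDatasheets
WHERE samplingtime IS NOT NULL
  AND CASE
          WHEN ISNUMERIC(samplingtime)=1 THEN dbo.HoursToSecond(samplingtime)
          WHEN samplingtime LIKE '% %' THEN dbo.HoursToSecond(LEFT(samplingtime, CHARINDEX(' ',samplingtime)-1))
          WHEN samplingtime LIKE '%h%' THEN dbo.HoursToSecond(LEFT(samplingtime, CHARINDEX('h',samplingtime)-1))
          WHEN samplingtime LIKE '%:%' THEN dbo.HoursToSecond(samplingtime)
          ELSE NULL
      END IS NULL
GROUP BY samplingtime
ORDER BY row_count DESC
```

Rows such as "24:00 valid:19:07" would not show up here, because the leading "24:00" converts successfully; finding those would need a separate filter such as samplingtime LIKE '%valid%'.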
I was wondering if anyone has a better solution, perhaps one with fewer lines of code or one that avoids creating a function.