Hi,

I have imported a time series with dates of the following format:

 test = c("11-Feb-01","12-Feb-01","01-Mai-08")

This yields:

> as.Date(test, "%d-%b-%y")
[1] NA           NA           "2008-05-01"

Since "Mai" (May) was parsed, the locale is obviously taken into account.

According to the docs, %b should match the abbreviated month name, but I guess there might be some issue there.

How would I go about fixing this?

I'm running R under Linux t2.6.27-9-generic #1 SMP


Update: Digging a bit deeper, I find that the issue is in the LC_TIME definition, where the expected abbreviations are of the form:

"jan.","feb.","mars", "apr", "mai", "juni", "juli", "aug.","sep.","okt.","nov.", "des."

while my data contains:

"Jan", "Feb", "Mar", "Apr", "Mai", "Jun", "Jul", "Aug", "Sep", "Okt", "Nov", "Des"
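
To see the mismatch on a given machine, one option is to ask R which abbreviations the active locale produces for %b and compare them with the spellings in the data. This is only a sketch: locale_abbr and data_abbr are names made up for illustration, and the printed result depends on the LC_TIME setting of the session.

```r
# Show which LC_TIME locale the session is using.
Sys.getlocale("LC_TIME")

# Format one date per month with %b to list the abbreviations
# the current locale uses for that conversion.
locale_abbr <- format(ISOdate(2000, 1:12, 1), "%b")
print(locale_abbr)

# Abbreviations as they appear in the data (copied from the question).
data_abbr <- c("Jan", "Feb", "Mar", "Apr", "Mai", "Jun",
               "Jul", "Aug", "Sep", "Okt", "Nov", "Des")

# Months whose spelling differs from the locale's, ignoring case.
data_abbr[tolower(data_abbr) != tolower(locale_abbr)]
```

Any month listed by the last line is one that as.Date() with %b is likely to turn into NA under that locale.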

I guess I could consider pre-processing the data, but a smooth way of doing this in R would be most welcome.


This sort of works, but it is not very elegant:

> as.Date(gsub("Feb","feb.",test), "%d-%b-%y")
[1] "2001-02-11" "2001-02-12" "2008-05-01"

Thanks!

+1  A: 

The closest thing I've found to a solution is to make several passes over the data, replacing each month name with something that can be parsed.

I'm not sure if this is the best solution.

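setwd("/home/tovare/Data")

# Semicolon-separated file: skip the two lines above the header row
# and treat both "NA" and "ND" as missing values.
v <- read.csv2("valuta_dag.sdv", 
  na.strings = c("NA","ND"), 
  header = TRUE, sep=";", skip=2)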

v$Dato <- gsub("Jan","01",v$Dato)
v$Dato <- gsub("Feb","02",v$Dato)
v$Dato <- gsub("Mar","03",v$Dato)
v$Dato <- gsub("Apr","04",v$Dato)
v$Dato <- gsub("Mai","05",v$Dato)
v$Dato <- gsub("Jun","06",v$Dato)
v$Dato <- gsub("Jul","07",v$Dato)
v$Dato <- gsub("Aug","08",v$Dato)
v$Dato <- gsub("Sep","09",v$Dato)
v$Dato <- gsub("Okt","10",v$Dato)
v$Dato <- gsub("Nov","11",v$Dato)
v$Dato <- gsub("Des","12",v$Dato)

v$Dato <- as.Date(v$Dato,"%d-%m-%y")
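
The twelve gsub calls above could probably be folded into a loop over a vector of month names. The following is a minimal sketch of that idea, not a tested replacement: months_in_data and parse_dato are hypothetical names, and it assumes every date has the dd-Mon-yy shape shown in the question.

```r
# Month abbreviations as they appear in the data, in calendar order.
months_in_data <- c("Jan", "Feb", "Mar", "Apr", "Mai", "Jun",
                    "Jul", "Aug", "Sep", "Okt", "Nov", "Des")

# Hypothetical helper: swap each abbreviation for its two-digit month
# number, then parse with a purely numeric format that does not
# depend on the LC_TIME locale.
parse_dato <- function(x) {
  x <- as.character(x)
  for (i in seq_along(months_in_data)) {
    x <- sub(months_in_data[i], sprintf("%02d", i), x, fixed = TRUE)
  }
  as.Date(x, format = "%d-%m-%y")
}

parse_dato(c("11-Feb-01", "12-Feb-01", "01-Mai-08"))
# [1] "2001-02-11" "2001-02-12" "2008-05-01"
```

With the data frame above, this would be called as v$Dato <- parse_dato(v$Dato).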
tovare
I think you are spot on. You can either change the LC_TIME definitions or change your data. I am a big fan of not messing with LC_TIME unless there is no other workaround. You have an alternative to tampering with LC_TIME, so I would use it.
JD Long