Hi,

I am trying to specify the colClasses option of the read.csv function in R. In my data, the first column, "time", is a character vector, while the rest of the columns are numeric.

data<-read.csv("test.csv" , comment.char="" , colClasses=c(time="character","numeric") , strip.white=FALSE)

In the above command, I want R to read the "time" column as "character" and the rest as numeric. Although the "data" variable held the correct result after the command completed, R returned the following warnings. How can I fix these warnings?

   Warning messages:
    1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
      not all columns named in 'colClasses' exist
    2: In tmp[i[i > 0L]] <- colClasses :
      number of items to replace is not a multiple of replacement length

Thanks in advance

Derek

+2  A: 

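The colClasses vector should have one entry per imported column; the warnings appear because your vector mixes a named entry with an unnamed one, so the names cannot all be matched to columns. Supposing there are 5 remaining columns: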

colClasses=c("character",rep("numeric",5))
gd047
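As a rough sketch of the two warning-free ways to write this, the snippet below builds a small stand-in file (the file contents and the column names a to e are made up for illustration, they are not the asker's data) and reads it once with an unnamed, full-length colClasses vector and once with a fully named vector that pins only 'time':

```r
# Hypothetical stand-in for test.csv: one character column plus five numeric ones.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("time,a,b,c,d,e",
             "00:01,1,2,3,4,5",
             "00:02,6,7,8,9,10"), csv_path)

# Unnamed vector: one class per column, in column order.
by_position <- read.csv(csv_path, comment.char = "",
                        colClasses = c("character", rep("numeric", 5)))

# Fully named vector: only 'time' is pinned; the other columns are left
# for read.csv to detect on its own.
by_name <- read.csv(csv_path, comment.char = "",
                    colClasses = c(time = "character"))

# Inspect the resulting column classes.
print(sapply(by_position, class))
print(sapply(by_name, class))
```

The named form is convenient when only one or two columns need a fixed class, since every element carries a name and nothing depends on the column count.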
One can probably use the following to read the first line of the CSV and determine how many columns there are: scan(csv, sep=',', what="character", nlines=1)
Derek
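The header-scanning idea in the comment above could be sketched roughly as follows, assuming the file has a header row and that every column after the first is numeric. The temporary file and its column names are hypothetical placeholders:

```r
# Hypothetical stand-in file with a header row.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("time,a,b,c", "00:01,1,2,3"), csv_path)

# Read only the header line to learn how many columns there are.
header <- scan(csv_path, what = "character", sep = ",", nlines = 1, quiet = TRUE)

# First column as character, every remaining column as numeric.
classes <- c("character", rep("numeric", length(header) - 1))
data <- read.csv(csv_path, comment.char = "", colClasses = classes)
```

This keeps the call independent of the exact column count, at the cost of opening the file twice.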
+2  A: 

Assuming your 'time' column has at least one observation with a non-numeric character and all your other columns contain only numbers, 'read.csv' will by default read 'time' as a 'factor' (in R versions before 4.0.0, where 'stringsAsFactors' defaults to TRUE) and all the remaining columns as 'numeric'. Setting 'stringsAsFactors=FALSE' therefore gives the same result as setting 'colClasses' manually, i.e.,

data <- read.csv('test.csv', stringsAsFactors=FALSE)
wkmor1
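To check what automatic detection actually produced, one option is to inspect the column classes after reading. The sketch below uses a made-up two-row file; note that a column holding only whole numbers is detected as 'integer' rather than 'numeric', which may matter if the exact storage type is important:

```r
# Hypothetical stand-in file: 'a' holds whole numbers, 'b' holds decimals.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("time,a,b", "00:01,1,2.5", "00:02,3,4.5"), csv_path)

# Let read.csv detect the types, keeping strings as character.
data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Collect the detected class of each column for inspection.
col_classes <- sapply(data, class)
print(col_classes)
```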