I have a data file with this format:

Weight Industry Type
251,787 Kellogg h
253,9601 Kellogg a
256,0758 Kellogg h
....

I read the data and try to draw a histogram with these commands:

 ce <- read.table("file.txt", header = TRUE)

 we  <- ce[, 1]
 ind <- ce[, 2]   # `in` is a reserved word in R, so it cannot be a variable name
 ty  <- ce[, 3]

 hist(we)

But I get this error: Error in hist.default(we) : 'x' must be numeric.
What do I need to do in order to draw histograms for my three variables?

+4  A: 

Because of the thousand separator, the data will have been read as 'non-numeric'. So you need to convert it:

 we <- gsub(",", "", we)   # remove comma
 we <- as.numeric(we)      # turn into numbers

and now you can do

 hist(we)

and other numeric operations.

Dirk Eddelbuettel
A correction: the comma here is not a thousands separator. It is the decimal mark, which in some countries is a comma rather than a point. So it needs to be replaced with a point, not removed.
momobo
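A minimal sketch of the replacement described in the comment above, assuming `we` holds values like the ones in the question's sample (the name `we_num` is only illustrative):

```r
# Sample values copied from the question; in practice `we` would come from ce[, 1]
we <- c("251,787", "253,9601", "256,0758")

# as.character() guards against `we` having been read as a factor.
# Replace the decimal comma with a point, then convert to numeric.
we_num <- as.numeric(sub(",", ".", as.character(we), fixed = TRUE))

hist(we_num)
```

Removing the comma instead would turn 251,787 into 251787, which converts without error but gives the wrong values.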
Yes, I replaced the comma with a point and everything worked.
José Joel.
There is an argument `dec` to `read.table`, `read.csv`, ... (for example `dec=","`) that allows you to set the decimal mark at the R level when the file is read.
Dirk Eddelbuettel
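A short sketch of the read-time approach, assuming the file looks like the sample in the question. The rows are inlined through the `text` argument only so the example runs without `file.txt`; with the real file you would pass the file name instead.

```r
# Sample rows copied from the question
txt <- "Weight Industry Type
251,787 Kellogg h
253,9601 Kellogg a
256,0758 Kellogg h"

# dec = "," tells read.table that the comma is the decimal mark
ce <- read.table(text = txt, header = TRUE, dec = ",")

str(ce$Weight)    # Weight is now read as a numeric column
hist(ce$Weight)
```

Industry and Type are categorical, so `hist` does not apply to them; something like `barplot(table(ce$Type))` is one way to plot their counts.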