Hello,

I have a problem using data from a tab-delimited data file imported with read.delim.

Most of the columns contain numerical data on which I need to run t.test. Unfortunately I always get this error:

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) 
            stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA

I noticed that this only happens with vectors that consist of different levels. It won't even perform simple numerical operations like vector[1] + vector[2] for leveled vectors. Vectors without levels work fine, though.

How can I use the data in the leveled vectors for calculation?

Thank you

A: 

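It is possible that some of your data are not in numeric format after loading. Check the structure of the data with str(your.data). If your desired variables are not numeric, you can convert them with data$var1 <- as.numeric(data$var1). If a variable turns out to be a factor, go through as.character first, i.e. as.numeric(as.character(data$var1)), because as.numeric alone returns the internal level codes rather than the values.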

ilya
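A minimal sketch of the check and conversion described above. The file contents, the column names `var1` and `var2`, and the stray "n/a" entry are made up for illustration; `stringsAsFactors = TRUE` is passed explicitly to mimic the default of R versions before 4.0.0.

```r
# Hypothetical data: var1 contains a stray "n/a", so it cannot be read as numeric.
path <- tempfile(fileext = ".txt")
writeLines(c("var1\tvar2", "1.5\t10", "n/a\t20", "3.5\t30"), path)

# Read the file the way older R versions did by default.
dat <- read.delim(path, stringsAsFactors = TRUE)
str(dat)  # var1 is reported as a Factor, var2 as int

# Convert via character; entries that are not numbers become NA (with a warning).
dat$var1 <- as.numeric(as.character(dat$var1))
str(dat)  # var1 is now num
```

Any entry that cannot be parsed ends up as NA, so it is worth looking at how many NAs the conversion introduced before running a test on the column.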
Yeah - "leveled vectors" sounds a lot like a factor to me. On read.delim, you can set the stringsAsFactors argument to FALSE, which will prevent this from happening. There's also a global stringsAsFactors option (if you look at `?read.delim`, you can see the defaults check this option). I highly recommend it.
Matt Parker
See also the R FAQ on turning factors to numeric (naked `as.numeric` doesn't work). http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f
Richie Cotton
+1  A: 

I have been able to reproduce your error message with the following small example:

x = as.factor(1:5)
y = as.factor(1:5)

t.test(x, y)

yields

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA

The problem is you are trying to perform a t-test on non-numeric vectors. Addition likewise is not defined for factors:

x + y

yields

[1] NA NA NA NA NA
Warning message:
In Ops.factor(x, y) : + not meaningful for factors

The warning gives keen insight as to what is amiss and also explains why your t-test is not working.

To fix the problem, you need to convert your vectors to numeric, along the lines ilya suggests, using as.numeric(as.character(x)).

Greg
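A small sketch of why the detour through as.character matters, using a made-up factor `f` and a made-up comparison sample `y` (neither comes from the original data):

```r
# Hypothetical factor whose labels look like numbers.
f <- factor(c("10", "20", "20", "35"))

codes  <- as.numeric(f)                # internal level codes, not the values
values <- as.numeric(as.character(f))  # the values the labels represent

print(codes)
print(values)

# After conversion, t.test() receives numeric input and runs.
y <- c(12, 18, 25, 30)  # made-up second sample
result <- t.test(values, y)
print(result$p.value)
```

Calling as.numeric directly on the factor would silently feed the level codes into the test, which runs without error but answers a different question.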
The real solution is to find out why your numeric vectors are being converted to factors in the first place.
hadley
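One way to look for the cause, sketched on a hypothetical column `col` (the entries "missing" and "5,6" are invented examples of values that prevent a column from being read as numeric):

```r
# Hypothetical column as it might arrive from read.delim().
col <- factor(c("1.2", "3.4", "missing", "5,6", "7.8"))

# Coerce quietly, then list the distinct entries that fail to parse as numbers.
as_text <- as.character(col)
parsed  <- suppressWarnings(as.numeric(as_text))
bad     <- unique(as_text[is.na(parsed) & !is.na(as_text)])
print(bad)
```

Entries found this way often point to a missing-value code or a decimal comma, which can then be handled at import time, for example through the na.strings or dec arguments of read.delim.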
Thanks a lot. Converting the leveled vectors to numeric with as.numeric(as.character(vector)) solved the problem. Indeed, the leveled vectors were factors.
jev