ansaurus

Question

random sample from dataframe and output into excel

Answer 1

+3 A:

To convert a factor whose levels are numbers into numeric, you must first convert it to character; otherwise you get the factor's internal integer codes rather than the level labels:

as.numeric(as.character(r))

The NAs are probably introduced by non-numeric characters in the factor levels.
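A small sketch of the difference, using a made-up factor `r` (the values are hypothetical, not the asker's data):

```r
# Hypothetical factor whose levels look like numbers
r <- factor(c("10", "20", "20", "5"))

codes  <- as.numeric(r)                # internal integer codes of the levels
values <- as.numeric(as.character(r))  # the level labels as numbers: 10 20 20 5

# A level that is not a number becomes NA (R also emits a warning)
r2 <- factor(c("10", "n/a", "7"))
values2 <- suppressWarnings(as.numeric(as.character(r2)))
is.na(values2)                         # only the "n/a" entry is NA
```

Comparing `codes` with `values` shows why the detour through character is needed.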

James 2010-09-02 09:40:22

That helped...Appreciate it. //M

Misha 2010-09-02 10:08:09

Answer 2

A:

I'd also check why you have a factor there in the first place. It seems to me that you read the data in from some text file, and that somewhere in the column there is stray text (a space, a point, a tab, the letters NA, ...) which causes R to treat the whole column as character and to convert it to a factor when using read.csv or the like.

Once you have found it, you also know why you get NAs, and you can fix it before saving the dataframe to a text file. Check the option stringsAsFactors=FALSE in read.table() and read.csv() (or alternatively, as.is=TRUE in read.csv).
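A minimal sketch of how one might track down the offending entries, assuming a tiny made-up CSV (`csv_text` and its `"n/a"` entry are hypothetical; your file will differ):

```r
# Hypothetical input: the stray "n/a" forces the whole column to be read as text
csv_text <- "id,value\n1,3.5\n2,n/a\n3,4.2"

# Keep strings as character instead of converting them to factors
a <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(a)   # 'value' is character here, not factor

# Entries that do not parse as numbers
num <- suppressWarnings(as.numeric(a$value))
bad <- a$value[is.na(num) & !is.na(a$value)]
bad

# One possible remedy: declare the offending string as a missing-value marker
a2 <- read.csv(text = csv_text, na.strings = c("NA", "n/a"))
str(a2)  # 'value' is now numeric, with an NA in row 2
```

Whether declaring the string as missing is appropriate depends on what it means in your data.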

Besides that, this piece of code:

a[sample(a[,1],300),]->q

is probably not doing what you think. I'd sample the row indices themselves, something along the lines of:

q <- a[sample.int(nrow(a), 300), ]

If a[,1] becomes numeric, your code above won't work any more. sample() will draw from the values of a[,1], one of which is 01012223427, and use them as row numbers. There is no row with that index number, so you won't get the rows you expect (typically an error or rows of NA). The code you use will also break when a[,1] is read in as character.
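Putting it together, a rough sketch of sampling rows by index and writing the result to a file Excel can open. The data frame here is invented for illustration, and plain CSV via `write.csv()` is just one assumption about what "output into excel" means:

```r
set.seed(1)  # only to make this sketch reproducible

# Hypothetical data frame standing in for `a`: an id column kept as character
a <- data.frame(id = sprintf("%011d", 1:1000),
                value = rnorm(1000),
                stringsAsFactors = FALSE)

# Sample 300 distinct row numbers, then subset; the contents of a[,1] play no role
q <- a[sample.int(nrow(a), 300), ]

# Write a CSV file (to a temporary path here) that Excel can open
out <- tempfile(fileext = ".csv")
write.csv(q, out, row.names = FALSE)
```

Because the sampling works on row numbers, it gives the same kind of result whether the first column is a factor, numeric or character.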

Joris Meys 2010-09-02 10:10:10
