Hello! I am passing data from C# to R over a COM interface. When the data arrives in R it is housed in a matrix. Some of the functions that I use require the data to be in a data frame instead. I convert the data structure using

newDataFrame <- as.data.frame(oldMatrix)

The table of data reaches R just fine. Once I make the conversion to a data frame, however, it assumes all of my numeric data are factors!

So it turns: {34, 46, 90, 54, 69, 54} into {1, 2, 3, 4, 5, 4}

My data table DOES have factors in it, though, so I can't just force the whole thing to be numeric. Is there any way around this? Note: I can't export the data as a CSV onto the filesystem and read it into R manually.

On a side note, the function I am using that requires a data frame comes from the 'Hmisc' package:

hist.data.frame(dataFrame)

This produces a frequency histogram for every column of data in the data frame and arranges them all in a grid pattern (quite nifty)!

Thanks! -Dave

+1  A: 

I've had this problem before. You need to set stringsAsFactors=FALSE when you read the data (or, in your case, when you call as.data.frame()).

Now you can convert individual variables/columns to the types you need (i.e., with as.numeric(), as.factor() and the like), without worrying about how the numbers are treated.

thebackhand
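The suggestion above can be sketched as follows. This is only a minimal illustration: the small character matrix stands in for whatever actually arrives over COM, and the column names score and group are made up for the example.

```r
# Hypothetical stand-in for the character matrix arriving over COM
oldMatrix <- matrix(c("34", "46", "90", "a", "b", "a"), ncol = 2,
                    dimnames = list(NULL, c("score", "group")))

# Keep strings as plain character columns instead of factors
newDataFrame <- as.data.frame(oldMatrix, stringsAsFactors = FALSE)

# Convert individual columns explicitly to the types they should have
newDataFrame$score <- as.numeric(newDataFrame$score)
newDataFrame$group <- as.factor(newDataFrame$group)

str(newDataFrame)
```

If a column has already become a factor, as.numeric() on it returns the internal level codes rather than the original values; as.numeric(as.character(x)) recovers the numbers.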
This worked! But is there a way that I can programmatically handle numeric vs. factor columns? I am dealing with huge amounts of data, and analyzing something like that by hand is going to be impractical.
Dave
Perhaps. Do you have a simple way of differentiating between numeric variables and factor variables?
thebackhand
If the variable contains letters it's going to be a factor; otherwise it should be treated as numeric, I would guess. My problem arises because R detects my numeric variables as strings, which must not be treated like factors.
Dave
Well, I'm not sure of the exact format of your data, but I'd try creating some function that builds off of is.character() and checks each variable and converts it accordingly. Check out the apply family and its cousins (lapply, tapply, etc.) for good ways to loop in R like this.
thebackhand
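One way the idea in the comment above might look in practice, as a rough sketch only: convert_columns is a hypothetical helper name, and the rule it applies (a character column becomes numeric if every non-missing value parses as a number, otherwise a factor) is an assumption based on Dave's description of the data.

```r
# Hypothetical helper: convert each character column to numeric when every
# non-missing value parses as a number, otherwise to a factor.
convert_columns <- function(df) {
  df[] <- lapply(df, function(col) {
    if (!is.character(col)) return(col)          # leave other types alone
    parsed <- suppressWarnings(as.numeric(col))  # NA where parsing fails
    if (all(is.na(col) | !is.na(parsed))) parsed else factor(col)
  })
  df
}

# Made-up example data with everything stored as strings
raw <- data.frame(score = c("34", "46", "90"),
                  group = c("a", "b", "a"),
                  stringsAsFactors = FALSE)
clean <- convert_columns(raw)
sapply(clean, class)
```

Assigning into df[] keeps the result a data frame with the original column names, so it can be passed straight on to functions that expect one.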
Also found this: http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/all.is.numeric.html Thanks for all your input!
Dave
+1  A: 

I think you have misdiagnosed the problem - all columns in a matrix must be of the same type, so this is likely to be where the problem arises, not the conversion to a data frame.

hadley
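A small illustration of the point about matrices, using made-up values: as soon as numbers and letters share one matrix, every element is stored as a string, so the numeric information is already gone before as.data.frame() is called.

```r
# Mixing numbers and letters in one matrix coerces everything to character
m <- cbind(score = c(34, 46, 90), group = c("a", "b", "a"))
typeof(m)        # "character"
m[, "score"]     # the numbers are now strings

# Building the data frame column by column keeps the numeric type
df <- data.frame(score = c(34, 46, 90), group = c("a", "b", "a"))
is.numeric(df$score)
```

If the data can be passed across as separate columns rather than as one matrix, the types survive without any later conversion.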