Which of the dataset formats listed at this link is the easiest to load for processing in R? A few minutes with a text editor should be enough to turn the text version into literal data, but can one of the other forms be loaded with less than O(n) user effort?

I've found this laundry list of IO options, but it doesn't seem especially helpful.


P.S. I've never used R before; I'm trying to help a friend, who is the one who needs to do this.

+3  A: 

If everything else fails, why not read the manual devoted to Data Import/Export?

You can import data from

  • ascii files with whichever delimiter (csv, txt, ...)
  • fixed form files
  • binary files in various formats (hdf5, netcdf, ...)
  • spreadsheets, in most formats even on non-Windows platforms
  • databases (DBI, RODBC, ...)
  • web pages (using the XML package)
  • web services like SOAP, JSON, ...
  • directly from other programs using connections, ...
  • and more

so calling any one of these preferred is difficult -- it all depends on the task at hand.
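
To make the first two items on that list concrete, here is a minimal sketch in base R. It is only an illustration: the inline sample data and the column names (id, score, code, value) are made up, not taken from the dataset in question.

```r
# Delimited text: read.csv() can parse inline text through its `text` argument.
csv_data <- read.csv(text = "id,score\n1,3.5\n2,4.0")

# Fixed-width text: read.fwf() cuts each line into fields of the given widths.
fwf_con <- textConnection(c("AB123", "CD456"))
fwf_data <- read.fwf(fwf_con, widths = c(2, 3), col.names = c("code", "value"))
close(fwf_con)

# Inspect what was loaded.
str(csv_data)
str(fwf_data)
```

With real files, the same calls take a file path in place of the inline text or the connection.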

Dirk Eddelbuettel
I did look at it; check my second link. I could probably figure out how to get it done from that, but to do so I'd have to 1) install R, 2) learn some R, 3) figure out what that page is talking about, and 4) write something up from scratch for the person I'm trying to help, because I know that page won't help them.
BCS
As to the task at hand, I assume they are going to analyze the data. The question is: which of the specific list of formats that the data can be downloaded in is easiest to load.
BCS
@Dirk, your answer could be interpreted as facetious. I identify with all the elements of the question that could engender animosity. I suggest that if you feel animosity, you skip the question.
Farrel
Farrell: FWIW I couldn't disagree more. OP was lost, did not specify his question well at all so he had to be directed to the manual and the plethora of available data import options.
Dirk Eddelbuettel
+4  A: 

Grab the text files and follow the instructions in the spreadsheet-like data section of R Data Import/Export. I would avoid trying to read from Excel files unless you absolutely have to.

It could be as easy as:

x <- read.table("file.txt", header=TRUE, sep="\t")
# or
x <- read.delim("file.txt") # header=TRUE and sep="\t" are already defaults
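
For someone new to R, a quick way to confirm the import worked is to look at the structure of the result. The following is a rough sketch, assuming a small made-up tab-delimited file written to a temporary location; the names and ages are invented sample data, and the real file path would go in place of `path`.

```r
# Write a tiny tab-delimited file to a temporary path (hypothetical sample data).
path <- tempfile(fileext = ".txt")
writeLines(c("name\tage", "Ann\t34", "Bob\t27"), path)

# read.delim() assumes a header row and tab separators by default.
x <- read.delim(path)

str(x)    # column names and types
head(x)   # first few rows

# Remove the temporary file.
unlink(path)
```

If str() reports the expected number of rows and sensible column types, the data is ready for analysis.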
Joshua Ulrich
+1  A: 

Of the options you have available, the tab-delimited text files are the easiest to import, followed by the SPSS files and then everything else. I agree with the other posters: avoid .xls files (or convert single-sheet workbooks to tsv or csv).

The foreign package can be used to open those SPSS files, which is just as easy:

install.packages("foreign")
library(foreign)

setwd("/Path/to/your/files")
x <- read.spss("FILENAME.sav", to.data.frame=TRUE)
Brandon Bertelsen
There is no need to install this package as it is one of the recommended packages that come with R.
Gavin Simpson