Or a list of how to do in R the things you do in SQL (or vice versa)?
Thanks,
Tal
You could look at Joshua Reich's presentation on R and SQL (see page 11).
The sqldf package could perhaps be of some help here.
There is also a talk from Joshua accompanying the presentation that Shane mentioned above.
It's also worth looking into the RMySQL package.
I work with very large datasets that cannot be dumped into text files prior to importing into R. This package allows me to use standard MySQL queries from within R to pull in subsets of my data.
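To illustrate pulling in only a subset with a query, here is a minimal sketch using the DBI interface that RMySQL implements. The database name, credentials, table (`measurements`) and columns are all made-up placeholders, and `fetch_subset` is a hypothetical helper, so adapt them to your own setup.

```r
library(DBI)

# Hypothetical helper: every connection detail and table/column name
# below is a placeholder, not something from a real database.
fetch_subset <- function(min_year) {
  con <- dbConnect(RMySQL::MySQL(),
                   dbname   = "mydb",
                   host     = "localhost",
                   username = "me",
                   password = "secret")
  # Close the connection when the function exits, even on error
  on.exit(dbDisconnect(con))

  # Only rows matching the WHERE clause are transferred to R
  sql <- sprintf(
    "SELECT id, year, value FROM measurements WHERE year >= %d",
    as.integer(min_year)
  )
  dbGetQuery(con, sql)
}
```

Calling something like `fetch_subset(2005)` would then return a data frame containing just the matching rows, so the full table never has to fit into memory.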
The examples section at the bottom of the help(sqldf) page in the sqldf package has quite a few SQL commands and their R counterparts.
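As a rough idea of what such side-by-side pairs look like, here is a small sketch of my own (not copied from the help page). The `orders` data frame is made up for illustration, and it assumes sqldf is running with its default SQLite backend.

```r
library(sqldf)

# Toy data frame, made up for illustration
orders <- data.frame(
  customer = c("a", "b", "a", "c", "b"),
  amount   = c(10, 20, 5, 40, 15)
)

# SQL: filter rows with WHERE
big_sql <- sqldf("SELECT * FROM orders WHERE amount > 10")
# R counterpart: subset()
big_r <- subset(orders, amount > 10)

# SQL: grouped sum with GROUP BY
tot_sql <- sqldf("SELECT customer, SUM(amount) AS total
                  FROM orders
                  GROUP BY customer
                  ORDER BY customer")
# R counterpart: aggregate() with a formula
tot_r <- aggregate(amount ~ customer, data = orders, FUN = sum)
names(tot_r)[2] <- "total"
```

sqldf looks up the data frames named in the query in the calling environment, so no explicit import step is needed for small examples like this.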
I just started working with RMySQL recently and really like the package. I just run basic SQL queries in R itself. Most of the data re-arranging is done in several independent SQL scripts, basically some stored procedures.
I think R is a statistical package with some nice merging capabilities, but it's not meant to handle relational data that way. I work a lot with micro data and have to build non-relational datasets from it (and then use R for regression analysis and for plotting with ggplot2 (!)). I also do data aggregation in SQL itself before connecting to R.
I also recommend using views (if they are fast enough for you). R accesses them like ordinary tables, and they show up when you list the tables from R.
Besides, there's RPostgreSQL out there, if you want to give PostgreSQL a try. I tried it once but switched to RMySQL because RPostgreSQL was so hard to set up on my Mac, and after an update the config was gone. RMySQL was much easier. Back then I had to compile the package on my own, so if you run another OS, you might get a binary (or there may be a Mac OS one by now).
In any case, there is some literature on RPostgreSQL out there that might help you even if you use RMySQL, particularly if you plan to use it for time series data (e.g. TSPostgreSQL).