tags:

views:

230

answers:

2

Installing the RODBC package on Ubuntu is a bit of a kludge. First I learned to install the following:

$ sudo apt-get install r-cran-rodbc

That wasn't good enough as the package was still looking for header files. I solved this issue by:

$ sudo apt-get install unixodbc-dev

Good, RODBC installed properly on the Ubuntu machine. But when I try to run the following script:

## import excel file from Dropbox

require("RODBC")

channel <- odbcConnectExcel("~/Dropbox/DATA/SAMPLE/petro.xls")

petro <- sqlFetch (channel, "weekly")

odbcClose(channel)

str(petro)
head(petro)

I get an error thrown that function odbcConnectExcel not found. I checked the case of each letter, making sure it was not a simple typo. Nope. Then I ran this same script on a Windows R installation (file path different, of course) and the script works.

Any idea of why Ubuntu R installation cannot find the odbcConnectExcel function and how I can get this to work?

+1  A: 

That functionality is available where Excel is available. In other words: not on Ubuntu.

For reference, from the R Data Import / Export manual (with my highlighting):

4.3.2 Package RODBC

Package RODBC on CRAN provides an interface to database sources supporting an ODBC interface. This is very widely available, and allows the same R code to access different database systems. RODBC runs on Unix/Linux, Windows and Mac OS X, and almost all database systems provide support for ODBC. We have tested Microsoft SQL Server, Access, MySQL, PostgreSQL, Oracle and IBM DB2 on Windows and MySQL, Oracle, PostgreSQL and SQLite on Linux.

ODBC is a client-server system, and we have happily connected to a DBMS running on a Unix server from a Windows client, and vice versa.

On Windows ODBC support is normally installed, and current versions are available from http://www.microsoft.com/data/odbc/ as part of MDAC. On Unix/Linux you will need an ODBC Driver Manager such as unixODBC (http://www.unixODBC.org) or iOBDC (http://www.iODBC.org: this is pre-installed in Mac OS X) and an installed driver for your database system.

Windows provides drivers not just for DBMSs but also for Excel (.xls) spreadsheets, DBase (.dbf) files and even text files. (The named applications do not need to be installed. Which file formats are supported depends on the the versions of the drivers.) There are versions for Excel 2007 and Access 2007 (go to http://download.microsoft.com, and search for Office ODBC', which will lead to AccessDatabaseEngine.exe), the 2007 Office System Driver'.

Dirk Eddelbuettel
Well of course, that explains it very well. Rats.
Milktrader
I believe there are packages that read Excel files using Perl and I would suspect they would work in Ubuntu.
Joshua Ulrich
Correct. And on Debian / Ubuntu you get the oldest one of these via `apt-get install r-cran-gdata` and it will work (for the older `.xls` format at least).
Dirk Eddelbuettel
thanks Josh, I'll look into that.
Milktrader
I gave the name of it. If you need xlsx, there is a package on CRAN. There is also no reason to un-accept this question which I answered with a clear *well because !* ;-)
Dirk Eddelbuettel
+1  A: 

I've found RODBC to be a real pain in the Ubuntu. Maybe it's because I don't know the right incantations, but I switched to RJDBC and have had much better luck with it. As discussed here.

As Dirk says, that wont solve your Excel problem. For writing Excel I've had very good luck with the WriteXLS package. In Ubuntu I found it quite easy to set up. I had Perl and many of the packages already installed and had to simply install Text::CSV_XS which I installed with the GUI package manager. The reason I like WriteXLS is the ability to write data frames to different sheets in the Excel file. And now that I look at your question I see that you want to READ Excel files not WRITE them. Hell. WriteXLS doesn't do that. Stick with gdata, like Dirk said in his comments:

gdata on CRAN and you are going to want the read.xls() function:

read.xls("//path//to/excelfile.xls", sheet = 1, verbose=FALSE, pattern, ...,
method=c("csv","tsv","tab"), perl="perl")

you may need to run installXLSXsupport which installs the needed Perl modules.

read.xls expect sheet numbers, not names. The method parameter is simply the intermediate file format. If your data has tabs then don't use tab as the intermediate format. And likewise for commas and csv.

JD Long
RJDBC... didn't I recommend that to you? ;-)
Joshua Ulrich
Of course you did! And you sent me code examples. And I will repay you with beer each and every time you are in Chicago.
JD Long
Yes, I'm mainly interested in reading them since government figures come in the format. read.xls looks like it should work nicely. Thanks.
Milktrader