I'm using the SQLite package to interface with a database from R. However, exactly the same query gives different results in R than in the command-line interface. For instance, the minimum value in a column is 0, but R reports -2147332296. Since I copied and pasted the query, I don't think the query itself is the problem. The only thing I can think of is a problem with conversion between datatypes. The maximum value in that same column is 147031553000 and the type of the column is "integer". Perhaps this value is too big for the datatype R uses, and that produces the negative value?

There is one more problem, however. For the same query, R returns fewer results than the command-line interface does. Does anyone have an idea why things might be going wrong?

A: 

See the R documentation for details on its datatypes, e.g. Section 4.2 of the R Import/Export Manual for an overview of R and RDBMSs, and particularly Section 4.2.2 on Data Types.

If in doubt, try casting to a floating-point number, as these have a wider range at the possible expense of precision. Not all SQL types are mapped to all R types by all database packages.
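
To make the range argument concrete, here is a minimal sketch in Python (used only as a neutral scratchpad, not the R package itself). It assumes the driver keeps just the low 32 bits of a wider integer, which is a guess about the failure mode rather than confirmed behaviour; `as_int32` is a hypothetical helper written for this illustration.

```python
import struct


def as_int32(value: int) -> int:
    """Keep only the low 32 bits of a 64-bit integer and read them as signed.

    Hypothetical helper: it mimics what happens if a wide integer is
    squeezed into a 32-bit integer type such as R's `integer`.
    """
    low_four_bytes = struct.pack("<q", value)[:4]  # little-endian, low bytes first
    return struct.unpack("<i", low_four_bytes)[0]


INT32_MAX = 2**31 - 1
column_max = 147031553000  # maximum value reported in the question

# The value does not fit in a signed 32-bit integer ...
print(column_max > INT32_MAX)
# ... so a 32-bit reading of it is a different number altogether.
print(as_int32(column_max))
# A 64-bit float holds every integer up to 2**53 exactly, so this one survives.
print(float(column_max) == column_max, column_max < 2**53)
```

Under the same assumption, any stored value just above 2**31 - 1 comes back negative, which is the kind of result described in the question. The precision cost only starts to matter for integers above 2**53.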

Dirk Eddelbuettel
Indeed, the problem seems to be in the datatypes. A call to dbDataType returns "REAL". While an "INTEGER" in SQLite can be up to 8 bytes, a REAL is always 8 bytes. I guess this variable size is not handled correctly by the SQLite library for R. But then the question is: how can I make sure that the values are interpreted as real? Once the query has run, I don't see how I can cast them to real; at least, calling as.real doesn't change the situation.
Pieter
You do the cast as part of the query. See http://www.sqlite.org/lang_expr.html for an overview; 'SELECT CAST(foo AS REAL) FROM sometable' should convert the column foo to real for your table sometable.
Dirk Eddelbuettel
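
As a rough check of the CAST approach, the sketch below runs the same kind of query against an in-memory SQLite database. It uses Python's standard `sqlite3` module purely as a stand-in for the R interface, so it shows what SQLite itself returns, not how the R package maps the result. The table `sometable` and column `foo` come from the comment above; the two inserted values are the minimum and maximum from the question, and the rest of the setup is assumed.

```python
import sqlite3

# Throwaway in-memory database with one INTEGER column, as in the question.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sometable (foo INTEGER)")
con.executemany(
    "INSERT INTO sometable (foo) VALUES (?)",
    [(0,), (147031553000,)],
)

# Plain query: SQLite hands back 64-bit integers.
raw = con.execute("SELECT MIN(foo), MAX(foo) FROM sometable").fetchone()

# Same query with the cast done inside SQL, so the driver only ever sees REALs.
cast = con.execute(
    "SELECT MIN(CAST(foo AS REAL)), MAX(CAST(foo AS REAL)) FROM sometable"
).fetchone()

# typeof() reports the storage class SQLite assigns to the cast expression.
type_name = con.execute(
    "SELECT typeof(CAST(foo AS REAL)) FROM sometable LIMIT 1"
).fetchone()[0]

con.close()

print(raw)
print(cast)
print(type_name)
```

The point of casting inside the query is that the conversion happens before the values reach the client library, so a client that only handles 32-bit integers never has to represent the large value as an integer.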