views:

246

answers:

4

Background:

Our C# application generates and executes queries to several types of databases (Oracle, SQL Server, MySQL), but a requirement came up to also apply them to a proprietary file format.

  1. The namespace used until now is System.Data.Common.
  2. The queries we need to apply are non-trivial (nested SELECTs, aliases in FROM, substring methods and string concatenations)

We initially converted the contents of the proprietary file to CSV, for which there exists the Driver {Microsoft Text Driver (*.txt; *.csv)}. However, the client required that no temp files are generated, and everything should happen in-memory.

Creating an ODBC driver to query the file directly seems too time-demanding. So, we are considering creating a driver (possibly ODBC), in which we will "embed" the SQLITE ODBC driver. Inside that driver, we can load the contents of the CSV files in the "in-memory" database, and then forward the query to the inner ODBC driver.

My questions:

  1. Does this solution seem feasible?
  2. Where should we start in order to create an ODBC driver from scratch?

Thanks

+1  A: 

I think your solution is time consuming. You can find existing solutions for what you are trying to do. Here are few alternates.

Why not try Linq to text/csv file?

Both solutions are in memory.

Another idea is that, you can export file in xml instead of csv or text (I always prefer export into xml when I have to process in my code). Than you use System.Xml or Linq to Xml to perform operations.

Sharique
I need the exact same query to be executed whatever the underling database may be (SQL Server, MySQL, Oracle, AND the proprietary file). So, in the case of the proprietary file, I need to include a step to "convert" it to a form in which the SQL may be executed. I don't want to create a LINQ query.
Markos Fragkakis
Ohh! I understand now. What kind of "proprietary file format" is? is it binary file format or xml like (which is true in most cases) or something else?
Sharique
It is a non-xml text file, resembling CSV. It is called PC-Axis and is used for statistical data. In the beginning it contains metadata, and then the actual data.
Markos Fragkakis
Then second link is good starting point.
Sharique
A: 

If you only are restricted not to create temporary files, you may try to use in-memory mode of SQLite. Here's a sample of connection string for it (source):

Data Source=:memory:;Version=3;New=True;

To my mind, this is simpler than building full-blown provider.$

elder_george
+1  A: 

If using SQLite as a backend is a possibility, I cannot really see why you could not use an ADO .NET provider to your SQLite database like System.Data.SQLite.


From your comment, I think you are in effect using the ODBC ADO .NET provider for all database connections (System.Data.Odbc). If you need to keep the same scheme, then a custom ODBC provider is the way to go (but then it is plain C native development, and rather painful I believe).

Another way to go would be to add a third parameter to your DB (the first two being the SQL and the connection string) : the ADO .NET provider to be used (like it is supposed to be done in configuration files, see the providerName attribute). This way you could use any ADO .NET provider available.

Then you could wrap the the SQLite provider into your own, custom, ADO .NET provider, that could include the generation and population of the SQLite DB. Advantage of this solution : pure managed .NET.

Mac
The whole idea is like this: From a desktop application we store in the DB an SQL query, as well as the ADO connection string. Then a remotely deployed WS finds these two in the DB, and executed the query.Currently (excluding the proprietary files) it is transparent to the WS what provider is used. We would like the same to be the case with the files. So, we thought of creating an ODBC driver, whose connection string contains the location of the files, and would "embed" the generation and population of an SQLite DB, and forward the received queries and operations to the internal SDLite ODBC.
Markos Fragkakis
I am skeptical : the WS cannot be totally unaware of the ADO .NET provider that is going to be used, because a plain database connection has to be created at some point (System.Data.Common.DbConnection is abstract). See my answer for more, new developments.
Mac
+2  A: 

The client request is odd. There will always be files involved, you are reading from a file.

If they want their stuff in memory (again a weird customer) put the CSVs on to a RAM Disk: http://members.fortunecity.com/ramdisk/RAMDisk/ramdriv001.htm

Building and ODBC drive is a non-trival undertaking if you care about performance and stability.

Jan Bannister
Was gonna suggest that too :)
leppie