Background:

I have one Access database (.mdb) file with half a dozen tables in it. The file is ~300MB, so not huge, but big enough that I want to be efficient. It has one major table, a client table. The other tables store data like consultations made, a few extra many-to-one fields, that sort of thing.

Task:

I have to write a program to convert this Access database to a set of XML files, one per client. This is a database conversion application.

Options:

(As I see it)

  1. Load the entire Access database into memory in the form of Lists of immutable objects, then use LINQ to do lookups in these lists for the associated data I need.

    • Benefits:
      • Easily parallelised. Start up a ThreadPool thread for each client. Because all the objects are immutable, they can be freely shared between the threads, which means all threads have access to all data at all times, and it is all loaded exactly once.
    • (Possible) Cons:
      • May use extra memory, loading orphaned items, items that aren't needed anymore, etc.
  2. Use Jet to run queries on the database to extract data as needed.

    • Benefits:
      • Potentially lighter weight. Only loads data that is needed, and as it is needed.
    • (Possible) Cons:
      • Potentially heavier! May load items more than once and hence use more memory.
      • Possibly hard to parallelise, unless Jet/OleDb supports concurrent queries (can someone confirm/deny this?)
  3. Some other idea?
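
For illustration, option 1 could be sketched roughly as follows. This is Python rather than .NET, purely to keep the example short and runnable; the `Client` and `Consultation` types, their fields, and `convert_client` are hypothetical stand-ins, not the real schema. Frozen dataclasses play the role of the immutable objects, and a dictionary built once replaces repeated list scans.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen = immutable, so instances can be shared between threads
class Client:
    client_id: int  # hypothetical field names
    name: str


@dataclass(frozen=True)
class Consultation:
    client_id: int
    notes: str


def index_by_client(consultations):
    """Group child rows by client id once, so each lookup is a dict access."""
    index = defaultdict(list)
    for consultation in consultations:
        index[consultation.client_id].append(consultation)
    return index


def convert_client(client, consultations_by_client):
    """Stand-in for the real per-client XML generation; returns a summary string."""
    rows = consultations_by_client.get(client.client_id, [])
    return f"{client.name}: {len(rows)} consultation(s)"


def convert_all(clients, consultations, workers=4):
    """Load-once, convert-in-parallel: every worker reads the same shared index."""
    index = index_by_client(consultations)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input clients
        return list(pool.map(lambda c: convert_client(c, index), clients))
```

Because nothing is mutated after loading, the workers need no locking. Whether the extra memory is acceptable still depends on how much of the 300MB is actually needed.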

What are StackOverflow's thoughts on the best way to approach this problem?

A: 

From the sound of it, this is a one-time operation. I would strongly discourage loading the entire database into memory; that just does not seem like an efficient way of doing this at all.

Also, depending on your needs, you might be able to extract directly from Access -> XML if that is your true end game.

Regardless, with a database that small, processing the clients one at a time with a few specifically written queries would, in my opinion, be easier to manage, faster to write, and less error prone.
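
A minimal sketch of that one-client-at-a-time approach, written in Python for brevity. It assumes an in-memory SQLite database as a stand-in for the Jet connection, and the `clients` and `consultations` tables and their columns are made up for illustration; the real queries would run against the .mdb through OleDb.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_client(conn, client_id):
    """Build the XML for one client using two targeted queries."""
    (name,) = conn.execute(
        "SELECT name FROM clients WHERE id = ?", (client_id,)
    ).fetchone()
    root = ET.Element("client", id=str(client_id))
    ET.SubElement(root, "name").text = name
    for (notes,) in conn.execute(
        "SELECT notes FROM consultations WHERE client_id = ? ORDER BY id",
        (client_id,),
    ):
        ET.SubElement(root, "consultation").text = notes
    return ET.tostring(root, encoding="unicode")


def export_all(conn):
    """Yield (client_id, xml) pairs, holding only one client's data at a time."""
    ids = [row[0] for row in conn.execute("SELECT id FROM clients ORDER BY id")]
    for client_id in ids:
        yield client_id, export_client(conn, client_id)


def demo_connection():
    """Hypothetical sample data standing in for the Access database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE consultations (id INTEGER PRIMARY KEY, client_id INTEGER, notes TEXT);
        INSERT INTO clients VALUES (1, 'Ann');
        INSERT INTO clients VALUES (2, 'Bob');
        INSERT INTO consultations VALUES (1, 1, 'first');
        """
    )
    return conn
```

Each yielded string can be written straight to that client's file, so memory use stays bounded by the largest single client.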

Mitchel Sellers
Field types don't match up, some fields need merging, *a lot* of fields need adding with default values in the XML so the new application doesn't get too upset when importing it, etc. Like I said, it's a conversion; from one product (Access 97 :( ) to another in this case.
Matthew Scharley
OK, in that case the direct Access-to-XML route will not work, but I'd still go with Jet.
Mitchel Sellers
A: 

I would lean towards Jet, since you can be more specific about what data you want to pull.

I also noticed the large file size; this is a problem I have recently come across at work. Is this an Access 95 or 97 database? If so, converting the database to 2000 or 2003 and then back to 97 can reduce the size; it seems to be a bug in some cases. The database I was dealing with claimed to be 70MB, and after I converted it to 2000 and back again it was 8MB.

Jambobond
97. And I don't personally use Office (though I managed to find someone with a copy of '97 lying around gathering dust), preferring OpenOffice, so I don't have more recent versions to try against. That said, some of the tables have hundreds of thousands of records, so I don't think the file size is inflated overly much.
Matthew Scharley
Cool, just thought I would mention it :)
Jambobond
+1  A: 

Generate XML parts from SQL. Store each fetched record in the file as you fetch it.

Sample:

SELECT '<Node><Column1>' + Column1 + '</Column1><Column2>' + Column2 + '</Column2></Node>' FROM MyTable
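
One caveat with plain concatenation is that values containing `&` or `<` produce invalid XML. A rough sketch of the same idea done on the client side, in Python, with escaping and with each record written out as it is fetched; `row_to_node` and `write_nodes` are hypothetical helper names, and `rows` can be any iterable of tuples such as a database cursor.

```python
from xml.sax.saxutils import escape


def row_to_node(row, columns):
    """Render one fetched record as a <Node> element, escaping &, < and >."""
    parts = ["<Node>"]
    for column, value in zip(columns, row):
        # NULL values become empty elements instead of breaking the output
        text = "" if value is None else escape(str(value))
        parts.append(f"<{column}>{text}</{column}>")
    parts.append("</Node>")
    return "".join(parts)


def write_nodes(rows, columns, out):
    """Write each record to the file-like object `out` as soon as it is fetched."""
    for row in rows:
        out.write(row_to_node(row, columns))
        out.write("\n")
```

This keeps only one record in memory at a time, which matches the "store each fetched record as you fetch it" suggestion.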
Cătălin Pitiș
+1  A: 

If your objective is to convert your database to XML files, you can:

  1. Connect to your database through an ADO/OLEDB connection
  2. Successively open each of your tables as an ADO recordset
  3. Save each recordset as an XML file:

    myRecordset.save myXMLFile, adPersistXML

If you are working from within the Access file, use CurrentProject.AccessConnection as your ADO connection.

Philippe Grondier
The objective is to convert the database to the XML export format of a different product (so that we can then import it into said product), hence it's not so simple. I mentioned this in a comment on another answer already.
Matthew Scharley
@Matthew: what about using XSLT to transform the ADO XML as required?
onedaywhen
I've never used XSLT before, so I couldn't comment on its usefulness in this situation. Though there are new fields that need adding, some fields need combining, some text fields need reworking into RTF text fields, etc...
Matthew Scharley