I'm building an application to import data into a SQL Server 2008 Express database.
This database is being used by an application that is currently in production.
The data that needs to be imported comes from various sources, mostly Excel sheets and XML files.
The database has the following tables:
- tools
- powertools
- strikingtools
- owners
Each row or XML tag in the source files has information about one tool:
name, tooltype, weight, wattage, owner, material, etc...
Each of these rows has the name of the tool's owner. This name has to be inserted into the owners table, but only if it isn't already in there.
For each of these rows a new row needs to be inserted in the tools table.
The tools table has a field owner_id with a foreign key to the owners table; it must be set to the primary key of the corresponding row in the owners table.
Depending on the tooltype a new row must be created in either the powertools table or the strikingtools table. These 2 tables also have a tool_id field with a foreign key to the tools table that must be filled in.
The tools table has a tool_owner_id field with a foreign key to the owners table that must be filled in.
If any of the rows in the import file fails to import for some reason, the entire import needs to be rolled back.
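To make the required steps concrete, here is a minimal sketch of the per-row logic in Python. It runs against an in-memory SQLite database purely as a stand-in, not against SQL Server, and the column names, the tooltype values "power" and "striking", and the function name import_tools are assumptions for illustration (the description above mentions both owner_id and tool_owner_id; the sketch uses owner_id). Rows are consumed one at a time from an iterator, so the whole file never has to be held in memory, and everything happens inside a single transaction that is rolled back on the first failure.

```python
import sqlite3

# Hypothetical schema, loosely modelled on the tables described above.
SCHEMA = """
CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE tools (id INTEGER PRIMARY KEY, name TEXT NOT NULL, weight REAL,
                    owner_id INTEGER NOT NULL REFERENCES owners(id));
CREATE TABLE powertools (id INTEGER PRIMARY KEY, wattage REAL,
                         tool_id INTEGER NOT NULL REFERENCES tools(id));
CREATE TABLE strikingtools (id INTEGER PRIMARY KEY, material TEXT,
                            tool_id INTEGER NOT NULL REFERENCES tools(id));
"""


def import_tools(conn, rows):
    """Import an iterable of row dicts in one transaction; return the row count.

    Any exception rolls back every insert made during this call.
    """
    # Cache existing owners so each name is looked up in memory, not per row.
    owner_ids = dict(conn.execute("SELECT name, id FROM owners"))
    count = 0
    with conn:  # commits if the block succeeds, rolls back if it raises
        for row in rows:
            owner = row["owner"]
            if owner not in owner_ids:
                cur = conn.execute(
                    "INSERT INTO owners (name) VALUES (?)", (owner,))
                owner_ids[owner] = cur.lastrowid

            cur = conn.execute(
                "INSERT INTO tools (name, weight, owner_id) VALUES (?, ?, ?)",
                (row["name"], row.get("weight"), owner_ids[owner]))
            tool_id = cur.lastrowid

            # Assumed tooltype values; the real files may use other labels.
            if row["tooltype"] == "power":
                conn.execute(
                    "INSERT INTO powertools (wattage, tool_id) VALUES (?, ?)",
                    (row.get("wattage"), tool_id))
            elif row["tooltype"] == "striking":
                conn.execute(
                    "INSERT INTO strikingtools (material, tool_id) VALUES (?, ?)",
                    (row.get("material"), tool_id))
            else:
                raise ValueError(f"unknown tooltype: {row['tooltype']!r}")
            count += 1
    return count
```

With this shape, the only thing that grows with the file size is the owner-name cache; the tool rows themselves are read, inserted, and discarded one by one.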
Currently I'm using a dataset to do this, but for some large files (over 200,000 tools) this requires quite a lot of memory. Can anybody think of a better approach for this?