views:

66

answers:

3

I maintain a PHP application with SQL Server backend. The DB structure is roughly this:

lot
===

lot_id (pk, identify)
lot_code

building
========

buildin_id (pk, identity)
lot_id (fk)

inspection
==========

inspection_id (pk, identify)
building_id (fk)
date
inspector
result

The database already has lots and buildings and I need to import some inspections. Key points are:

  • It's a one-time initial load.
  • Data comes in an Excel file.
  • The Excel data is unaware of DB autogenerated IDs: inspections must be linked to buildings through their lot_code

What are my options to do such data load?

date         inspector     result     lot_code
==========   ===========   ========   ========
31/12/2009   John Smith    Pass       987654X
28/02/2010   Bill Jones    Fail       123456B

Update: How I finally did it

In case someone else has to do a similar task, these are the steps the data load finally required:

  1. Prepare the Excel file: remove unwanted columns, give proper names to sheets and column headers, etc.

  2. With SQL Server Import / Export Wizard (32-bit version; the 64-bit version lacks this feature), load each sheet in the into a (new) database table. The wizard takes care of (most of) the dirty details, including creating the appropriate DB structure.

  3. Log into the database with your favourite client. To make SQL coding easier, I created some extra fields in the new tables.

  4. Start a transaction.

    BEGIN TRANSACTION;

  5. Update the auxiliary columns in the newly created tables:

    UPDATE excel_inspection$ SET building_id = bu.building_id FROM building bu INNER JOIN ....

  6. Insert data in the destination tables:

    INSERT INTO inspection (...) SELECT ... FROM excel_inspection$ WHERE ....

  7. Review the results and commit the transaction if everything's fine:

    COMMIT;

In my case, SQL Server complained about collation conflicts when joining the new tables with the existing ones. It was fixed by setting an appropriate collation in the new tables but the method differs: in SQL Server 2005 I could simply change collation from the SQL Server Manager (Click, Click, Save and Done) but in SQL Server 2008 I had to set collation manually in the import Wizard ("Edit SQL" button).

A: 

1) get the excel file into a CSV.

2) import the CSV file into a holding table: SQL SERVER – Import CSV File Into SQL Server Using Bulk Insert – Load Comma Delimited File Into SQL Server

3) write a stored procedure/script where you declare local variables and loop through each row in the in the holding table, building out the proper rows in the actual tables. Since this is a one time load, there is no shame in looping, and you'll have complete control over all the logic.

KM
I didn't do it exactly this way but it contains good ideas and it's the best answer so far :)
Álvaro G. Vicario
A: 

Your data would have to have natural primary keys in the data file. It looks like lot_code may be one, but I don't see one for the building table.

Also, you say that inspections are be related to buildings through lot code, yet the relationship in the table is between building and inspection.

If the data is modeled correctly, you can import to temp tables and then insert/update the target tables using the natural keys.

Ed Mays
In theory, there could be more than one building per lot (the data model allows so), but it's not an issue since current data does not contain such case.
Álvaro G. Vicario
A: 

Why CSV - why not native XLS?

I have found a PHP script named dbTube which can read native Excel files. You can create a "importer definition" which describes your import.

Excel column => mysql table column

The tool contains a Web gui to create these import definitions. Looking well - like iTunes from Apple.

unfortunately it is commercial.

Tim Burton
This has been already discussed but the answer that started the thread is gone (?). SQL Server comes with a nice wizard that's able to import/export data from a good bunch of formats, including Excel files. I've planned all the steps and I'll try to report back when I complete the load.BTW, dbTube looks really interesting and it's only 4,99 €.
Álvaro G. Vicario