When program runs 1st time it just gets some fields from a source database table say:
SELECT NUMBER, COLOR, USETYPE, ROOFMATERIALCODE FROM HOUSE; //number is uniq key
it does some in-memory processing say converting USETYPE and ROOFMATERIAL to destination database format (by using cross ref table).
Then program inserts ALL THE ROWS to destination database:
INSERT INTO BUILDING (BUILDINGID, BUILDINGNUMBER, COLOR, BUILDINGTYPE, ROOFMAT) VALUES (PROGRAM_GENERATED_ID, NUMBER_FROM_HOUSE, COLOR_FROM_HOUSE, CONVERTED_USETYPE_FROM_HOUSE, CONVERTED_ROOFMATERIALCODE_FROM_HOUSE);
The above is naturally not SQL but you get the idea (the values with underscores just describe the data inserted).
The next times the program should do the same except:
- insert only the ones not found from target database.
- update only the ones that have updated color, usetype, roofmaterialcode.
My question is:
- How to implement this in efficient way? -Do I first populate DataSet and convert fields to destination format? -If I use only 1 DataSet how give destination db BUILDING_IDs (can i add columns to populated DataSet?) -How to efficiently check if destination rows need refresh (if i select them one @ time by BUILDING_NUMBER and check all fields it's gonna be slow)?
Thanks for your answers!
-matti