I'm writing code to batch-import a large number of rows into the database.
Currently I bulk copy the raw data (from a .csv file) into a staging table, so that all of it is on the database side. That leaves me with a staging table full of rows identifying 'contacts', which now need to be moved into other tables of the database.
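For reference, the staging load itself is a straight bulk copy. A minimal sketch, assuming BULK INSERT with an illustrative file path and options (and ignoring how the batch @GUID gets stamped onto the staging rows):

-- Sketch only: the file path, terminators and FIRSTROW are illustrative.
BULK INSERT ContactImportStaging
FROM 'C:\imports\contacts.csv'
WITH
(
    FIRSTROW = 2,          -- skip the CSV header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK                -- table lock for a faster, minimally logged load
);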
Next I copy over the rows from the staging table that aren't already in the contacts table; for the ones that are already there, I update the column named "GroupToBeAssignedTo", which marks them for a later operation.
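The insert half is roughly this shape (a sketch; the exact column list is illustrative, and UserID + EmailAddress is what I'm treating as the key):

-- Sketch: insert staging rows for this batch that aren't already in Contacts.
INSERT INTO Contacts (UserID, EmailAddress, GroupToBeAssignedTo)
SELECT t2.UserID, t2.EmailAddress, t2.GroupToBeAssignedTo
FROM ContactImportStaging t2
WHERE t2.GUID = @GUID
AND NOT EXISTS
(
    SELECT 1
    FROM Contacts t1
    WHERE t1.UserID = t2.UserID
      AND t1.EmailAddress = t2.EmailAddress
);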
I have a feeling I'm going about this wrong. The update query below isn't efficient, and I'm looking for advice on how I could do this better.
-- Update existing contacts with the group they should be assigned to,
-- unless that contact/group mapping already exists.
UPDATE t1
SET t1.GroupToBeAssignedTo = t2.GroupToBeAssignedTo
FROM Contacts t1
INNER JOIN ContactImportStaging t2
    ON t1.UserID = t2.UserID
    AND t1.EmailAddress = t2.EmailAddress
    AND t2.GUID = @GUID
WHERE NOT EXISTS
(
    SELECT 1    -- EXISTS only checks for a row, so no columns are needed
    FROM ContactGroupMapping
    WHERE GroupID = t2.GroupToBeAssignedTo
      AND ContactID = t1.ID
);
Might it be better to just import all the rows without checking for duplicates first and then 'clean' the data afterwards? I'm looking for suggestions on where I'm going wrong. Thanks.
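For example, would a single MERGE be a cleaner way to fold the insert and the update into one statement? A sketch, using the same tables and keys as above, and ignoring the ContactGroupMapping check for brevity:

-- Sketch: one statement that inserts new contacts and updates existing ones.
MERGE Contacts AS t1
USING
(
    SELECT UserID, EmailAddress, GroupToBeAssignedTo
    FROM ContactImportStaging
    WHERE GUID = @GUID
) AS t2
ON t1.UserID = t2.UserID AND t1.EmailAddress = t2.EmailAddress
WHEN MATCHED THEN
    UPDATE SET GroupToBeAssignedTo = t2.GroupToBeAssignedTo
WHEN NOT MATCHED THEN
    INSERT (UserID, EmailAddress, GroupToBeAssignedTo)
    VALUES (t2.UserID, t2.EmailAddress, t2.GroupToBeAssignedTo);  -- MERGE must end with a semicolon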
EDIT: To clarify, the question is about MS SQL Server.