We need to update several tables that have parent/child relationships based on an Identity primary key in the parent table, which one or more child tables reference as a foreign key.
- Due to the high volume of data, we would like to build these tables in memory, then use SqlBulkCopy from C# to update the database en masse from either the DataSet or the individual DataTables (a simplified sketch follows this list).
- We would further like to do this in parallel, from multiple threads, processes, and possibly clients.
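For reference, here is a simplified C# sketch of the current approach; the table names, column names, and connection string are placeholders rather than our real schema:

```csharp
// Simplified sketch of the in-memory build and bulk load; "Parent", "Child",
// "ParentId", etc. are placeholder names, not our real schema.
using System.Data;
using System.Data.SqlClient;

static class BulkLoadSketch
{
    public static void LoadParents(string connectionString)
    {
        // Parent table keyed by an Identity column; the in-memory key values are
        // only placeholders until the database assigns the real ones.
        var parent = new DataTable("Parent");
        var parentId = parent.Columns.Add("ParentId", typeof(int));
        parentId.AutoIncrement = true;
        parent.Columns.Add("Name", typeof(string));
        parent.PrimaryKey = new[] { parentId };

        // Child table referencing Parent.ParentId as a foreign key.
        var child = new DataTable("Child");
        child.Columns.Add("ParentId", typeof(int));
        child.Columns.Add("Value", typeof(string));

        // ... populate both tables in memory, potentially from several threads ...

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.Parent";
                // Fast, but the server-generated Identity values are never
                // reported back into 'parent'.
                bulk.WriteToServer(parent);
            }
        }
    }
}
```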
Our prototype in F# shows a lot of promise, with a 34x performance increase, but this code forces known Identity values in the parent table. When they are not forced, the Identity column does get generated correctly in the database when SqlBulkCopy inserts the rows, but the Identity values do NOT get updated in the in-memory DataTable. Further, even if they were, it is not clear whether the DataSet would correctly fix up the parent/child relationships, so that the child tables could subsequently be written with the correct foreign-key values.
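For comparison, forcing the known Identity values (roughly what the F# prototype does, translated here to C#) looks something like the following; the helper name and table name are made up:

```csharp
using System.Data;
using System.Data.SqlClient;

static class ForcedIdentityLoad
{
    // Roughly what the forced-Identity path does: SqlBulkCopyOptions.KeepIdentity
    // makes the server store the ParentId values already present in the DataTable,
    // so nothing needs to be read back afterwards.
    public static void BulkCopyKeepingIdentity(SqlConnection conn, DataTable parent)
    {
        using (var bulk = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, null))
        {
            bulk.DestinationTableName = "dbo.Parent";
            bulk.WriteToServer(parent);
        }
        // Without KeepIdentity the Identity values are generated server-side, but
        // SqlBulkCopy offers nothing like SqlDataAdapter's refresh-after-insert,
        // so 'parent' keeps whatever key values it had in memory.
    }
}
```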
Can anyone explain how to have SqlBulkCopy update the in-memory Identity values, and how to configure a DataSet so that it retains and updates the parent/child relationships, if this is not done automatically when a DataAdapter is used to call FillSchema on the individual DataTables?
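For context, this is roughly how the DataSet is put together at the moment; the queries and names are illustrative, and the hand-added DataRelation may or may not be the right way to get the fix-up behaviour:

```csharp
using System.Data;
using System.Data.SqlClient;

static class SchemaSetup
{
    // Roughly how the DataSet is assembled today: FillSchema on the individual
    // DataTables, plus a hand-added DataRelation. Whether this relation is enough
    // to cascade new ParentId values into the child rows is exactly the question.
    public static DataSet BuildSchema(string connectionString)
    {
        var parent = new DataTable("Parent");
        var child = new DataTable("Child");

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            new SqlDataAdapter("SELECT * FROM dbo.Parent", conn).FillSchema(parent, SchemaType.Source);
            new SqlDataAdapter("SELECT * FROM dbo.Child", conn).FillSchema(child, SchemaType.Source);
        }

        var ds = new DataSet();
        ds.Tables.Add(parent);
        ds.Tables.Add(child);
        ds.Relations.Add("ParentChild",
            parent.Columns["ParentId"],
            child.Columns["ParentId"]);
        return ds;
    }
}
```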
Answers that I'm not looking for:
- Read the database to find the current highest Identity value, then manually increment it when creating each parent row. This does not work for multiple processes/clients, and, as I understand it, failed transactions may cause some Identity values to be skipped, so this method could screw up the relationships.
- Write the parent rows one at a time and ask for the Identity value back. This defeats at least some of the gains from using SqlBulkCopy (yes, there are far more child rows than parent rows, but there are still a lot of parent rows).
Similar to the following unanswered question: