IGNORE_DUP_KEY = ON
basically tells SQL Server to insert the non-duplicate rows and quietly skip any duplicates (you get a "Duplicate key was ignored." warning instead of an error); the default behavior is to raise an error and fail the whole insert when a duplicate shows up in a column that isn't supposed to allow them.
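For context, here's roughly what I mean (the table and index names are just made up for illustration):

```sql
-- With IGNORE_DUP_KEY = ON, the duplicate row is skipped with a warning
-- ("Duplicate key was ignored.") instead of failing the statement.
CREATE TABLE dbo.DemoCodes
(
    Code varchar(10) NOT NULL
);

CREATE UNIQUE INDEX UX_DemoCodes_Code
    ON dbo.DemoCodes (Code)
    WITH (IGNORE_DUP_KEY = ON);

INSERT INTO dbo.DemoCodes (Code)
VALUES ('A'), ('B'), ('A');   -- 2 rows inserted, the duplicate 'A' is ignored

-- With the default (IGNORE_DUP_KEY = OFF), the same INSERT raises error 2601
-- and none of the three rows make it in.
```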
I've worked with a ton of data that usually has at least one duplicate where there shouldn't be any, so I like to use UNIQUE
constraints when I know a value shouldn't have dups. However, when I bulk load data, the last thing I want is for it to get 90% done and then suddenly hit a duplicate and error the whole thing out (yes, I know the obvious solution is to make sure there are no duplicates, but sometimes I'm just handed a spreadsheet full of data and told to load it ASAP).
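This is the kind of load pattern I have in mind (a rough sketch; the table and column names are hypothetical, and I'm assuming the spreadsheet lands in a staging table via BULK INSERT, bcp, or an import wizard first):

```sql
-- Target table: the UNIQUE constraint is created with IGNORE_DUP_KEY = ON,
-- so stray duplicates in the feed get dropped instead of failing the load.
CREATE TABLE dbo.Customers
(
    Email varchar(320) NOT NULL,
    Name  varchar(100) NOT NULL,
    CONSTRAINT UQ_Customers_Email UNIQUE (Email) WITH (IGNORE_DUP_KEY = ON)
);

-- Staging table filled from the spreadsheet, duplicates and all.
CREATE TABLE dbo.Customers_Staging
(
    Email varchar(320) NOT NULL,
    Name  varchar(100) NOT NULL
);

INSERT INTO dbo.Customers (Email, Name)
SELECT Email, Name
FROM dbo.Customers_Staging;   -- duplicate emails are ignored with a warning;
                              -- all the other rows still go in
```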
So, what is the reason for having the default be OFF?
Why wouldn't you want it on all the time, so that every non-duplicate row succeeds and you don't have to worry about the duplicates? Chances are the duplicates are in there by mistake anyway.
Is it related to performance, or something else? This seems like a great idea, but there's got to be some reason why it's not the default behavior.
Mainly, is there a good reason not to use this that I should be aware of, or should it be evaluated on a case-by-case basis?