views:

205

answers:

2

Has anyone found a good way to either merge or remove duplicates that are in custom entities? In our case we have two custom entities, literature history and subscriptions which relate contacts back to a custom entity named literature.

I can run a duplicate detection job, but this returns thousands of records and deleting them one at a time is impractical at best. We would like to either be able to merge them or just delete the duplicates. However, much Google searching has not turned up any good suggestions other than "you can write something."

Okay, but where to even get started? Should I be bulk deleting from the duplicate detection job? Should I try just writing a quick and dirty c# program with the SDK? Is there a way to merge custom entities that just requires some magical workflow voodoo?

EDIT: FYI What I eventually did was setting the deletion state code using some fun SQL to quickly find duplicates:

UPDATE T1 SET DeletionStateCode = 2 
FROM New_subscriptionhistory T1 INNER JOIN New_subscriptionhistory T2 ON t1.New_LiteratureId = T2.New_LiteratureId AND t1.New_ContactId = t2.New_ContactId
AND t1.CreatedOn > t2.CreatedOn AND t1.statecode = 0 AND t2.statecode = 0
+1  A: 

I won't say with certainty that this is the only or the best way, but we've used SQL queries in the _MSCRM database, setting the DeletionStateCode of any duplicated entity to 2.

Don
I have thought about doing this. Have you ever encountered any odd side effects?
Jeff Davis
Just an FYI, this can leave orphaned records in the system since deletions can cascade to other entities. Thus, you'll end up with orphaned activities/notes/etc. It shouldn't harm your system but you may run across records that shouldn't exist.
Focus
No odd side effects, but as Focus points out, you have to find all of the cascades and handle those appropriately. The reason we had to do this was that some of the dupes had been worked on and the bulk deletion didn't always pick the one we wanted deleted. So we had specific logic implemented in SQL to determine which were the safe records to delete. It wasn't fun, that's for sure.
Don
Yes our situation sounds similar. I may use the status to deactivate first and make sure everything goes smoothly before actually deleting them.
Jeff Davis
If you turn off the deletion service, those 2s will stay in the database. But, yes, and excellent point: the status and state codes are the safer update to make.
Don
Crucially, I should note that we are (still) on 3.0, and the only other way is a loop in custom code using the SDK. Focus has the correct answer for 4.0+.
Don
Any idea how the little edit symbol actually became part of my first comment?
Jeff Davis
+2  A: 

You should look into creating a Bulk Delete Job using the SDK.

Here's a short tutorial.

Focus
This is probably more or less the "right" way to do things. The SDK query language is not as expressive as SQL, but you can undo bulk deletes.
Jeff Davis
I agree, Bulk Delete is the way to go - safer than SQL updates.I've found this tool works well 'Bulk Delete Launcher':http://mscrmtools.blogspot.com/2009/07/new-tool-bulk-delete-launcher.htmlEssentially you create and save an Advanced Find query to identify the records you want to delete, then run the bulk delete against the saved view with this tool. It essentially does what is described in the article linked to above.
ABC123
Didn't know about this tool. Thanks for the link!
Focus