ansaurus

Question

Update query with 'not exists' check causes primary key violation

Answer 1

A:

Without seeing your data, I believe your analysis is correct - the entire set is updated and then the commit fails since it results in a constraint violation. An EXISTS is never re-evaluated after "partial commit" of some of the UPDATE.

I think you need to more precisely define your rules regarding attempting to change multiple products to the same product according to the merged_product_id and then make those explicit in your query. For instance, you could exclude any products which would fall into that category with a further NOT EXISTS with appropriate query.

Cade Roux 2010-01-13 16:50:10

Answer 2

+1 A:

You can use MERGE if you are on SQL 2008 at least.

Otherwise you're going to have to choose a criteria to establish which merged_product_id you want in and which one you leave out:

update cp
set cp.product_id = p.merged_product_id
from Company_Product cp
cross apply (
  select top(1) merged_product_id
  from Product 
  where product_id = cp.product_id
  and p.merged_product_id is not null
  and not exists (
    select * from Company_Product cp2 
    where cp2.company_id = cp.company_id and 
    cp2.product_id = merged_product_id)
  order by <insert diferentiating criteria here>) as p

Note that this is not safe if multiple concurrent requests are running the merge logic.

Remus Rusanu 2010-01-13 16:51:06

Sorry, can't use SQL 2008 as it has to run on SQL 2005 as well.

tomlog 2010-01-13 17:08:20

Answer 3

+2 A:

A primary key is supposed to be three things:

Non-null
Unique
Unchanging

By altering part of the primary key you're violating requirement #3.

I think you'd be better off creating a new table, populating it, then drop the constraints, drop the original table, and rename the new table to the desired name (then of course, re-apply the original constraints). In my experience this gives you the chance to check out the 'new' data before making it 'live'.

Share and enjoy.

Bob Jarvis 2010-01-13 16:53:46

You were the closest in suggesting the solution, so you get the answer. I used a temporary table instead, as described above.

tomlog 2010-01-14 12:06:27

Answer 4

+1 A:

I can't quite see how your structure is meant to work or what this update is trying to achieve. You seem to be updating Company_Product and setting a (new) product_id on an existing row that apparently has a different product_id; e.g., changing the row from one product to another. This seems...an odd use case, I'd expect you to be inserting a new unique row. So I think I'm missing something.

If you're converting Company_Product to using a new set of product IDs instead of an old set (the name "merged_product_id" makes me speculate this), are you sure that there is no overlap between the old and new? That would cause a problem like what you're describing.

T.J. Crowder 2010-01-13 16:54:02

Yes, you're correct that I'm converting the old product IDs in Company_Product to the new set, specified in merged_product_id.And yes, there could be overlap, because there might already be an entry for the product_id and company_id. The question is how to prevent the query from trying to insert duplicates (which cause the PK violation).

tomlog 2010-01-13 17:04:36

@tomlog: I don't think you'll be able to do an update in place, then, because (for example) when replacing the old product ID 23 with the new ID 35, it may conflict with an old product ID 35 already in the table. You'll have to create a new table, populating it via a join of the old one, and then swap it in. I also note that the sample data you added has a NULL for merged_product_id, which means that you'll be leaving the old ID 27 lying around in the table, which would be a Bad Thing(tm) if there is also a new ID 27.

T.J. Crowder 2010-01-13 17:19:36

@T.J.: not all products are merged, so it's fine for ID 27 to stick around. Plus, all product IDs are unique, so there won't ever be a new product with ID 27.

tomlog 2010-01-13 17:27:15

Answer 5

A:

I think you are correct on why the update is failing. To fix this, run a delete query on your company_product table to remove the extra product_ids where the same merged_prduct_id will be applied.

here is a stab at what the query might be

delete company_product
  where product_id not in (
    select min(product_id)
      from product
      group by merged_product_id
  )
  and product_id not in (
    select product_id
      from product
      where merged_product_id is null
  )

-- Explanation added in resonse to comment --

What this tries to do is to delete rows that will be duplicates after the update. Since you have products with multiple merged ids, you really only need one of those products (for each company) in the table when you are done. So, my query (if it works...) will keep the min original product id for each merged product id - then your update will work.

So, let's say you have 3 product ids which will map to 2 merged ids: 1 -> 10, 2 -> 20, 3 -> 20. And you have the following company_product data:

product_id  company_id
1           A
2           A
3           A

If you run your update against this, it will try to change both the second and third rows to product id 20, and it will fail. If you run the delete I suggest, it will remove the third row. After the delete and the update, the table will look like this:

product_id  company_id
10          A
20          A

Ray 2010-01-13 16:54:38

But surely by deleting rows I would lose my Company-Product link? How would I know which company a product was linked to when I'm trying to added the merged_product_id?

tomlog 2010-01-13 17:01:48

@tomlog - edited answer in response

Ray 2010-01-13 17:27:28

I can see now where you're going with this, but your query is not quite right as it would also delete rows where there wasn't a merged_product_id (=NULL, and it would only select the first product_id and delete the rest).

tomlog 2010-01-13 17:30:45

I didn't consider nulls - I fixed the delete query to deal with them. Unless I am looking at it wrong, the delete will remove only product ids that will result in duplicate records (key violations) in the update - you can't keep these anyway

Ray 2010-01-13 18:40:59

aha - I see what you mean - my delete query won't work. Sorry to waste your time. I think your best bet is to run a select joining the two tables to get the correct ids into a temp table, empty out the c_p table, and then insert into c_p from the temp table with a select distinct.

Ray 2010-01-13 19:24:12

Answer 6

A:

Try this:

create table #Company_Product
(product_id int, company_id int)
create table #Product (product_id int,merged_product_id int)
insert into #Company_Product
select           23, 2     
union all select 24, 2     
union all select 25, 2     
union all select 26, 3     
union all select 27, 4
insert into #product 
Select              23, 35     
union all select    24, 35     
union all select    25, 12     
union all select    26, 35     
union all select   27, NULL 

update cp 
set product_id = merged_product_id
from #company_product cp
join
  ( 
    select min(product_id) as product_id, merged_product_id  
      from #product where merged_product_id is not null
      group by merged_product_id 
  ) a on a.product_id = cp.product_id

delete cp 
--select *
from #company_product cp
join #product p on cp.product_id = p.product_id
where cp.product_id <> p.merged_product_id
and p.merged_product_id is not null

HLGEM 2010-01-13 18:59:39

This query will only work when the Company_Product table has no primary key.

tomlog 2010-01-14 09:40:41

ansaurus

tags:

views:

answers:

Update query with 'not exists' check causes primary key violation

related questions