tags:

views:

54

answers:

2

Hello,

i have a Dimension Table like this :

my_Table

pk1  Primary key
pk2  Primary key
pk3  Primary key
col1  
col2  
col3  
...

and using a procedure to fill this table with the MERGE INTO statement :

MERGE INTO  my_Table dest
USING 
  ( SELECT <columns>
      FROM <tables>
     WHERE <conditions> ) src
ON 
  (dest.pk1 = src.pk1 AND dest.pk2 = src.pk2 AND pk3 = src.pk3)
WHEN     MATCHED THEN UPDATE SET dest.col1 = src.col1 ,
                                 dest.col2 = src.col2 ,
                                 dest.col3 = src.col3
WHEN NOT MATCHED THEN INSERT (pk1, pk2, pk3, col1, col2, col3) 
                      VALUES (src.pk1, src.pk2, src.pk3, src.col1, src.col2, src.col3);

my problem is that the query 'src' returns some rows with same Primary keys, but they are no duplicated rows, example

 | pk1 | pk2 | pk3 |  col1   | col2       | col3  |
 --------------------------------------------------
 | 100 | abc | x99 | 6000,00 | 01/01/2010 | 50,00 | 
 | 100 | abc | x99 | 0,00    | 01/01/2010 | 30,00 |
 | 110 | rty | b50 | 345,00  | 08/10/2009 | 10,00 |
 | 120 | xyz | y91 | 1200,00 | 13/02/2009 | 12,50 |
 | 120 | xyz | y91 | 1200,00 | 13/02/2009 |  0,00 |
 | 120 | xyz | y91 | 1200,00 | 13/02/2009 |  0,00 |
 | 120 | xyz | y91 | 1200,00 | 13/02/2009 |  0,00 |

So, when i call my procedure it returns the unique constraint violated error.

what could be the solution of this problem? im not the writer of the query src by the way ...

if i need to modify src i should have for the pk (100,abc,x99) :

| pk1 | pk2 | pk3 |  col1   | col2       | col3  |   
 --------------------------------------------------    
| 100 | abc | x99 | 6000,00 | 01/01/2010 | 80,00 | 

and for the pk (120,xyz,y91) i would have :

| pk1 | pk2 | pk3 |  col1   | col2       | col3  |  
 --------------------------------------------------    
| 120 | xyz | y91 | 1200,00 | 13/02/2009 | 12,50 |

Thanks in advance for any suggestion.

+1  A: 

It's obvious that something has to be done with src to make it return unique primary keys. If you don't care too much about the quality of the data in your table, you could use aggregation and change it e.g. to

 select pk1, pk2, pk3, max(col1), max(col2), sum(col3)
   from ...
   where ...
   group by pk1, pk2, pk3

But chances are you need more sophisticated processing to make any sense of that stuff.

EDIT: changed first sum to max, to reflect the edit in the question

ammoQ
that would be a solution, but what if i should not modify **src** ?finally i think thats what im going to do
mcha
technically, you could leave src untouched, but wrap it into another select statment; but before anything else, you should answer this question: What is the correct way to handle those duplicate lines. Choose the later one (according to col2)? Make a summary as I proposed? Ignore both?
ammoQ
after discussing about the problem, im supposed to ignore both, but i still cant figure out how?
mcha
just add "having count(*)=1" to the statement in my answer, it will give you only those rows where the keys are unique (in that case, it obviously doesn't matter whether you use min(), max() or sum() - just be sure to document that this is relatively meaningless)
ammoQ
Thank you very much, now i will just insert those duplicated rows to a new temp table :) thanks again
mcha
A: 

If you are really sure that there's no primary key conflict. Please try to check whether col1, col2 and col3 doesn't have any unique constraint. Maybe the unique constraint violation doesn't come from primary key field.

Mind to share with us your "error message"?

jancrot