ansaurus

Question

SQL query to pull unique phone numbers from 2 related tables

Answer 1

A:

Isn't it simply a query based on:

SELECT StorePhone AS Phone -- , ...other columns...
    FROM StoreInfo
UNION
SELECT AltPhone AS Phone   -- , ...other columns...
    FROM StoreInfo
UNION
SELECT Phone               -- , ...other columns...
    FROM Store_Contacts

If AltPhone can be null, add a WHERE clause to that SELECT to eliminate the nulls. I'm not clear what you mean by ANI or RBAR. You can add extra columns to the result sets as long as every SELECT in the UNION returns the same number of columns with compatible types. UNION eliminates duplicate rows automatically.
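As a rough sketch of the UNION idea, here is a small Python script that runs it against an in-memory SQLite database. SQLite only stands in for the real DBMS, and the column layout and sample phone numbers are made up for illustration. The table names follow the query above.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StoreInfo (StoreID INTEGER, StorePhone TEXT, AltPhone TEXT);
    CREATE TABLE Store_Contacts (StoreID INTEGER, Phone TEXT);
    -- Made-up sample data: one NULL AltPhone, one number present in both tables.
    INSERT INTO StoreInfo VALUES (1, '5550000001', '5550000002'),
                                 (2, '5550000003', NULL);
    INSERT INTO Store_Contacts VALUES (1, '5550000001'), (2, '5550000004');
""")

# UNION (not UNION ALL) removes duplicate rows; the WHERE clauses drop NULLs.
rows = conn.execute("""
    SELECT StorePhone AS Phone FROM StoreInfo WHERE StorePhone IS NOT NULL
    UNION
    SELECT AltPhone FROM StoreInfo WHERE AltPhone IS NOT NULL
    UNION
    SELECT Phone FROM Store_Contacts WHERE Phone IS NOT NULL
    ORDER BY Phone
""").fetchall()

phones = [r[0] for r in rows]
print(phones)
```

Note that the duplicate elimination applies to whole rows, so it only yields unique phone numbers while Phone is the only column selected.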

If a phone number already exists in the destination table, do not add it...

Ah, then you need the MERGE statement. You would MERGE into your target table using a minor variation of the query above as the source of data.

BNF for the statement from the SQL 2003 standard (section 14.9):

<merge statement> ::=
     MERGE INTO <target table> [ [ AS ] <merge correlation name> ]
     USING <table reference> ON <search condition>
     <merge operation specification>

<merge correlation name> ::= <correlation name>

<merge operation specification> ::= <merge when clause> ...

<merge when clause> ::=
    <merge when matched clause> |
    <merge when not matched clause>

<merge when matched clause> ::=
    WHEN MATCHED THEN <merge update specification>

<merge when not matched clause> ::=
    WHEN NOT MATCHED THEN <merge insert specification>

<merge update specification> ::= UPDATE SET <set clause list>

<merge insert specification>  ::=
     INSERT [ <left paren> <insert column list> <right paren> ]
     [ <override clause> ] VALUES <merge insert value list>

<merge insert value list> ::=
     <left paren> <merge insert value element>
     [ { <comma> <merge insert value element> }... ] <right paren>

<merge insert value element> ::=
     <value expression> |
     <contextually typed value specification>

You can also find descriptions of this statement in the relevant product manuals, which often offer more options. In your case, you'd probably omit the WHEN MATCHED clause, using only the WHEN NOT MATCHED clause.

Additional observation:

MS SQL Server 2005 does not support MERGE.

Not the only DBMS I know of with that limitation.

You are probably, then, faced with creating a temporary table and loading it with the data from the UNION-select statement.

You can then insert into your main data table from the temp table, based on the non-existence of the corresponding row in the main data table. At least, some DBMS allow you to do that. I'm not an MS SQL Server expert, so I don't know whether the fine print of the INSERT statement prevents you from selecting, in a sub-query, from the table you are inserting into. That can be a real nuisance if you are so limited.
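A minimal sketch of the temp-table approach, again using Python with an in-memory SQLite database as a stand-in. The table names PhoneNumbers and Staging are hypothetical, and whether your DBMS accepts the same statement is something to check in its manual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical destination table that already holds one number.
    CREATE TABLE PhoneNumbers (Phone TEXT PRIMARY KEY);
    INSERT INTO PhoneNumbers VALUES ('5550000001');

    -- Hypothetical temp table; in practice it would be loaded from the UNION query.
    CREATE TEMP TABLE Staging (Phone TEXT);
    INSERT INTO Staging VALUES ('5550000001'), ('5550000002'), ('5550000003');
""")

# Insert only the staged numbers that are not already in the destination.
conn.execute("""
    INSERT INTO PhoneNumbers (Phone)
    SELECT s.Phone
    FROM Staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM PhoneNumbers AS p WHERE p.Phone = s.Phone)
""")

result = [r[0] for r in conn.execute("SELECT Phone FROM PhoneNumbers ORDER BY Phone")]
print(result)
```

The existing number is skipped rather than raising a primary-key violation, which is the behaviour the WHEN NOT MATCHED clause of MERGE would have given.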

Another option is to unload the table to plain text, unload the UNION-select data to plain text, and then use file-system (command-line) tools to deal with it. How feasible that is depends on the volume of data, which I've forgotten. Perl can be useful here: read the main table into a hash, selectively update it from the UNION-select data, and finally rewrite the data to a load file. Then you 'just' start the transaction, drop all the old data, load all the new data, cross your fingers and commit. The downside is that changes made between unload and load are lost, so be careful if you decide to use this technique. You might want to do the unload inside the transaction, modify the data, and then delete and reload, all in the same single transaction. It will need to be a single button-push (return key) to do the whole job.

Jonathan Leffler 2009-04-03 22:42:43

Doesn't work. When I try to insert the result set into the destination table, it fails due to duplicate phone numbers. I will try to explain better what I am after in my responses to the comments above.

DaveN59 2009-04-06 16:13:04

Great! But T-SQL (MS SQL Server 2005) does not support MERGE INTO... I forgot to mention the platform. Other than that one small detail, it looks like it would do the trick. Support for the MERGE INTO syntax will be in the new SQL Server 2008, and we're not there yet. So close!

DaveN59 2009-04-06 17:54:31

Answer 2

A:

FYI,

ANI = http://en.wikipedia.org/wiki/Automatic_Number_Identification

RBAR = row by agonizing row

qux 2009-04-04 02:14:52

Thanks for that. I forget there are people who haven't heard every TLA and industry buzzword that I have...

DaveN59 2009-04-06 15:59:43

Answer 3

+1 A:

SELECT DISTINCT sc.StoreID, dbo.GetPhoneNumber10(sc.Phone)

DISTINCT is wrong here. It deduplicates on the whole (StoreID, PhoneNumber) row, so it will still allow 2 stores to share the same number. Use GROUP BY to ensure that the second column is unique.

INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
SELECT MIN(StoreID), PhoneNumber
FROM
(
  SELECT sc.StoreID as StoreID, dbo.GetPhoneNumber10(sc.Phone) as PhoneNumber
  FROM Store_Contacts sc
      INNER JOIN
      Store_Info si ON sc.StoreID = si.StoreID
  WHERE (dbo.IsAniNumber(dbo.GetPhoneNumber10(sc.Phone)) = 1)
      AND (si.ParentID = 0)
      AND NOT EXISTS (SELECT * FROM dbo.Store_PhoneNumbers WHERE PhoneNumber = dbo.GetPhoneNumber10(sc.Phone))
) sub
GROUP BY PhoneNumber

The reason you could get away with DISTINCT in the other queries was that you were working with a single StoreID in them. This query returns multiple StoreIDs.
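To see the difference on a tiny data set, here is a sketch in Python with an in-memory SQLite database standing in for SQL Server. The sample rows are invented, and the dbo.GetPhoneNumber10 and dbo.IsAniNumber functions are left out because their definitions are not shown in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Contacts (StoreID INTEGER, Phone TEXT);
    -- Made-up data: stores 7 and 3 share a number, and store 3 lists it twice.
    INSERT INTO Store_Contacts VALUES (7, '5550000001'), (3, '5550000001'),
                                      (3, '5550000001'), (5, '5550000002');
""")

# DISTINCT deduplicates on the whole (StoreID, Phone) row.
distinct_rows = conn.execute(
    "SELECT DISTINCT StoreID, Phone FROM Store_Contacts ORDER BY Phone, StoreID"
).fetchall()

# GROUP BY Phone yields exactly one row per phone; MIN picks one StoreID for it.
grouped_rows = conn.execute(
    "SELECT MIN(StoreID), Phone FROM Store_Contacts GROUP BY Phone ORDER BY Phone"
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

With DISTINCT the shared number still appears once per store, while the grouped query returns it once, attached to the lowest StoreID.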

David B 2009-04-06 18:37:20

That does it! Perfect. I had tried the MIN() function before, but had the syntax wrong.

DaveN59 2009-04-06 19:38:41

Yeah, SQL is really inflexible about the order of clauses, which often leads me to write subqueries so I can group on a selected column (for example).

David B 2009-04-06 19:43:05

Answer 4

A:

I see that there's an answer already selected, but I'd be remiss if I didn't point out a simpler, and more general solution.

Instead of making the priority implicit in your order of inserting, make it explicit.

Your question is basically, "I have several sources of a datum, and I know a priority for each one. For each key, I wish to select the single datum with the highest priority."

First select all possible datums (storeid) for your key (phone):

create table prioritized_phone( phone char(12), storeid int, priority int);

insert into prioritized_phone(phone, storeid, priority) 
select storephone, storeid, 1  from store_info
union
select altphone, storeid, 2 from store_info

I don't know how you select a branch store's phone, but there's some query that gets that, probably by using parentid in store_info, like this:

union
select b.storephone, a.storeid, 3
from store_info a join store_info b on (a.parentid = b.storeid)
union
select b.altphone, a.storeid, 4
from store_info a join store_info b on (a.parentid = b.storeid)

And then the contact phones:

union 
select distinct phone, storeid, 5 from store_contacts;

Once you've done that, for each phone, delete any but the lowest (best) priority:

delete a from prioritized_phone a where a.priority > 
(select min(b.priority) from prioritized_phone b where b.phone = a.phone);

Now for each phone we only have its minimum priority row(s). That still may not be unique by store, so we arbitrarily select the lowest storeid for the phone with:

delete a from prioritized_phone a where a.storeid > 
(select min(b.storeid) from prioritized_phone b where b.phone = a.phone);

We now have one storeid per phone, but we may still have dupes:

create table phone_lookup( phone char(12), storeid int);

insert into phone_lookup(phone, storeid)
select distinct phone, storeid 
from prioritized_phone;

Why is this solution easier? Because it makes the priority, which was implicit in your solution (implied by order of operations) into an explicit value we can select on.
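Here is a rough end-to-end sketch of the steps above, run through Python against an in-memory SQLite database instead of SQL Server. The sample data is invented, the branch-store step (priorities 3 and 4) is left out for brevity, and the DELETE statements name the table directly rather than using an alias so that they run on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up source data.
    CREATE TABLE store_info (storeid INTEGER, storephone TEXT, altphone TEXT);
    CREATE TABLE store_contacts (storeid INTEGER, phone TEXT);
    INSERT INTO store_info VALUES (1, '5550000001', '5550000002'),
                                  (2, '5550000002', NULL);
    INSERT INTO store_contacts VALUES (2, '5550000001'), (3, '5550000003'),
                                      (1, '5550000003');

    -- Step 1: collect every candidate (phone, storeid) with an explicit priority.
    CREATE TABLE prioritized_phone (phone TEXT, storeid INTEGER, priority INTEGER);
    INSERT INTO prioritized_phone (phone, storeid, priority)
    SELECT storephone, storeid, 1 FROM store_info WHERE storephone IS NOT NULL
    UNION
    SELECT altphone, storeid, 2 FROM store_info WHERE altphone IS NOT NULL
    UNION
    SELECT phone, storeid, 5 FROM store_contacts WHERE phone IS NOT NULL;

    -- Step 2: for each phone, keep only the rows with the lowest (best) priority.
    DELETE FROM prioritized_phone
    WHERE priority > (SELECT MIN(b.priority) FROM prioritized_phone AS b
                      WHERE b.phone = prioritized_phone.phone);

    -- Step 3: break remaining ties by keeping the lowest storeid per phone.
    DELETE FROM prioritized_phone
    WHERE storeid > (SELECT MIN(b.storeid) FROM prioritized_phone AS b
                     WHERE b.phone = prioritized_phone.phone);
""")

# Step 4: the surviving rows give one storeid per phone.
lookup = conn.execute(
    "SELECT DISTINCT phone, storeid FROM prioritized_phone ORDER BY phone"
).fetchall()
print(lookup)
```

In this sample, store 1 keeps its main number even though store 2 lists it as a contact phone, because priority 1 beats priority 5.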

tpdi 2009-04-06 21:09:31
