I have a large list of email addresses, and I need to determine which of them are not already in my database. Can I construct a query that will tell me this information?
I am using SQL Server 2000, if it requires non-standard extensions
I have a large list of email addresses, and I need to determine which of them are not already in my database. Can I construct a query that will tell me this information?
I am using SQL Server 2000, if it requires non-standard extensions
For a huge list, I would recommend loading that list into a second table (e.g., TEMP_EMAIL_ADDRESS
), then use:
SELECT
EMAIL
FROM
TEMP_EMAIL_ADDRESS
WHERE
EMAIL NOT IN (SELECT EMAIL FROM EMAIL_ADDRESS)
Data Transformation
If your data is in a text file named emails.txt (one row per line), you can create the insert statements using the following DOS command:
FOR /F %i IN (emails.txt) DO echo INSERT INTO TEMP_EMAIL_ADDRESS (EMAIL) VALUES ('%i') >> insert-email.sql
That command will create a new file called insert-email.sql in the current directory, containing all the inserts you need to inject the existing list of e-mail addresses into the database.
Why don't you import the email addresses as a table and just check that way against whatever table you already have?
Other than that, you could pass in the list of email addresses as an XML datatype or a table datatype and query against that.
make a temporary table, load the e-mailaddresses into the temporary table and then do a NOT IN query such as
SELECT emailaddress FROM temp_table WHERE emailaddress NOT IN (SELECT emailaddress FROM table)
you could extend that with an INSERT INTO
Create a table "newaddresses" that has all of your query email addresses.
Do something like this:
(off the top of my head, syntax may not be 100% correct; hopefully gives you the right idea)
select n.email
from newaddresses n
left join alreadysubscribed a
on n.email = a.email
where a.email is null
This joins the two tables on the email addresses, and then gives you a list of only the entries where the join failed.
Maybe you could create an index on emails:
CREATE INDEX email_index ON my_table(email)
and then simply check with
SELECT email FROM my_table WHERE my_table.email = some_email
Hmm... Nobody mentioned IF EXISTS. That would have been my first thought:
IF NOT EXISTS (SELECT * FROM MyTable where MyTable.email = "[email protected]
Although, perhaps the performance is no different from NOT IN.
Hope this helps.