I use a SQL statement to remove records that also exist in another database, but it takes a very long time.

Is there a faster alternative to the code below? The database is Access.

The database I want to remove the email addresses from is email_DB.mdb (table Newsletter_Subscribers); the other database is customers.mdb (table Customers).

SQLRemoveDupes = "DELETE FROM Newsletter_Subscribers WHERE EXISTS (select * from [" & strDBPath & "Customers].Customers " _
      & "where Subscriber_Email = Email or Subscriber_Email = EmailO)"

NewsletterConn = "Driver={Microsoft Access Driver (*.mdb)};DBQ=" & strDBPath & "email_DB.mdb"

Set MM_editCmd = Server.CreateObject("ADODB.Command")
MM_editCmd.ActiveConnection = NewsletterConn
MM_editCmd.CommandText = SQLRemoveDupes
MM_editCmd.Execute
MM_editCmd.ActiveConnection.Close
Set MM_editCmd = Nothing

EDIT: Tried the SQL below from one of the answers but I keep getting an error when running it:

SQL: DELETE FROM Newsletter_Subscribers WHERE CustID IN (select CustID from [" & strDBPath & "Customers].Customers where Subscriber_Email = Email or Subscriber_Email = EmailO)

I get a "Too few parameters. Expected 1." error message on the Execute line.

A: 

Assuming there's an ID-column present in the Customers table, the following change in SQL should give better performance:

"DELETE FROM Newsletter_Subscribers WHERE ID IN (select ID from [" & strDBPath & "Customers].Customers where Subscriber_Email = Email or Subscriber_Email = EmailO)"

PS. The ideal solution (judging from the column names) would be to redesign the tables and code logic of inserting emails in the first place. DS

Yes, the Customers table (customers.mdb) has a CustID and the Newsletter_Subscribers table (email_DB.mdb) has a Subscriber_ID. So I should replace the ID with CustID in your SQL?
smartins
A: 

Try adding an Access QueryDef (a saved query) and calling that.

le dorfier
A: 

It sounds like you do not have an index on the Subscriber_Email field. This forces a table scan (or several). Add an index on this field and you should see a significant improvement.
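A minimal sketch of adding such an index from the ASP page, assuming the table and field names from the question. The index name `idxSubscriberEmail` is made up for illustration, and the commented lines assume the question's `NewsletterConn` connection string.

```vbscript
' Sketch only: idxSubscriberEmail is an assumed index name; the table and
' field names are taken from the question.
Dim createIndexSql
createIndexSql = "CREATE INDEX idxSubscriberEmail " & _
                 "ON Newsletter_Subscribers (Subscriber_Email)"

' Run once against email_DB.mdb, for example:
'   Set conn = Server.CreateObject("ADODB.Connection")
'   conn.Open NewsletterConn
'   conn.Execute createIndexSql
'   conn.Close
```

The index can also be created in the Access table designer, which avoids running DDL from the page at all.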

I would have coded the query as:

DELETE FROM Newsletter_Subscribers where (Subscriber_Email = Email or Subscriber_Email = EmailO)
Matthew
Subscriber_Email is indeed indexed (Yes (No Duplicates)). Your query doesn't seem to take into consideration that the tables are in two separate database files (mdb).
smartins
Since you mentioned the index, I did some more experimenting and removed "OR Subscriber_Email = EmailO" from the SQL. It completed in less than a second. Any idea why this might be? EmailO is also indexed (Duplicates OK).
smartins
You could link the table from the customers mdb into the email mdb. Then the table looks, and acts, like it is part of the email database.
Matthew
A: 

I would use WHERE Subscriber_Email IN (Email, EmailO) as the WHERE clause:

SQLRemoveDupes = "DELETE FROM Newsletter_Subscribers WHERE EXISTS " & _
    "(select * from [" & strDBPath & "Customers].Customers where Subscriber_Email IN (Email, EmailO))"

I have found from experience that using an OR predicate in a WHERE clause can be detrimental to performance, because the SQL engine has to evaluate each condition separately and might decide to ignore the indexes and use a table scan. Sometimes it can be better to split the query into two separate statements. (I have to admit I am thinking in terms of SQL Server here, but the same may apply to Access.)

"DELETE FROM Newsletter_Subscribers WHERE EXISTS " & _
    "(select * from [" & strDBPath & "Customers].Customers where Subscriber_Email = Email)"

"DELETE FROM Newsletter_Subscribers WHERE EXISTS " & _
    "(select * from [" & strDBPath & "Customers].Customers where Subscriber_Email = EmailO)"
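A minimal sketch of running the split as two statements, assuming the same table and column names as in the question. `BuildDeleteSql` and `examplePath` are hypothetical names introduced here for illustration; the helper only builds the SQL string, so each statement compares against a single column and needs no OR.

```vbscript
' Hypothetical helper (not from the original post): builds one DELETE per
' Customers column so that no statement needs an OR predicate.
Function BuildDeleteSql(dbPath, customerColumn)
    BuildDeleteSql = "DELETE FROM Newsletter_Subscribers WHERE EXISTS " & _
        "(SELECT * FROM [" & dbPath & "Customers].Customers " & _
        "WHERE Subscriber_Email = " & customerColumn & ")"
End Function

Dim examplePath, sqlStatements
examplePath = "C:\data\"   ' assumed path, for illustration only
sqlStatements = Array(BuildDeleteSql(examplePath, "Email"), _
                      BuildDeleteSql(examplePath, "EmailO"))

' In the ASP page, each statement could then be run on the question's
' command object, for example:
'   For Each sql In sqlStatements
'       MM_editCmd.CommandText = sql
'       MM_editCmd.Execute
'   Next
```

Whether two single-column statements actually beat one statement with OR depends on how Access plans the query, so it is worth timing both.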
Tim C
A: 

I would try splitting this into two separate statements with separate database connections.

First, fetch the list of email addresses or IDs from the first database (as a string).

Second, construct a DELETE ... WHERE ... IN (...) statement from that list and run it on the second database.

I would imagine this would be much faster, as it does not have to interoperate between the two databases. The only possible issue would be if there are thousands of records in the first database and you hit the maximum length of a SQL query string (whatever that is).

Here are some useful functions for this:

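function GetDelimitedRecordString(sql, recordDelimiter)
 ' runs sql on db (assumed to be an already-open ADODB.Connection) and returns
 ' the rows as a single string, separated by recordDelimiter
 dim rs, str
 set rs = db.execute(sql)
 if rs.eof then
  str = ""
 else
  str = rs.GetString(,,,recordDelimiter)
  ' GetString appends the delimiter after the last row too; strip it
  str = left(str, len(str)-len(recordDelimiter))
 end if
 rs.close
 set rs = nothing
 GetDelimitedRecordString = str
end function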

function FmtSqlList(commaDelimitedStringOrArray)
 ' converts a string of the format "red, yellow, blue" to "'red', 'yellow', 'blue'"
 ' useful for taking input from an html form post (eg a multi-select box or checkbox group) and using it in a SQL WHERE IN clause
 ' prevents sql injection
 dim result:result = ""
 dim arr, str
 if isArray(commaDelimitedStringOrArray) then
  arr = commaDelimitedStringOrArray
 else
  arr = split(commaDelimitedStringOrArray, ",")
 end if
 for each str in arr
  if result<>"" then result = result & ", "
  result = result & "'" & trim(replace(str&"","'","''")) & "'"
 next
 FmtSqlList = result
end function
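One way to sidestep the query-length concern is to issue several smaller DELETE statements instead of one long one. The sketch below is an assumption layered on top of this answer, not part of it: `BuildChunkedDeletes` and `chunkSize` are hypothetical names, and the quoting is inlined (mirroring what FmtSqlList does) so that the block stands on its own.

```vbscript
' Hypothetical helper (not from the original answer): splits an array of
' email addresses into several DELETE statements of at most chunkSize
' addresses each, so no single SQL string grows without bound.
Function BuildChunkedDeletes(emails, chunkSize)
    Dim statements(), count, i, list
    If UBound(emails) < 0 Then
        BuildChunkedDeletes = Array()   ' nothing to delete
        Exit Function
    End If
    count = 0
    list = ""
    For i = 0 To UBound(emails)
        If list <> "" Then list = list & ", "
        ' Double any single quote, as FmtSqlList does
        list = list & "'" & Replace(emails(i) & "", "'", "''") & "'"
        ' Close the chunk when it is full or the input is exhausted
        If (i + 1) Mod chunkSize = 0 Or i = UBound(emails) Then
            ReDim Preserve statements(count)
            statements(count) = "DELETE FROM Newsletter_Subscribers " & _
                "WHERE Subscriber_Email IN (" & list & ")"
            count = count + 1
            list = ""
        End If
    Next
    BuildChunkedDeletes = statements
End Function

' Example with made-up addresses and a chunk size of 2
Dim sampleEmails, deletes
sampleEmails = Array("a@example.com", "o'brien@example.com", "c@example.com")
deletes = BuildChunkedDeletes(sampleEmails, 2)
' deletes(0) covers the first two addresses, deletes(1) the remaining one
```

In the page itself, the input array could come from something like `Split(GetDelimitedRecordString("select Email from Customers", ","), ",")` run against customers.mdb, with each resulting statement executed against email_DB.mdb. A suitable chunk size is a guess and would need tuning.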
mike nelson