views:

136

answers:

2

This is related to

http://stackoverflow.com/questions/3419573/excel-vba-remove-duplicate-rows-by-cross-referencing-2-different-sheets-then-de

I can't seem to get any VBA to work well or fast enough for a couple 100 rows.

Does Excel have a formula to remove duplicates from one sheet, by cross referencing another sheet?

Thanks for all your help.

A: 

Here is a much faster VBA solution, utilizing a dictionary object. As you can see, it loops only once through sheet A and sheet B, while your original solution has a running time proportional to "number of rows in sheet A" * "number of rows in sheet B".

Option Explicit
Sub CleanDupes()
    Dim wsA As Worksheet
    Dim wsB As Worksheet
    Dim keyColA As String

    Dim keyColB As String
    Dim rngA As Range
    Dim rngB As Range
    Dim intRowCounterA As Integer
    Dim intRowCounterB As Integer
    Dim strValueA As String


    keyColA = "A"
    keyColB = "B"

    intRowCounterA = 1
    intRowCounterB = 1

    Set wsA = Worksheets("Sheet A")
    Set wsB = Worksheets("Sheet B")

    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    Do While Not IsEmpty(wsA.Range(keyColA & intRowCounterA).Value)
        Set rngA = wsA.Range(keyColA & intRowCounterA)
        strValueA = rngA.Value
        If Not dict.Exists(strValueA) Then
            dict.Add strValueA, 1
        End If
        intRowCounterA = intRowCounterA + 1
    Loop

    intRowCounterB = 1
    Do While Not IsEmpty(wsB.Range(keyColB & intRowCounterB).Value)
        Set rngB = wsB.Range(keyColB & intRowCounterB)
        If dict.Exists(rngB.Value) Then
             wsB.Rows(intRowCounterB).Delete
             intRowCounterB = intRowCounterB - 1
        End If
        intRowCounterB = intRowCounterB + 1
    Loop
End Sub
Doc Brown
Thanks Doc! Wow that was faster.
Ehsan
A: 

You can do a lot with ADO and Excel.

Dim cn As Object
Dim rs As Object
Dim wb As Workbook
Dim sSQL As String
Dim sFile As String
Dim sCon As String
Dim sXLFileToProcess As String
Dim i

sXLFileToProcess = "Book1z.xls"

sFile = Workbooks(sXLFileToProcess).FullName

''Note that if HDR=No, F1,F2 etc are used for column names,
''if HDR=Yes, the names in the first row of the range
''can be used.
''This is the Jet 4 connection string, you can get more
''here : http://www.connectionstrings.com/excel

sCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sFile _
    & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

''Late binding, so no reference is needed

Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")

cn.Open sCon

'' In this example, the column header for column F is F, see notes
'' above on field (column) names. It also assumes that the sheets to
'' be merged have the same column headers in the same order
'' It would be safer to list the column heards rather than use *.

sSQL = sSQL & "SELECT b.Key,b.b,b.c,b.d,b.e FROM [SheetB$] As B " _
            & "LEFT JOIN [SheetA$] As A " _
            & "ON B.Key=A.Key " _
            & "WHERE A.Key Is Null"

rs.Open sSQL, cn, 3, 3

Set wb = Workbooks.Add

With wb.Worksheets("Sheet1")
    For i = 1 To rs.Fields.Count
        .Cells(1, i) = rs.Fields(i - 1).Name
    Next

    .Cells(2, 1).CopyFromRecordset rs
End With

''Tidy up
rs.Close
Set rs = Nothing
cn.Close
Set cn = Nothing
Remou
This sounds amazing Remou but I'm not sure what you're doing. Are you doing a sql query to get the data or using SQL to access data from 2 sheets (for example doing a LINQ query on a list in C#). Please explain the purpose of the sql part, I'm getting excited!
Ehsan
You can treat a sheet or a named range as a table for SQL purposes in ADO. In other words, any query that you can perform with Jet can be run against an Excel sheet. It is quite interesting. If I can help, please ask further.
Remou
The SQL in the example joins the two sheets on a key field, selecting only records that appear on SheetB and not on SheetA (WHERE A.Key Is Null) and writes these records to a new workbook with CopyFromRecordset. This does not write headers, so there is a little loop to do that.
Remou
Awesome. So how do I use Jet, is it something I have to download or what you provided was sufficient using a connection string. Also, would this be portable or would someone have to have access to the internet and maybe download the Jet Plugin. Thanks.
Ehsan
Jet is installed with Windows and has been since at least XP, so no downloads, no problems, with any luck :)
Remou
Here are you connection strings, in case you need more than suggested above: http://www.connectionstrings.com/excel
Remou