views:

2824

answers:

6

I have a sheet full of some raw data, about 20,000 rows and 50 columns. (Number of rows expected to increase, probably double if not triple)

I need a formula to look at this data and determine if a row exists for data in two specified columns. My current formula is as follows.

Function CheckExists(Table As Range, SearchCol1 As Integer, SearchVal1 As Variant, SearchCol2 As Integer, SearchVal2 As Variant)

    Dim i As Long
    Dim exists As Boolean

    exists = False

    For i = 1 To Table.Rows.Count
        If Table.Cells(i, SearchCol1) = SearchVal1 Then
            If Table.Cells(i, SearchCol2) = SearchVal2 Then
                exists = True
                Exit For
            End If
        End If
    Next i

    CheckExists = exists

End Function

I run this formula from another sheet, with about 5000 rows.

My problem is, this kills my pc, it takes ages to calculate the cells. I'm hoping someone can offer some suggestions on how to make this faster or even better, a built in formula that can do what I'm after.

+2  A: 

If you use Cells.Find it'll save you a lot of iterating over cells that never contain the value you want.

e.g.

Cells.Find(What:="-1", After:=ActiveCell, LookIn:=xlValues, LookAt:= _
      xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False _
      , SearchFormat:=False).Activate

will activate the cell that it finds the value in. You can then check to see if the other cell you're comparing has the correct value (if the cell was located).

You may also have to check that the row/column you're in is the correct one for the value you're looking for, especially if you're searching for something like the number 20, it will locate the number 20 in the wrong row.

Have a play, I think you'll find it's quite a bit faster than normal iteration.

Additional, possibly unwarranted advice: with that much data, would a database not be useful?

Update:
I've had a quick play - this function can replace the existing one you're using above, let me know if it's any faster.

Function FindCheckExists(Table As Range, SearchCol1 As Integer, _
                         SearchVal1 As Variant, SearchCol2 As Integer, _
                         SearchVal2 As Variant)

    Dim i As Long
    Dim exists As Boolean
    Dim result As Range

    exists = False
    Do While (Not exists)

        Set result = Cells.Find(What:=SearchVal1, After:=Cells(1, SearchCol1), _
          LookIn:=xlValues, LookAt:=xlWhole, SearchOrder:=xlByRows, _
          SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False)

        If (result Is Nothing) Then Exit Function

        If result.Offset(0, (SearchCol2 - SearchCol1)).Value = SearchVal2 Then _
            exists = True

    Loop

    FindCheckExists = exists

End Function

The above has been reformatted with _'s to fit neater on the stackoverflow page, it'll probably look neater in Excel without them.

Andy
Thanks, I'll look at Find. We have a database, its pulled from a database, and I can do all of the required stuff in SQL. However, we need to ability to do it without having to resort to SQL. It's more of an empowering the client excercise.
Robin Day
Okay, that makes sense :) Have a look at my updated solution above and let me know if it's any better.
Andy
+3  A: 

You could add a column to the end of the raw data with a simple if formula e.g.

if(AND(A1=4, B1=6), 1, 0)

Where A1 / B1 are the columns to check and 4 and 6 would be replaced by the actual values. The check if a row exists is then just checking the sum of this column:

if(sum(C:C) > 0, TRUE, FALSE)

where C is the column with the formula. On large Excel docs I generally find a composite approach like this to be most efficient. If you post a sample of the data I may be able to refine further.

Macros
+2  A: 

I suggest adding a column to concatenate the values in the two columns of interest then using the MATCH worksheet function to search the new column.

Graham Miller
A: 

I've found it quickest to use autofilter. Filter it based on your criteria, then delete the remaining visible rows.

e.g. Ws.Rows(1).Insert 'Filter does not check first row Ws.AutoFilterMode = False Ws.Cells.AutoFilter searchCol1, "=*" & searchVal1 & "*", , , False Ws.Cells.AutoFilter searchCol2, "=*" & searchVal2 & "*", , , False Ws.AutoFilter.Range.SpecialCells(xlCellTypeVisible).EntireRow.Delete ActiveSheet.AutoFilterMode = False

Hope that helps.

+1  A: 

It's not necessary to use VBA to accomplish this - you can do it with just a regular front-end formula. The formula below (the first just does rows 1-5000, but the second will do the entire column) will determine if the combination in question exists:

=SUMPRODUCT(--(A1:A5000="SearchValue1"),--(B1:B5000="SearchValue2"))

=SUMPRODUCT(--(A:A="SearchValue1"),--(B:B="SearchValue2"))

To use it, you'd just fix "SearchValue1" and "SearchValue2" to be the values you're looking for, and change the column letters if you're not using A and B.

rwmnau
+1  A: 

I'd say much like rwmnau, except I generally use:

=SUMPRODUCT((A1:A5000="SearchValue1")*(B1:B5000="SearchValue2"))  

EDIT If you use Excel 2007, you may use COUNTIFS function (note the S at the end).

iDevlop