views:

5254

answers:

6

I'm looking for an efficient way of searching through a dataset to see if an item exists. I have an arraylist of ~6000 items and I need to determine which of those doesn't exist in the dataset by comparing each item within the arraylist with data in a particular column of the dataset.

I attempted to loop through each item in the dataset for each in the arraylist but that took forever. I then attempted to use the RowFilter method below. None of which looks to be efficient. Any help is greatly appreciated, as you can tell I'm not much of a programmer...

example:

Dim alLDAPUsers As ArrayList
alLDAPUsers = clsLDAP.selectAllStudents

Dim curStu, maxStu As Integer
maxStu = alLDAPUsers.Count

For curStu = 0 To maxStu - 1
     Dim DomainUsername As String = ""
     DomainUsername = alLDAPUsers.Item(curStu).ToString

     Dim filteredView As DataView
     filteredView = dsAllStudents.Tables(0).DefaultView
     filteredView.RowFilter = ""
     filteredView.RowFilter = "szvausr_un = '" & DomainUsername & "'"

     Dim returnedrows As Integer = filteredView.Count
     If returnedrows = 0 Then
          '' Delete the user...
     End If
Next
+1  A: 

Try switching your array list to Generics. From what I understand they are much faster than an array list.

Here is a previous SO on Generics vs Array List

Brian Boatright
+3  A: 

You can get better performance by Sorting the list and ordering the dataset. Then you can walk them together, matching as you go. This is especially true since you are probably already ordering the dataset at least (or, you should be) in the sql query that creates it, making that step essentially free.

You should consider using a generic list rather than an ArrayList, and some other stylistic points on your existing code:

Dim LDAPUsers As List(Of String) = clsLDAP.selectAllStudents

For Each DomainUsername As String in LDAPUsers
     Dim filteredView As DataView = dsAllStudents.Tables(0).DefaultView
     filteredView.RowFilter = "szvausr_un = '" & DomainUsername & "'"

     If filteredView.Count = 0 Then
      '' Delete the user...
     End If
Next

This does the same thing as your original snippet, but in half the space so it's much cleaner and more readable.

Joel Coehoorn
This worked great. I'm curious to try out the suggestion of using generics and LDAPUsers.Except(AllStudents) and comparing the processing time. All in all, this was my first post and I couldn't be more impressed. Thanks!
cnynetadmin
Why would an order operation ever be free?
When the items are already in order.
Joel Coehoorn
A: 

Get the Dim statements out of the loop.... Your performance is suffering from repeated variable instantiation and reallocation.

Also remove any statements you dont need (rowfilter = "")

Dim alLDAPUsers As ArrayList
Dim DomainUsername As String
Dim curStu, maxStu As Integer
Dim filteredView As DataView
Dim returnedrows As Integer

alLDAPUsers = clsLDAP.selectAllStudents
maxStu = alLDAPUsers.Count

For curStu = 0 To maxStu - 1
     DomainUsername = alLDAPUsers.Item(curStu).ToString


     filteredView = dsAllStudents.Tables(0).DefaultView
     filteredView.RowFilter = "szvausr_un = '" & DomainUsername & "'"

     returnedrows  = filteredView.Count
     If returnedrows = 0 Then
          '' Delete the user...
     End If
Next
StingyJack
I'm sure it's something I missed in the documentation, but without calling the rowfilter = "" before the statement, my dataset that I'm searching only contains the value from the previous filter.
cnynetadmin
+2  A: 

If you use generics as suggested, you can have two Lists of string and do the following:

for each s as string in LDAPUsers.Except(AllStudents)
    ''Delete the user (s)
next

Where LDAPUsers and AllStudents are both List(Of String)

Edit:

You can also change the except to:

LDAPUsers.Except(AllStudents, StringComparer.InvariantCultureIgnoreCase)

to ignore case etc.

Edit 2:

To get the generic lists could be as simple as:

Dim LDAPUsers as new List(Of String)(alLDAPUsers.Cast(Of String))
Dim AllStudents as new List(OfString)()

for each dr as DataRow in dsAllStudents.Tables(0).Rows
    AllStudents.Add(dr("szvausr_un"))
next

Or you can go with the Linq-y goodness as Joel mentions, but my exposure to that is limited, unfortunately...

Carl
You might be able to do some linq-y stuff with the dataset, too.
Joel Coehoorn
Yeah, good point.
Carl
A: 

Like others have said, generics or linq would be better options. However, I wanted to point out that you don't need to use a DataView. The datatable has a Select method...

dsAllStudents.Tables(0).Select("szvausr_un = '" & DomainUserName & "'")

It returns an array of DataRows. I'm sure it will perform just as poorly as the view but I think it's a little cleaner.

whatknott
A: 

refer a detailed dataset help

http://vb.net-informations.com/dataset/vb.net-ado.net-dataset-tutorial.htm

bolton.

bolton