I have a very interesting LINQ question. I have an XML document whose rows I am trying to filter, and the filter matches on a regex result taken from one element of each row.

The following LINQ to XML query works and returns the individual data that I'm looking for.

Dim oDocument As XDocument
oDocument = XDocument.Load("test.xml")
Dim results = (From x In oDocument.Descendants.Elements("ROW") _
   Select New With {.ApplicationName = GetApplicationName(x.Element("Message")), _
    .EventId = x.Element("EventId")}).Distinct

However, the .Distinct call doesn't do what I want: the result still contains every combination of "ApplicationName" and "EventId", duplicates included.

What I need in the end is a distinct list of results, in a new object with the application name, and event id from the XML.

"GetApplicationName" is a function that parses the element's value looking for a regex match.

Any pointers?

Sample XML

<ROOT>
  <ROW>
    <EventId>1</EventId>
    <CreatedTimestamp>2009-10-28</CreatedTimestamp>
    <Message>There is a bunch
    of 
  garbled
information here and I'm trying to parse out a value 
Virtual Path: /MyPath then it continues on with more junk after the
message, including extra stuff
    </Message>
    <!--Other elements removed for brevity -->
  </ROW>
  <ROW>
    <EventId>1</EventId>
    <CreatedTimestamp>2009-10-28</CreatedTimestamp>
    <Message>
      There is a bunch
      of
      garbled
      information here and I'm trying to parse out a value
      Virtual Path: /MyPath then it continues on with more junk after the
      message, including extra stuff
    </Message>
    <!--Other elements removed for brevity -->
  </ROW>
</ROOT>

From here I want the distinct combinations of /MyPath and EventId (in this case, a single entry with /MyPath and 1).

My GetApplicationName method returns /MyPath for this sample.

+1  A: 

Distinct doesn't know how to compare your items, so it returns all the items unfiltered. You should use the Distinct overload that accepts an IEqualityComparer. This would allow you to compare the ApplicationName and EventId properties to determine equality. However, doing so would mean having a real class, not an anonymous type. The documentation demonstrates how to achieve this in an easy-to-understand manner.
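
As a side note, here is a minimal sketch of an alternative that keeps the anonymous type. VB compares anonymous-type instances by value only on properties declared with the Key modifier, so marking both properties as Key should let the parameterless Distinct collapse duplicates. The module name, the inline XML, and this version of GetApplicationName are assumptions made for illustration, not the asker's actual code.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions
Imports System.Xml.Linq

Module KeyAnonymousTypeSketch
    ' Hypothetical stand-in for the asker's helper: returns the path that
    ' follows "Virtual Path: ", or "" when there is no match.
    Private Function GetApplicationName(ByVal message As XElement) As String
        Return Regex.Match(message.Value, "Virtual\sPath:\s(/\w+)").Groups(1).Value
    End Function

    Sub Main()
        ' Trimmed-down version of the sample XML from the question.
        Dim doc = XDocument.Parse( _
            "<ROOT>" & _
            "<ROW><EventId>1</EventId><Message>junk Virtual Path: /MyPath more junk</Message></ROW>" & _
            "<ROW><EventId>1</EventId><Message>other junk Virtual Path: /MyPath</Message></ROW>" & _
            "</ROOT>")

        ' Key properties take part in Equals() and GetHashCode(),
        ' so Distinct can treat the two rows as the same item.
        Dim results = (From x In doc.Descendants("ROW") _
                       Select New With {Key .ApplicationName = GetApplicationName(x.Element("Message")), _
                                        Key .EventId = CInt(x.Element("EventId").Value)}).Distinct()

        Console.WriteLine("Distinct Total: {0}", results.Count())
    End Sub
End Module
```

With the two sample rows above this should report a distinct total of 1. Note that Key properties are read-only, which is usually fine for query results.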

EDIT: I was able to use your sample with the IEqualityComparer and an EventInfo class. I added my own implementation of GetApplicationName to test.

    Dim results = (From x In doc.Descendants.Elements("ROW") _
       Select New EventInfo With {.ApplicationName = GetApplicationName(x.Element("Message")), _
        .EventId = CInt(x.Element("EventId").Value)})

    Console.WriteLine("Total: {0}", results.Count)
    Console.WriteLine("Distinct Total: {0}", results.Distinct.Count)
    Console.WriteLine("Distinct (w/comparer) Total: {0}", results.Distinct(New EventInfoComparer()).Count)

This outputs:

Total: 2
Distinct Total: 2
Distinct (w/comparer) Total: 1

The rest of the code:

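' Helper used by the query above. Requires Imports System.Text.RegularExpressions.
Private Function GetApplicationName(ByVal element As XElement) As String
    ' Capture the word that follows "Virtual Path: /"; returns "" when there is no match.
    Return Regex.Match(element.Value, "Virtual\sPath:\s/(\w+)").Groups(1).Value
End Function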

Public Class EventInfo
    Private _applicationName As String
    Public Property ApplicationName() As String
        Get
            Return _applicationName
        End Get
        Set(ByVal value As String)
            _applicationName = value
        End Set
    End Property

    Private _eventId As Integer
    Public Property EventId() As Integer
        Get
            Return _eventId
        End Get
        Set(ByVal value As Integer)
            _eventId = value
        End Set
    End Property
End Class

Public Class EventInfoComparer
    Implements IEqualityComparer(Of EventInfo)

    Public Function Equals1(ByVal x As EventInfo, ByVal y As EventInfo) As Boolean _
        Implements IEqualityComparer(Of EventInfo).Equals

        ' Check whether the compared objects reference the same data.
        If x Is y Then Return True

        ' Check whether any of the compared objects is null.
        If x Is Nothing OrElse y Is Nothing Then Return False

        ' Check whether the EventInfos' properties are equal.
        Return (x.ApplicationName = y.ApplicationName) AndAlso (x.EventId = y.EventId)
    End Function

    Public Function GetHashCode1(ByVal eventInfo As EventInfo) As Integer _
        Implements IEqualityComparer(Of EventInfo).GetHashCode

        ' Check whether the object is null.
        If eventInfo Is Nothing Then Return 0

        ' Get the hash code for the ApplicationName field if it is not null.
        Dim hashEventInfoAppName = _
            If(eventInfo.ApplicationName Is Nothing, 0, eventInfo.ApplicationName.GetHashCode())

        ' Get the hash code for the EventId field.
        Dim hashEventInfoId = eventInfo.EventId.GetHashCode()

        ' Calculate the hash code for the EventInfo.
        Return hashEventInfoAppName Xor hashEventInfoId
    End Function
End Class
Ahmad Mageed
@Ahmad - Awesome that worked!
Mitchel Sellers
For the record, you don't need to use an IEqualityComparer if you override Equals() and GetHashCode() in EventInfo (or make EventInfo implement IEquatable(Of T) as I suggested).
dahlbyk
Not sure why I got downvoted for this. @Mitchel: glad it helped! @dahlbyk: the Distinct method's overload made this come to mind first, that's all. I helped a coworker with this the other day since the class couldn't be modified directly.
Ahmad Mageed
+1  A: 

I had never noticed this before, but VB's anonymous types only include properties marked with the Key modifier in Equals() and GetHashCode(). Since none of the properties in your query are marked Key, Distinct() ends up checking reference equality. One workaround is to build your own class that implements IEquatable(Of T).
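
A rough sketch of the custom-class route, assuming an EventInfo class with the same two properties used elsewhere in this thread. The module name and the sample data are made up for illustration. Because the class supplies its own equality, the parameterless Distinct() works without a separate comparer.

```vbnet
Imports System
Imports System.Linq

Public Class EventInfo
    Implements IEquatable(Of EventInfo)

    Private _applicationName As String
    Public Property ApplicationName() As String
        Get
            Return _applicationName
        End Get
        Set(ByVal value As String)
            _applicationName = value
        End Set
    End Property

    Private _eventId As Integer
    Public Property EventId() As Integer
        Get
            Return _eventId
        End Get
        Set(ByVal value As Integer)
            _eventId = value
        End Set
    End Property

    ' Value equality on both properties.
    Public Overloads Function Equals(ByVal other As EventInfo) As Boolean _
        Implements IEquatable(Of EventInfo).Equals
        If other Is Nothing Then Return False
        Return String.Equals(ApplicationName, other.ApplicationName) _
            AndAlso EventId = other.EventId
    End Function

    ' Route the Object overload through the typed one.
    Public Overrides Function Equals(ByVal obj As Object) As Boolean
        Return Equals(TryCast(obj, EventInfo))
    End Function

    ' Equal objects must produce equal hash codes.
    Public Overrides Function GetHashCode() As Integer
        Dim nameHash = If(ApplicationName Is Nothing, 0, ApplicationName.GetHashCode())
        Return nameHash Xor EventId.GetHashCode()
    End Function
End Class

Module IEquatableSketch
    Sub Main()
        ' Two separate instances carrying the same values.
        Dim items = New EventInfo() { _
            New EventInfo With {.ApplicationName = "/MyPath", .EventId = 1}, _
            New EventInfo With {.ApplicationName = "/MyPath", .EventId = 1}}

        Console.WriteLine("Distinct Total: {0}", items.Distinct().Count())
    End Sub
End Module
```

For the two identical items above this should report a distinct total of 1.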

dahlbyk
+1  A: 

Ahmad's answer is correct and I upvoted it. However, I just wanted to point out the alternative VB.NET specific syntax for your LINQ to XML query.

Dim results = From x In doc...<ROW> _
   Select New EventInfo With {.ApplicationName = GetApplicationName(x.<Message>.First()), _
    .EventId = CInt(x.<EventId>.Value)}

This returns the same result, but if you import an XML namespace and include its schema in the project, then you'll get IntelliSense for the element names this way.

Here's an article describing how to get XML IntelliSense in VB.NET:

http://msdn.microsoft.com/en-us/library/bb531325.aspx

Dennis Palmer
Thanks for the info, I'm trying to figure out what I think of that syntax... (Still a bit rusty on the VB syntax in general; I'm typically more of a C# guy.)
Mitchel Sellers