I'm using VB.NET, and I know that Union normally compares objects by reference, but in VB, Strings are generally treated as if they were primitive data types.

Consequently, here's the problem:

Sub Main()
    Dim firstFile, secondFile As String(), resultingFile As New StringBuilder

    firstFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\1.txt").Split(vbNewLine)
    secondFile = My.Computer.FileSystem.ReadAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\2.txt").Split(vbNewLine)

    For Each line As String In firstFile.Union(secondFile)
        resultingFile.AppendLine(line)
    Next

    My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\merged.txt", resultingFile.ToString, True)
End Sub

1.txt contains:
a
b
c
d
e

2.txt contains:
b
c
d
e
f
g
h
i
j

After running the code, I get:
a
b
c
d
e
b
f
g
h
i
j

Any suggestions for making the Union function act like its mathematical counterpart?

+1  A: 

I think you want to use the Distinct function. At the end of your LINQ statement, call .Distinct();

var distinctList = yourCombinedList.Distinct();

Similar to a 'SELECT DISTINCT' in SQL :)
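
For comparison, here is a minimal VB.NET sketch of the same idea, assuming the two lists were combined with Concat (which keeps duplicates). The module and function names are hypothetical. With exact string matches, it should give the same set of lines as calling Union directly.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DistinctSketch
    ' Hypothetical helper: join both sequences, then drop exact duplicates.
    Function CombineDistinct(first As IEnumerable(Of String),
                             second As IEnumerable(Of String)) As String()
        ' Concat keeps every element; Distinct removes repeated strings.
        Return first.Concat(second).Distinct().ToArray()
    End Function

    Sub Main()
        Dim first As String() = {"a", "b", "c"}
        Dim second As String() = {"b", "c", "d"}

        Console.WriteLine(String.Join(",", CombineDistinct(first, second)))
        ' Union removes duplicates on its own, so no Distinct is needed here.
        Console.WriteLine(String.Join(",", first.Union(second)))
    End Sub
End Module
```

Both approaches use the default string comparison, so neither treats "b" and " b" as the same line.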

Kelsey
Distinct shouldn't be required after Linq's Union method
Robert Paulson
You are correct, I should have said 'combined' list in my example. Updating it to reflect that, thanks.
Kelsey
+8  A: 

LINQ's Union does behave the way you want. Check that your input files are correct (e.g. one of the lines may contain a space before the newline), or Trim() the strings after splitting.

var list1 = new[] { "a", "s", "d" };
var list2 = new[] { "d", "a", "f", "123" };
var union = list1.Union(list2);
union.Dump(); // this is a LinqPad method

In LINQPad, the result is { "a", "s", "d", "f", "123" }
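
In VB.NET, the Trim() suggestion could look roughly like the sketch below. The arrays are hypothetical in-memory stand-ins for the two files, and the stray line feed and trailing space are assumed examples of whitespace left over after splitting; the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports Microsoft.VisualBasic

Module TrimBeforeUnion
    ' Hypothetical helper: trim every line, then take the union.
    Function UnionTrimmed(first As IEnumerable(Of String),
                          second As IEnumerable(Of String)) As String()
        Dim cleanFirst = first.Select(Function(s) s.Trim())
        Dim cleanSecond = second.Select(Function(s) s.Trim())
        Return cleanFirst.Union(cleanSecond).ToArray()
    End Function

    Sub Main()
        ' Assumed sample data: vbLf & "b" and "c " carry invisible whitespace.
        Dim firstFile As String() = {"a", vbLf & "b", "c "}
        Dim secondFile As String() = {"b", "c", "d"}

        ' Untrimmed, all six strings differ, so nothing is merged.
        Console.WriteLine(firstFile.Union(secondFile).Count())  ' prints 6

        ' Trimmed, the visible text is compared and duplicates collapse.
        Console.WriteLine(String.Join(",", UnionTrimmed(firstFile, secondFile)))  ' prints a,b,c,d
    End Sub
End Module
```

Trimming before the Union, rather than after it, matters here: Union has to see the cleaned strings in order to recognise them as equal.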

Robert Paulson
You found the problem. Thanks so much!
Zian Choy