I'm trying to write a unit test for a piece of code that generates a large amount of text. I've run into an issue where the "expected" and "actual" strings appear to be equal, but Assert.AreEqual throws, and both the equality operator and Equals() return false. The result of GetHashCode() is different for both values as well.

However, putting both strings into text files and comparing with DiffMerge tells me they're the same.

Additionally, using Encoding.ASCII.GetBytes() on both values and then using SequenceEqual to compare the resulting byte arrays returns true.

The values are 34KB each, so I'll hold off putting them here for now. Any ideas? I'm completely stumped.

+9  A: 

Loop through char by char and find which one it thinks is different? The fact that writing both to disk and comparing the ASCII / text shows no difference tells me that it is probably either carriage-return / line-feed related (which is somehow normalized during save), or relates to some non-ASCII character (maybe a high-Unicode whitespace), which will be lost when saving as ASCII.
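
A minimal sketch of that char-by-char loop, in case it helps. The names StringDiff, FirstDifference and Describe are made up for illustration and are not part of any library:

```csharp
using System;

static class StringDiff
{
    // Returns the index of the first differing char,
    // or -1 if the two strings are identical.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i]) return i;
        }
        // Same prefix: either identical, or one string is longer.
        return a.Length == b.Length ? -1 : shared;
    }

    // Formats the UTF-16 code unit at the given index, e.g. "U+201C",
    // so invisible or look-alike characters can be told apart.
    public static string Describe(string s, int index) =>
        index < s.Length ? $"U+{(int)s[index]:X4}" : "<end of string>";
}
```

Calling StringDiff.FirstDifference(expected, actual) and then StringDiff.Describe on each string at the returned index should show which code units disagree, even when they render identically.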

Marc Gravell
+5  A: 

What are the encoding types of the files you are feeding into DiffMerge? If you have characters that don't match the encoding type, then there is a chance they won't show up in DiffMerge.

The string that is being generated and the expected result probably have different character encodings. When you call Encoding.ASCII.GetBytes, you are converting everything into ASCII, and any character that ASCII cannot represent is replaced with '?'. So, your strings are equal once reduced to the ASCII character set. However, they can still be unequal in other character sets (and still "look" the same to you).
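
To illustrate why the ASCII byte comparison can pass while the strings differ, here is a small sketch. The class and method names (AsciiDemo, SameAsciiBytes, SameUtf8Bytes) are hypothetical; the only library calls used are Encoding.ASCII.GetBytes, Encoding.UTF8.GetBytes and LINQ's SequenceEqual:

```csharp
using System;
using System.Linq;
using System.Text;

static class AsciiDemo
{
    // Compares the ASCII-encoded bytes. Characters outside the ASCII
    // range are all replaced with '?', so distinct strings can match.
    public static bool SameAsciiBytes(string a, string b) =>
        Encoding.ASCII.GetBytes(a).SequenceEqual(Encoding.ASCII.GetBytes(b));

    // Compares the UTF-8 bytes, which preserve every character.
    public static bool SameUtf8Bytes(string a, string b) =>
        Encoding.UTF8.GetBytes(a).SequenceEqual(Encoding.UTF8.GetBytes(b));
}
```

For example, "caf\u00E9" and "caf\u00E8" differ in their last character, yet SameAsciiBytes reports them as equal because both encode to "caf?"; SameUtf8Bytes reports them as different.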

Also, try doing a string.Compare(str1, str2, StringComparison.XXXX) and let us know what happens.
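
Something along these lines would print the result for several comparison modes at once. This is only a sketch; CompareModes and Results are illustrative names, and which modes to try is an assumption on my part:

```csharp
using System;
using System.Collections.Generic;

static class CompareModes
{
    // Runs string.Compare under several StringComparison modes.
    // A result of 0 means "equal" under that mode.
    public static Dictionary<StringComparison, int> Results(string a, string b)
    {
        var modes = new[]
        {
            StringComparison.Ordinal,
            StringComparison.OrdinalIgnoreCase,
            StringComparison.InvariantCulture,
            StringComparison.CurrentCulture,
        };

        var results = new Dictionary<StringComparison, int>();
        foreach (StringComparison mode in modes)
        {
            results[mode] = string.Compare(a, b, mode);
            Console.WriteLine($"{mode}: {results[mode]}");
        }
        return results;
    }
}
```

The ordinal modes compare the raw UTF-16 code units, while the culture-sensitive ones apply linguistic rules, so a mismatch that shows up only under Ordinal points at look-alike characters rather than genuinely different text.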

Polaris878
Probably best is to try `StringComparison.Ordinal`.
Martinho Fernandes
Yup, it was an encoding issue... the erroneous text had been copied off a web page and had some crazy quote characters
Daniel Schaffer