I'm writing a C# DKIM validator and have come across a problem that I cannot solve. Right now I am working on calculating the body hash, as described in Section 3.7, Computing the Message Hashes. I am working with emails that I have dumped using a modified version of the EdgeTransportAsyncLogging sample in the Exchange 2010 Transport Agent SDK. Instead of converting the emails when saving, it just opens a file named after the MessageID and dumps the raw data to disk.

I am able to successfully compute the body hash of the sample email provided in Section A.2 using the following code:

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

SHA256Managed hasher = new SHA256Managed();
ASCIIEncoding asciiEncoding = new ASCIIEncoding();
string rawFullMessage = File.ReadAllText(@"C:\Repositories\Sample-A.2.txt");
// The first empty line (CRLF CRLF) separates the header from the body.
string headerDelimiter = "\r\n\r\n";
// Ordinal comparison matches the exact characters, independent of culture.
int headerEnd = rawFullMessage.IndexOf(headerDelimiter, StringComparison.Ordinal);
string header = rawFullMessage.Substring(0, headerEnd);
string body = rawFullMessage.Substring(headerEnd + headerDelimiter.Length);
byte[] bodyBytes = asciiEncoding.GetBytes(body);
byte[] bodyHash = hasher.ComputeHash(bodyBytes);
string bodyBase64 = Convert.ToBase64String(bodyHash);
// Expected body hash (base64) for the Section A.2 sample.
string expectedBase64 = "2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=";
Console.WriteLine("Expected hash: {1}{0}Computed hash: {2}{0}Are equal: {3}",
  Environment.NewLine, expectedBase64, bodyBase64, expectedBase64 == bodyBase64);

The output from the above code is:

Expected hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=
Computed hash: 2jUSOH9NhtVGCQWNr9BrIAPreKQjO6Sn7XIkfJVOzv8=
Are equal: True

Now, most emails arrive with the c=relaxed/relaxed setting, which requires some processing of the header and body before hashing and verifying. While I was working on that (and failing to get it to work), I finally came across a message with c=simple/simple, which means the body is processed as is, except that empty lines at the end of the body are ignored. (Really, the rules for Body Canonicalization are quite ... simple.)
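For reference, the simple body rule can be sketched on raw bytes roughly as follows. This is a minimal sketch rather than a full implementation: the names DkimBodySketch, CanonicalizeSimple and BodyHashBase64 are made up for illustration, SHA-256 is assumed (a=rsa-sha256), and the optional l= body-length tag is ignored.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of any library.
public static class DkimBodySketch
{
    // Simple body canonicalization: drop every empty line at the end of the
    // body, then make sure the result ends with exactly one CRLF.
    // An empty body therefore canonicalizes to a single CRLF.
    public static byte[] CanonicalizeSimple(byte[] body)
    {
        int end = body.Length;

        // Walk backwards over all trailing CRLF pairs.
        while (end >= 2 && body[end - 2] == (byte)'\r' && body[end - 1] == (byte)'\n')
        {
            end -= 2;
        }

        // Copy the remaining content and append one CRLF.
        byte[] result = new byte[end + 2];
        Array.Copy(body, result, end);
        result[end] = (byte)'\r';
        result[end + 1] = (byte)'\n';
        return result;
    }

    // Base64 of the SHA-256 hash of the canonicalized body (assumes rsa-sha256).
    public static string BodyHashBase64(byte[] body)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(CanonicalizeSimple(body)));
        }
    }
}
```

Passing it the bytes that follow the header/body separator should give the value to compare against bh=, provided the bytes on disk are the ones that were actually signed.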

Here is the real DKIM email (right-click and save it, because browsers eat the trailing CRLF) with a signature using the simple algorithm (completely unmodified). Now, using the above code and updating the expectedBase64 hash, I get the following results:

Expected hash: VnGg12/s7xH3BraeN5LiiN+I2Ul/db5/jZYYgt4wEIw=
Computed hash: ISNNtgnFZxmW6iuey/3Qql5u6nflKPTke4sMXWMxNUw=
Are equal: False

The expected hash is the value from the bh= field of the DKIM-Signature header. Now, the file used in the second test is a direct raw output from the Exchange 2010 Transport Agent. If so inclined, you can view the modified EdgeTransportLogging.txt.
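Pulling bh= out of the header by hand is easy to get subtly wrong when the header is folded, so here is a minimal sketch of one way to do it. The names DkimTagSketch and GetTag are hypothetical, and the input is assumed to be the DKIM-Signature field value only (everything after "DKIM-Signature:"). Stripping all whitespace from the value is only appropriate for base64 tags such as bh= and b=.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class DkimTagSketch
{
    // Returns the value of the named tag with all whitespace removed,
    // or null when the tag is not present.
    public static string GetTag(string headerValue, string tagName)
    {
        foreach (string part in headerValue.Split(';'))
        {
            // Split each "name=value" pair at the first '=' only, because
            // base64 values may end with '=' padding.
            int eq = part.IndexOf('=');
            if (eq < 0)
            {
                continue;
            }

            // Tag names are compared case-sensitively.
            string name = part.Substring(0, eq).Trim();
            if (!string.Equals(name, tagName, StringComparison.Ordinal))
            {
                continue;
            }

            // Folding (CRLF plus whitespace) may appear inside the value.
            char[] kept = part.Substring(eq + 1).Where(c => !char.IsWhiteSpace(c)).ToArray();
            return new string(kept);
        }
        return null;
    }
}
```

The result of GetTag(value, "bh") is what the computed base64 body hash would be compared against.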

At this point, no matter how I modify the second email (changing the start position of the body or the number of CRLFs at the end of the file), I cannot get the hashes to match. What worries me is that I have been unable to validate any body hash so far (simple or relaxed), and that it may not be feasible to process DKIM through Exchange 2010.

+1  A: 

I tried this in python-dkim and I get a body hash mismatch too.

I suspect Exchange's GetMimeReadStream is not giving you the actual bytes as they were transmitted, so the hash doesn't match. It is probably disassembling the message into its MIME parts, so GetMimeReadStream gives you a valid representation of the message, but not the one it was originally sent with.

Perhaps there's another API that will give you the real raw bytes?

Or perhaps by this point in the process the message has been torn apart and the original message thrown away, and you need to hook in earlier.

You should probably try capturing a DKIM-signed message by sending it to a non-Exchange server, and see whether your code validates it. GetContentReadStream might work?

Anyhow, what I would do next is try to find an API that gives you byte-for-byte what was sent.
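When comparing a dump against a capture from another server, it may help to stay at the byte level, since decoding to a string and re-encoding as ASCII can itself change 8-bit content (non-ASCII characters come back as '?'). A minimal sketch, with hypothetical names (RawMessageSketch, FindBodyStart, CountBareLineFeeds), that locates the body in a byte array and counts line feeds that are not part of a CRLF:

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class RawMessageSketch
{
    // Returns the index of the first body byte (just after the first
    // CRLF CRLF), or -1 when no empty line is found.
    public static int FindBodyStart(byte[] raw)
    {
        for (int i = 0; i + 3 < raw.Length; i++)
        {
            if (raw[i] == 13 && raw[i + 1] == 10 && raw[i + 2] == 13 && raw[i + 3] == 10)
            {
                return i + 4;
            }
        }
        return -1;
    }

    // Counts LF bytes that are not preceded by CR. A non-zero count suggests
    // the line endings are not the CRLF form used on the wire.
    public static int CountBareLineFeeds(byte[] raw)
    {
        int count = 0;
        for (int i = 0; i < raw.Length; i++)
        {
            if (raw[i] == 10 && (i == 0 || raw[i - 1] != 13))
            {
                count++;
            }
        }
        return count;
    }
}
```

The byte array would typically come from File.ReadAllBytes on the dumped file; the bytes from FindBodyStart onward are the body to canonicalize and hash.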

poolie
You hit the nail on the head. As far as I can tell, I am getting a "mostly reassembled" mail message with enough variations to prevent it from matching as it should. While I haven't scoured everything, it doesn't look like they provide a way to grab the raw bytes as they come in either, at least not from managed code. It looks like the only solution (for now) is to implement an intermediary SMTP server (like IIS) that you can hook into, and process the message there.
Joshua
Glad I could help.
poolie