I need a way to compute the hash of the base64-encoded data in the XML node //note/resource/data, or otherwise match it to the hash value in the attribute //note/content/en-note//en-media/@hash
See below for the full XML file
Please suggest a way, using XSLT, to obtain (or match)
4aaafc3e14314027bb1d89cf7d59a06c
from (or with)
R0lGODlhEAAQAPMAMcDAwP/crv/erbigfVdLOyslHQAAAAECAwECAwECAwECAwECAwECAwECAwEC
AwECAyH/C01TT0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAHgfNAGQAh/wtNU09GRklDRTku
MBUAAAAJcEhZcwAACxMAAAsTAQCanBgAIf8LTVNPRkZJQ0U5LjATAAAAB3RJTUUH1AkWBTYSQXe8
fQAh+QQBAAAAACwAAAAAEAAQAAADSQhgpv7OlDGYstCIMqsZAXYJJEdRQRWRrHk2I9t28CLfX63d
ZEXovJ7htwr6dIQB7/hgJGXMzFApOBYgl6n1il0Mv5xuhBEGJAAAOw==
This sample XML file has obviously been trimmed for brevity/simplicity. The actual file may contain more than one image per note, hence the need to obtain/match hashes.
The XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-export SYSTEM "http://xml.evernote.com/pub/evernote-export.dtd">
<en-export export-date="20091029T063411Z" application="Evernote/Windows" version="3.0">
<note>
<title>A title here</title>
<content><![CDATA[
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml.dtd">
<en-note bgcolor="#FFFFFF">
<p>Some text here (followed by the picture)
<p><en-media hash="4aaafc3e14314027bb1d89cf7d59a06c" type="image/gif" border="0" width="16" height="16" alt="A picture"/></p>
<p>Some more text here (preceded by the picture)
</en-note>
]]></content>
<created>20090925T063154Z</created>
<note-attributes>
<author/>
</note-attributes>
<resource>
<data encoding="base64">
R0lGODlhEAAQAPMAMcDAwP/crv/erbigfVdLOyslHQAAAAECAwECAwECAwECAwECAwECAwECAwEC
AwECAyH/C01TT0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAHgfNAGQAh/wtNU09GRklDRTku
MBUAAAAJcEhZcwAACxMAAAsTAQCanBgAIf8LTVNPRkZJQ0U5LjATAAAAB3RJTUUH1AkWBTYSQXe8
fQAh+QQBAAAAACwAAAAAEAAQAAADSQhgpv7OlDGYstCIMqsZAXYJJEdRQRWRrHk2I9t28CLfX63d
ZEXovJ7htwr6dIQB7/hgJGXMzFApOBYgl6n1il0Mv5xuhBEGJAAAOw==
</data>
<mime>image/gif</mime>
<resource-attributes>
<file-name>clip_image001.gif</file-name>
</resource-attributes>
</resource>
</note>
</en-export>
Implemented solution
This uses the concept of the solution suggested by Jackem. The main difference is that I avoid creating my own Java class (and with it an extra dependency). I do the processing within the XSLT, since it's straightforward enough, referencing only classes that come with the basic Java libraries.
Jackem's solution is more correct because it doesn't lose the leading zero in some hashes; however, I found it much easier to take care of this elsewhere using a li'l basic hackery.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
...
xmlns:md5="java.security.MessageDigest"
xmlns:bigint="java.math.BigInteger"
exclude-result-prefixes="md5 bigint">
...
<xsl:for-each select="resource">
<xsl:variable name="md5inst" select="md5:getInstance('MD5')" />
<xsl:value-of select="md5:update($md5inst, $b64bin)" />
<xsl:variable name="imgmd5bytes" select="md5:digest($md5inst)" />
<xsl:variable name="imgmd5bigint" select="bigint:new(1, $imgmd5bytes)" />
<xsl:variable name="imgmd5str" select="bigint:toString($imgmd5bigint, 16)" />
<!-- NOTE: $imgmd5str loses any leading zeros of imgmd5bytes, because BigInteger.toString(16) does not pad -->
</xsl:for-each>
...
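For comparison, here is a minimal sketch in plain Java of the same steps the extension calls above perform (decode the base64 text, take the MD5 digest, render it as hex), with the hex string zero-padded to 32 digits so no leading zero is lost. It assumes, as the stylesheet does, that the hash is the MD5 of the decoded bytes. The class name ResourceHash and the method name md5HexOfBase64 are hypothetical and only serve the illustration.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the stylesheet above.
class ResourceHash {

    // Decodes base64 text (embedded line breaks are tolerated) and returns
    // the MD5 digest of the decoded bytes as 32 lowercase hex digits.
    public static String md5HexOfBase64(String base64Text) {
        try {
            // The MIME decoder skips line separators such as those in <data>
            byte[] raw = Base64.getMimeDecoder().decode(base64Text);
            byte[] digest = MessageDigest.getInstance("MD5").digest(raw);
            // %032x left-pads with zeros, unlike BigInteger.toString(16)
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "YQ==" is the base64 encoding of the single byte 'a';
        // its MD5 starts with a zero: 0cc175b9c0f1b6a831c399e269772661
        System.out.println(md5HexOfBase64("YQ=="));
    }
}
```

The padded string can then be compared directly with the hash attribute of each en-media element, which matters once a note holds more than one resource.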
P.S. See the sibling question for my implementation of the base64-to-image-file conversion.
This question is a subquestion of another question I have asked previously.