I am in the middle of writing a script that translates XML documents. It's actually pretty cool: the idea (and it is working) is to take an XML file, or a folder of XML files, parse it, take whatever text sits between certain tags, translate it with the Google Translate API, and write the translated text back into the XML file.
As I said, I have this working, but only for fairly strictly formatted XML documents. Now I have to make it compatible with documents that are formatted differently. So my idea was:
Parse the XML and find a node, e.g.:
<template>lorem lipsum dolor mit amet<think><set name="she">Ada</set></think></template>
Save this as a string and do a regex search and replace on it. Sadly, I have no clue how to proceed. I want to search the string (the XML node), find the text that is in between tags, in this case "lorem lipsum dolor mit amet" and "Ada", call a function with each of those texts as a parameter, and then insert the result of the function in the same place the text came from.
The reason I can't just extract the text and rebuild the XML around it is that the nodes will be formatted differently from one document to the next, so everything except the text has to stay identical.
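To make the idea concrete, here is a minimal sketch in Python of the search-and-replace described above. Python is only an assumption about the language, and `replace_text` and `translate` are hypothetical names: `translate` stands in for the real call to the translation API. The regex only looks at text between a `>` and the next `<`, so it would likely misbehave on comments, CDATA sections, or attribute values that contain `>`.

```python
import re
from xml.sax.saxutils import escape, unescape

# A run of characters that directly follows a '>' and directly precedes a '<'.
# The lookarounds keep the angle brackets themselves out of the match.
TEXT_BETWEEN_TAGS = re.compile(r"(?<=>)[^<>]+(?=<)")


def translate(text):
    """Hypothetical placeholder for the real translation call."""
    return text.upper()


def replace_text(node_xml, func):
    """Apply func to every piece of text between tags, leaving all markup as-is."""

    def _sub(match):
        text = match.group(0)
        core = text.strip()
        if not core:
            # Whitespace-only runs (indentation, newlines) are left untouched.
            return text
        start = text.index(core)
        # Decode entities such as &amp; before calling func, re-encode afterwards,
        # and keep the original leading and trailing whitespace.
        new_core = escape(func(unescape(core)))
        return text[:start] + new_core + text[start + len(core):]

    return TEXT_BETWEEN_TAGS.sub(_sub, node_xml)


node = '<template>lorem lipsum dolor mit amet<think><set name="she">Ada</set></think></template>'
print(replace_text(node, translate))
```

With the placeholder `translate`, the tags and the `name="she"` attribute come back unchanged and only the two text runs are altered. Whether a regex is robust enough for all the differently formatted documents is an open question; this only shows the shape of the approach.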