I'm running into an issue with extra newlines on Windows versus *nix platforms.
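import re
# Read the raw bytes and decode them from UTF-16.
file = open('UTF16file.xml', 'rb')
html = file.read().decode('utf-16')
file.close()
# Replace every match of the original URL with the new one.
regexp = re.compile(self.originalurl, re.S)
(html, changes) = regexp.subn(self.newurl, html)
# Re-encode as UTF-16 and write the result (note: text mode 'w+').
file = open('UTF16file-regexed.xml', 'w+')
file.write(html.encode('utf-16'))
file.close()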
Running this code on my Mac works: I get my file back without the extra line breaks. On Windows, the output file ends up with extra line breaks. So far I've tried:
Encoding the regular expression as utf-16 instead of decoding the file - breaks on both Windows and OS X.
Writing in mode 'wb' instead of 'w+' - breaks on Windows.
Any ideas?
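One thing that might be worth ruling out, offered as a sketch rather than a confirmed diagnosis: on Windows, a file opened in text mode ('w' or 'w+') rewrites every newline byte as a carriage return plus newline, and doing that to already-encoded UTF-16 bytes could plausibly produce the extra line breaks. The version below assumes that is the cause and sidesteps it by letting io.open handle the encoding with newline translation switched off. The function name rewrite_utf16 and its parameters are made up for illustration; they stand in for the self.originalurl / self.newurl logic above.

```python
import io
import re


def rewrite_utf16(src_path, dst_path, pattern, replacement):
    """Apply a regex substitution to a UTF-16 file, leaving line endings alone.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # newline='' disables newline translation on read, so '\r\n' and '\n'
    # arrive in `text` exactly as they are stored in the file.
    with io.open(src_path, 'r', encoding='utf-16', newline='') as src:
        text = src.read()

    # Same substitution as in the question: re.S lets '.' match newlines.
    new_text, changes = re.subn(pattern, replacement, text, flags=re.S)

    # newline='' also disables translation on write, so no extra '\r'
    # is inserted on Windows. The codec takes care of the BOM.
    with io.open(dst_path, 'w', encoding='utf-16', newline='') as dst:
        dst.write(new_text)

    return changes
```

It would be called along the lines of rewrite_utf16('UTF16file.xml', 'UTF16file-regexed.xml', self.originalurl, self.newurl). If the output still differs between platforms with this version, the text-mode explanation is probably not the whole story.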