+1  A: 

Rather than a rails/mongrel problem, it sounds more likely that there's an issue either with your XML file or with the way REXML handles it. You can check this by writing a short script to read your XML file directly (rather than within a request) and seeing if it still fails.
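
A minimal sketch of such a standalone check, assuming the file can be opened from a plain Ruby script. The helper name `root_element_name` is made up for illustration and is not part of REXML.

```ruby
require "rexml/document"

# Parse the file with REXML outside of any Rails/Mongrel request.
# Returns the name of the root element (nil if there is none);
# REXML raises REXML::ParseException if it cannot parse the document.
def root_element_name(path)
  File.open(path) do |file|
    root = REXML::Document.new(file).root
    root && root.name
  end
end
```

Calling `root_element_name("export.xml")` (with your own path in place of the hypothetical `export.xml`) from irb or a one-off script should either return the root element name or raise the same error you see inside the request.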

Assuming it does, there are a couple of things I'd look at. First, I'd check you are running the latest version of REXML. A couple of years ago there was a bug (http://www.germane-software.com/projects/rexml/ticket/63) in its UTF-16 handling.
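
One quick way to see which REXML you are running is to print its version constant. This is only a sketch; it assumes REXML is loadable in the same Ruby that runs your application.

```ruby
require "rexml/document"

# REXML::VERSION is the version string of the REXML that was just loaded.
rexml_version = REXML::VERSION
puts "REXML #{rexml_version} on Ruby #{RUBY_VERSION}"
```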

The second thing I'd check is whether your issue is similar to this one: http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/ba7b0585c7a6330d. If so, you can try the workaround in that thread.

If none of the above helps, please reply with more information, such as the exception you get when you try to read the file.

tomafro
Thanks for your response. I'll try the things you suggested and repost if I don't get anywhere.
Matt Haughton
A: 

I got this to work just by changing the encoding attribute in the XML declaration (the first line of the file) from UTF-16 to UTF-8. That suggests the XML file is actually UTF-8 and is labelled wrongly by the application that generates it.

The XML file is a FileMaker DDR export produced by FileMaker Pro Advanced 8.5 on OS X 10.5.4
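
The relabelling described above could be scripted along these lines. This is a rough sketch, not something taken from FileMaker or REXML: the helper name `relabel_utf16_as_utf8` is invented here, and it assumes the declaration uses the literal label UTF-16 and that the bytes really are UTF-8.

```ruby
# Change an XML declaration that claims UTF-16 so that it says UTF-8.
# Only the first matching declaration is touched; the rest of the
# document is returned unchanged.
def relabel_utf16_as_utf8(xml)
  xml.sub(/(<\?xml[^>]*?encoding=["'])UTF-16(["'])/i) { "#{$1}UTF-8#{$2}" }
end
```

For a file on disk, something like `File.binwrite("fixed.xml", relabel_utf16_as_utf8(File.binread("export.xml")))` would write a corrected copy, where both file names are placeholders.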

Matt Haughton
See my answer below. I had a similar problem.
George Stocker
A: 

Have you tried doing this using JRuby? I've heard Unicode strings are better supported in JRuby.

One other thing you can try is to use another XML parsing library, such as libxml or Hpricot.

REXML is one of the slowest Ruby XML libraries you can use and might not scale.

Dema
A: 

Actually, I think your problem may be related to the problem I just detailed in this post. If I were you, I'd open it up in TextPad in Binary mode and see if there are any Byte Order Marks before your XML starts.
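
If you don't have a binary-mode editor to hand, the same check can be done from Ruby. The following is a minimal sketch; `BOMS` and `detect_bom` are names made up for this example, and only the common UTF-8, UTF-16 and UTF-32 marks are covered.

```ruby
# Known byte order marks. The 4-byte marks come first so that
# UTF-32LE (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
BOMS = {
  "UTF-32BE" => "\x00\x00\xFE\xFF".b,
  "UTF-32LE" => "\xFF\xFE\x00\x00".b,
  "UTF-8"    => "\xEF\xBB\xBF".b,
  "UTF-16BE" => "\xFE\xFF".b,
  "UTF-16LE" => "\xFF\xFE".b
}.freeze

# Returns the name of the encoding whose BOM starts the given bytes,
# or nil if the data does not begin with one of the marks above.
def detect_bom(bytes)
  head = bytes.b
  BOMS.each { |name, bom| return name if head.start_with?(bom) }
  nil
end
```

`detect_bom(File.binread("export.xml", 4))` reads just the first four bytes of the file (the path is a placeholder) and reports which mark, if any, precedes the XML.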

George Stocker
Thanks very much, I'll look into that
Matt Haughton