This might sound minor, but it's been driving me nuts. Since releasing an application to production last Friday on Ruby 1.9, I've been having lots of minor exceptions related to character encodings. Almost all of it is some variation on:

Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8

We have an international user base, so plenty of names contain umlauts and other non-ASCII characters. If I fix the templates by calling `force_encoding` in a bunch of places, the error pops up in the flash message helper instead. Et cetera.
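For anyone who wants to reproduce the error outside of Rails, here is a minimal sketch. The strings are made up for illustration (they are not from the application in question), and it assumes the binary data really contains UTF-8 bytes, which is the only case where `force_encoding` is a safe fix:

```ruby
# "Gr\u00FC\u00DFe, " is "Grüße, " as a UTF-8 string (\u escapes always yield UTF-8).
greeting = "Gr\u00FC\u00DFe, "

# Hypothetical input: the UTF-8 bytes of "Jürgen", but tagged as binary
# (ASCII-8BIT), the way a driver or socket might hand them over.
name = "J\xC3\xBCrgen".dup.force_encoding(Encoding::ASCII_8BIT)

error_message = nil
begin
  greeting + name          # both sides contain non-ASCII bytes, encodings differ
rescue Encoding::CompatibilityError => e
  error_message = e.message
end

# force_encoding only relabels the bytes as UTF-8; it does not transcode them.
fixed_name = name.dup.force_encoding(Encoding::UTF_8)
combined = greeting + fixed_name

puts error_message
puts combined.encoding
```

Note that concatenation only fails when both strings contain non-ASCII bytes, which is probably why the error shows up only for some users' names.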

At the moment it looks like I've nailed down all the ones I knew about, by patching ActiveSupport's string concatenation in one place and then by setting `# encoding: utf-8` at the top of every one of my source files. But the feeling that I might have to remember to do that for every file of every Ruby project I ever do from now on, forever, just to avoid string assignment problems, does not sit well with me. I read about the `-Ku` switch, but everything seems to warn that it exists only for backwards compatibility and might go away at any time.
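To see what the magic comment actually changes, here is a rough sketch that writes two throwaway source files with different comments, loads each one, and checks the encoding of a literal defined inside it. The helper name `literal_encoding_with` and the global `$demo_literal_encoding` are invented for this demo:

```ruby
require "tempfile"

# Writes a tiny source file that starts with the given magic comment, loads it,
# and returns the encoding that the string literal inside it ended up with.
def literal_encoding_with(magic_comment)
  file = Tempfile.new(["encoding_demo", ".rb"])
  file.write("#{magic_comment}\n$demo_literal_encoding = 'abc'.encoding\n")
  file.close
  load file.path
  $demo_literal_encoding
ensure
  file.unlink if file
end

utf8_literal   = literal_encoding_with("# encoding: utf-8")
latin1_literal = literal_encoding_with("# encoding: iso-8859-1")

puts utf8_literal    # encoding of a literal in the UTF-8 file
puts latin1_literal  # encoding of a literal in the ISO-8859-1 file
```

The comment is per-file, so each file's literals follow that file's own comment regardless of what the loading file declares.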

So my question for 1.9-experienced folks: is setting `# encoding` in every one of my files really necessary? Is there a reasonable way to do this globally? Or, better, a way to set the default encoding on non-literal string values that bypass the internal/external defaults?

Thanks in advance for any suggestions.

+1  A: 

http://zargony.com/2009/07/24/ruby-1-9-and-file-encodings

Don't confuse file encoding and string encoding!

Trevoke
Thanks Trevoke; I do know the difference. However, string literals inherit the encoding of the source file in which they're created. (Strings that come from an IO operation on another file do not; hence the `default_internal` and `default_external` properties.) So while the two are not the same, they're deeply and frustratingly related. What I want is a way to set the default _string_ encoding without having to use that `# encoding` comment.
SFEley
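The split described in the comment above can be sketched like this: `Encoding.default_external` decides how strings read from IO are tagged, while literals keep following the source file. The file contents are made up for the demo, and the sketch assumes `Encoding.default_internal` is left at its default of `nil` (no transcoding on read):

```ruby
require "tempfile"

# A scratch file containing the raw UTF-8 bytes of "Jürgen".
file = Tempfile.new("io_encoding_demo")
file.binmode
file.write("J\xC3\xBCrgen")
file.close

original_external = Encoding.default_external
begin
  # Strings read from IO are tagged with whatever default_external is at read time.
  Encoding.default_external = Encoding::ISO_8859_1
  read_as_latin1 = File.read(file.path).encoding

  Encoding.default_external = Encoding::UTF_8
  read_as_utf8 = File.read(file.path).encoding

  # A literal ignores default_external and follows the source encoding instead.
  literal_encoding = "abc".encoding
ensure
  Encoding.default_external = original_external   # restore the global setting
  file.unlink
end

puts read_as_latin1
puts read_as_utf8
puts literal_encoding == __ENCODING__
```

Ruby may print a warning about setting `Encoding.default_external` when run with `-w`; the usual advice is to set it once at startup rather than toggle it as this demo does.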
Everything you -ever- wanted to know about encodings: http://blog.grayproductions.net/categories/character_encodings And probably more than you hoped never to learn :)
Trevoke