I have a Rails project with lots and lots of Cyrillic strings in it.

It worked fine on Ruby 1.8, but Ruby 1.9 treats every source file as US-ASCII unless a # encoding: utf-8 magic comment sits at the top of that file. Obviously, files full of Cyrillic don't parse as US-ASCII, so I would have to add the comment to each and every source file in the project.
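
For reference, here is a minimal sketch of what the per-file magic comment does. The variable names are only illustrative, and it assumes the file itself is saved as UTF-8:

```ruby
# encoding: utf-8
# The magic comment has to be on the first line (or the second, after a shebang).

greeting = "Привет"                   # Cyrillic literal, parsed as UTF-8

source_encoding  = __ENCODING__       # encoding the parser used for this file
literal_encoding = greeting.encoding  # literals inherit the source encoding

puts source_encoding
puts literal_encoding
puts greeting.length    # 6, counted in characters
puts greeting.bytesize  # 12, since each Cyrillic letter takes two bytes in UTF-8
```

Without that first line, Ruby 1.9 rejects the file because of the non-ASCII bytes in the literal.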

Is there a simpler way to say, literally, "This application is UTF-8 encoded. Please treat every included source file as UTF-8 unless declared otherwise"?

+2  A: 

Explicit is better than implicit. Writing out the name of the encoding is good for your text editor, your interpreter, and anyone else who wants to look at the file. Different platforms have different defaults -- UTF-8, Windows-1252, Windows-1251, etc. -- and you will either hamper portability or platform integration if you automatically pick one over the other. Requiring more explicit encodings is a Good Thing.

It might be a good idea to integrate your Rails app with GetText. Then all of your UTF-8 strings will be isolated to a small number of translation files, and your Ruby modules will be clean ASCII.

jleedev
A: 

I don't run into this much, but when I need to ensure UTF-8, I use the $KCODE global. Try putting this in your environment.rb: $KCODE = 'UTF8'

Also, are you certain that your editor is saving files in UTF-8?

Brian
KCODE does not affect source parsing, afaik.
Leonid Shevtsov
+2  A: 

I think you can either

  1. pass the -E utf-8 command-line argument to ruby, or
  2. set your RUBYOPT environment variable to "-E utf-8"
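
To check what either option actually changes, a small script like the sketch below can be run with and without the flag (for example as ruby -E utf-8 check_encoding.rb). The file name is hypothetical. As far as I know, -E is documented as setting the default external and internal encodings, so it is worth removing the magic comment and watching whether the "source" line still reports UTF-8 on your Ruby version:

```ruby
# encoding: utf-8
# check_encoding.rb (hypothetical file name): report the encodings Ruby is using.

report = {
  "source"   => __ENCODING__,               # encoding used to parse this file
  "external" => Encoding.default_external,  # default for data read from IO
  "internal" => Encoding.default_internal   # may be nil when not set
}

report.each do |name, encoding|
  puts "#{name}: #{encoding.inspect}"
end
```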
Mladen Jablanović