I know that in Ruby 1.9 you can easily re-encode a string like this:

s = s.encode('UTF-8')

What is the equivalent in Ruby 1.8, and what require lines does it need?

All the tutorials I have seen are needlessly complicated and I don't understand what is going on.

Cheers!

+1  A: 

James Edward Gray II has a detailed collection of posts dealing with encoding and character set issues in Ruby 1.8. The post entitled "Encoding Conversion with iconv" covers this conversion in detail.

Summary: the iconv library does all the work of converting encodings. Ruby 1.8 normally bundles it in the standard library; if it is missing from your installation, install it with:

gem install iconv

You need to know which encoding your string is currently in, because Ruby 1.8 treats a String as an array of bytes with no intrinsic encoding. For example, say your string is in Latin-1 and you want to convert it to UTF-8:

require 'iconv'

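# "UTF-8" and "ISO-8859-1" are the canonical names; aliases such as "UTF8" and "LATIN1" depend on the platform's iconv.
string_in_utf8_encoding = Iconv.conv("UTF-8", "ISO-8859-1", string_in_latin1_encoding)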

The order of arguments is:

  1. Target encoding
  2. Source encoding
  3. String to convert
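
The argument order above can be wrapped in a small helper. This is only a minimal sketch: to_utf8 is a hypothetical name, and the String#encode fallback is an assumption added for interpreters where iconv cannot be loaded (String#encode exists from Ruby 1.9 onward).

```ruby
# Load iconv when the interpreter provides it (Ruby 1.8 normally does).
begin
  require 'iconv'
rescue LoadError
  # No iconv available; the String#encode branch below is used instead.
end

# Hypothetical helper: convert raw bytes from source_encoding to UTF-8.
def to_utf8(bytes, source_encoding)
  if defined?(Iconv)
    # Argument order: target encoding, source encoding, string to convert.
    Iconv.conv('UTF-8', source_encoding, bytes)
  else
    # Assumed fallback for Ruby 1.9+: tag the bytes, then transcode them.
    bytes.dup.force_encoding(source_encoding).encode('UTF-8')
  end
end

latin1_bytes = "caf\xE9"                  # "café" encoded as ISO-8859-1
utf8_bytes   = to_utf8(latin1_bytes, 'ISO-8859-1')
p utf8_bytes.unpack('C*')                 # prints [99, 97, 102, 195, 169]
```

The single Latin-1 byte 0xE9 becomes the two UTF-8 bytes 0xC3 0xA9, which is why the output has one more byte than the input.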
rjk
What can I do if I am not sure of the encoding for the initial string? Is there any way of detecting it?
The Warm Jets
In general? No. If the set of possible incoming encodings is limited, you _might_ be able to use some sort of heuristic, but it would not be completely accurate or reliable, and it becomes less reliable as the number of possible encodings increases.
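
One such heuristic, sketched under the assumption that the input can only be UTF-8 or Latin-1 (both helper names are hypothetical): treat the bytes as UTF-8 if they decode cleanly, otherwise fall back to Latin-1. It relies on String#unpack('U*') raising ArgumentError on malformed UTF-8.

```ruby
# Hypothetical helper: true when the bytes decode cleanly as UTF-8.
def looks_like_utf8?(bytes)
  bytes.unpack('U*')   # raises ArgumentError on malformed UTF-8
  true
rescue ArgumentError
  false
end

# Hypothetical helper: guess between exactly two candidate encodings.
# Assumption: anything that is not valid UTF-8 is treated as Latin-1.
def guess_encoding(bytes)
  looks_like_utf8?(bytes) ? 'UTF-8' : 'ISO-8859-1'
end

puts guess_encoding("caf\xC3\xA9")   # UTF-8
puts guess_encoding("caf\xE9")       # ISO-8859-1
```

This is a guess, not a detection: Latin-1 text whose bytes happen to form valid UTF-8 sequences is misreported as UTF-8, and pure ASCII is valid in both encodings.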
rjk
Cheers. I guess because it is input from an SQL field I can assume it is this type of character encoding.
The Warm Jets
That's a good assumption if you control the database (or at least know who does control it). Please mark the answer as the accepted answer if you found it helpful. Thanks.
rjk