I'm currently playing a bit with couchdb.
I'm trying to migrate some blog data from redis (key value store) to couchdb (key value store).
Seeing as I probably migrated this data a gazillion times from and to different blogging engines (everybody has got to have a hobby :) ), there seem to be some encoding snafus.
I'm using CouchREST to access CouchDB from ruby and I'm getting this:
<JSON::GeneratorError: source sequence is illegal/malformed>
the problem seems to be the body_html part of the object:
<Post:0x00000000e9ee18 @body_html="[.....]Wie Sie bereits wissen, m\xF6chte EUserv k\xFCnftig seine [...]
Those are supposed to be Umlauts ("möchte" and "künftig").
Any idea how to get rid of those problems? I tried some conversions using the ruby 1.9 encoding feature or iconv before inserting, but haven't got any luck yet :(
If I try to e.g. convert that stuff to ISO-8859-1 using the .encode() method of ruby 1.9, this is what happens (different text, same problem):
#<Encoding::UndefinedConversionError: "\xC6\x92" from UTF-8 to ISO-8859-1>