I've isolated a problem with Ruby on Rails where a model with a serialized column is not properly loading data that has been saved to it.
What goes in is a Hash, and what comes out is a YAML string that can't be parsed due to formatting issues. I'd expect that a serializer can properly store and retrieve anything you give it, so something appears to have gone wrong.
The troublesome string in question is formatted something like this:
require 'yaml'

message_text = <<END

  X
X
END
yaml = message_text.to_yaml
puts yaml
# =>
# --- |
#
#   X
# X
puts YAML.load(yaml)
# => ArgumentError: syntax error on line 3, col 0: ‘X’
The combination of a leading blank line, an indented second line, and a non-indented third line causes the parser to fail. Omitting either the blank line or the indentation appears to remedy the problem, so this looks like a bug in the serialization process. Since it requires a rather specific set of circumstances, I'm willing to bet this is an edge case that isn't properly handled.
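To narrow this down, here is a minimal sketch of the round-trip check I have in mind. The helper name yaml_round_trips? and the variant labels are my own and purely illustrative. It dumps each variant with to_yaml, loads it back, and reports whether the original string survived. The result depends on which YAML engine your Ruby uses, so I have left the expected output out.

```ruby
require 'yaml'

# Hypothetical helper: true if the string comes back unchanged after
# being dumped to YAML and loaded again, false if it differs or if the
# parser raises. ArgumentError, as seen above, is a StandardError.
def yaml_round_trips?(str)
  YAML.load(str.to_yaml) == str
rescue StandardError
  false
end

variants = {
  "blank line, indented, non-indented" => "\n  X\nX\n",
  "no leading blank line"              => "  X\nX\n",
  "no indentation"                     => "\nX\nX\n"
}

variants.each do |label, text|
  puts "#{label}: #{yaml_round_trips?(text)}"
end
```

If only the first variant reports false, that would match what I am seeing: the failure needs both the leading blank line and the indentation.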
The YAML module that ships with Ruby and is used by Rails appears to delegate a large portion of the processing to Syck, but it does give Syck some hints about how to encode the data it sends.
In yaml/rubytypes.rb there's the String#to_yaml definition:
class String
  def to_yaml( opts = {} )
    YAML::quick_emit( is_complex_yaml? ? self : nil, opts ) do |out|
      if is_binary_data?
        out.scalar( "tag:yaml.org,2002:binary", [self].pack("m"), :literal )
      elsif to_yaml_properties.empty?
        out.scalar( taguri, self, self =~ /^:/ ? :quote2 : to_yaml_style )
      else
        out.map( taguri, to_yaml_style ) do |map|
          map.add( 'str', "#{self}" )
          to_yaml_properties.each do |m|
            map.add( m, instance_variable_get( m ) )
          end
        end
      end
    end
  end
end
There appears to be a check there for strings that start with ':' and could be confused with a Symbol when deserializing; the :quote2 option should tell the emitter to quote the string during encoding. Adjusting this regular expression to catch the conditions described above does not appear to have any effect on the output, so I'm hoping someone more familiar with the YAML implementation can advise.
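For reference, this is roughly the kind of pattern I tried in place of /^:/. It is only a sketch: the constant RISKY_BLOCK and the method risky_for_block_scalar? are names I made up, and the regular expression is just one possible reading of "leading blank line, then an indented line, then a line starting in column 0".

```ruby
# Hypothetical pattern (not part of the YAML library): a newline at the
# very start of the string, then a line indented with spaces or tabs,
# then a line whose first character is not whitespace.
RISKY_BLOCK = /\A\n[ \t]+\S[^\n]*\n\S/

# Hypothetical predicate: true when the string has the shape that
# triggers the parse error described above.
def risky_for_block_scalar?(str)
  !!(str =~ RISKY_BLOCK)
end

puts risky_for_block_scalar?("\n  X\nX\n")  # the failing string
puts risky_for_block_scalar?("  X\nX\n")    # no leading blank line
puts risky_for_block_scalar?("\nX\nX\n")    # no indentation
```

The pattern does match the failing string and not the two working variants, so the detection side seems fine. What I can't work out is why choosing :quote2 on a match makes no difference to what is emitted.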