I get a 'Malformed UTF-8 character' error when I pass some scalar data to XML::Simple or Data::Dumper. The lines where the error occurs contain regular expressions.

Malformed UTF-8 character (fatal) at /usr/share/perl5/XML/Simple.pm line 1690.
Malformed UTF-8 character (fatal) at /usr/lib/perl/5.10/Data/Dumper.pm line 682.

So far I have failed to reproduce the error with a small piece of code.

XML::Simple 2.18
Data::Dumper 2.124
perl v5.10.1
A: 

The core Encode module provides facilities for Handling Malformed Data. I have never used them myself, though.
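As a rough illustration of those facilities, here is a minimal sketch that uses the CHECK argument of `Encode::decode` to find out whether a byte string is valid UTF-8 before it reaches XML::Simple or Data::Dumper. The helper name `decode_utf8_strict` and the sample strings are made up for this example; they are not from the original code.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns the decoded character string,
# or undef if $bytes is not well-formed UTF-8.
sub decode_utf8_strict {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC keeps decode() from modifying $bytes in place.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    my $text  = eval { Encode::decode('UTF-8', $bytes, $check) };
    return $text;    # undef when decode() croaked
}

my $good = decode_utf8_strict("caf\xC3\xA9 au lait");   # valid UTF-8 bytes
my $bad  = decode_utf8_strict("caf\xE9 au lait");       # Latin-1 byte, invalid as UTF-8

print defined $good ? "first string: valid UTF-8\n"  : "first string: malformed\n";
print defined $bad  ? "second string: valid UTF-8\n" : "second string: malformed\n";
```

Running a check like this on the data just before the failing call may help narrow down which scalar is malformed.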

eugene y
> Some encodings ignore CHECK argument. For example, Encode::Unicode ignores CHECK and it always croaks on error.:S
codeholic
A: 

You could try piping your data through Encoding::FixLatin. If the 'binary' bytes you're encountering are actually Latin-1 characters then they'll get converted to valid UTF8. If they really are random binary bytes then they should at least get converted to random (but valid) UTF8 characters :-)

Grant McLean
Perhaps they should, but XML::Simple 2.18 doesn't like them :) Latin-1 characters are not enough unfortunately.
codeholic
+1  A: 

The problem arose because somewhere deep in the application code, Encode::_utf8_on was called on a scalar that wasn't a valid UTF-8 string.
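For anyone hunting a similar bug, the following is a minimal sketch of how this kind of damage can be detected. It assumes the offending data is really Latin-1, which is only an assumption for the example; the sample string and variable names are made up. `Encode::_utf8_on` only flips the internal UTF-8 flag without checking the bytes, whereas `Encode::decode` actually converts them.

```perl
use strict;
use warnings;
use Encode ();

my $bytes = "caf\xE9 au lait";    # Latin-1 bytes, not valid UTF-8

# What the buggy code effectively did: set the flag without validating.
my $flagged = $bytes;
Encode::_utf8_on($flagged);

my $flag_set    = utf8::is_utf8($flagged) ? 1 : 0;   # the flag is now on
my $well_formed = utf8::valid($flagged)   ? 1 : 0;   # but the bytes are not UTF-8

# Safer alternative: decode from the real source encoding
# (assumed here to be ISO-8859-1).
my $text       = Encode::decode('ISO-8859-1', $bytes);
my $decoded_ok = utf8::valid($text) ? 1 : 0;

printf "flag set: %d, well-formed: %d, decoded well-formed: %d\n",
    $flag_set, $well_formed, $decoded_ok;
```

A scalar for which `utf8::is_utf8` is true but `utf8::valid` is false is a good candidate for the source of the 'Malformed UTF-8 character' error.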

codeholic
I know, now it sounds quite stupid. I'm not sure if I should delete this question.
codeholic
Don't delete it, let others learn from your mistake.
Alan Moore
Actually it was not even my mistake, it was just someone else's bug in the code :)
codeholic
You could have had that answer 4 days ago.
daxim
@daxim I could, but I was very busy at work.
codeholic