As noted in the comment on your question, I'm unsure what exactly you're asking.
So I'm assuming you're trying to convert Unicode characters into HTML entities, in which case one of the pre-made modules is the better choice. If that is not working due to encoding problems (which are quite tricky in Perl), then the answer to your question:
Is there not a encoding option like
open FILE, "<", $file or die "Cannot open:$!\n", "UTF-8";
... will probably solve it, and it would probably make your own attempt work as well, but it is still better to use a ready-made one ;-) (By the way, the way you wrote it there, "UTF-8" ends up as an extra argument to die, which made it a little hard to understand what you were asking ;-)
Yes, there is a UTF-8 option, assuming you have a recent perl (>= v5.8):
open(my $fh,'<:encoding(UTF-8)', $file) or die "Error opening $file: $!";
(example adapted from perluniintro)
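To see what the encoding layer changes, here is a minimal sketch. The sample string and the temporary file are made up for illustration; they are not from your script. It writes a short UTF-8 file, then reads it once through :encoding(UTF-8) and once as raw bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: "cafe" with e-acute, a space, and U+263A.
my $text = "caf\x{e9} \x{263A}";

# Write the sample to a temporary file as UTF-8 bytes.
my ($out, $file) = tempfile(UNLINK => 1);
binmode($out, ':encoding(UTF-8)');
print {$out} $text;
close($out) or die "Error closing $file: $!";

# Read it back through a decoding layer: $line holds characters.
open(my $fh, '<:encoding(UTF-8)', $file) or die "Error opening $file: $!";
my $line = <$fh>;
close($fh);

# Read it again with no decoding to get the raw bytes for comparison.
open(my $raw, '<:raw', $file) or die "Error opening $file: $!";
my $bytes = <$raw>;
close($raw);

# Prints "decoded length: 6, raw length: 9"
printf "decoded length: %d, raw length: %d\n", length($line), length($bytes);
```

If length() and regexes in your own code seem to count too many characters, the raw case is most likely what is happening.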
You can also use binmode to change the encoding of an already open filehandle (e.g. STDIN/STDOUT).
binmode(STDOUT, ":encoding(UTF-8)");
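A small sketch of the same idea, using an in-memory filehandle as a stand-in for STDOUT so the resulting bytes can be inspected (the variable names are just for illustration):

```perl
use strict;
use warnings;

# An in-memory handle stands in for STDOUT so the output can be examined.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Error opening in-memory handle: $!";

# The handle is already open; binmode adds the encoding layer afterwards.
binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";

print {$fh} "\x{263A}";    # one character, U+263A
close($fh) or die "Error closing in-memory handle: $!";

# Show the bytes that were actually written, in hex.
my $hex = join(' ', map { sprintf('%02X', ord($_)) } split(//, $buffer));
print "$hex\n";            # prints "E2 98 BA"
```

Without the binmode call, perl would warn about a "Wide character in print" instead of encoding the output cleanly.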
You can also set the default encoding with the open pragma.
But for this I suggest trying binmode or changing your open line first to see if that solves it.
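For completeness, a rough sketch of the open pragma route. The file and its contents are hypothetical; the point is only that the open calls carry no explicit layer and still read and write UTF-8.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Default layers for every open() in this lexical scope; :std also applies
# them to STDIN, STDOUT and STDERR.
use open qw(:std :encoding(UTF-8));

# tempfile() is used here only to obtain a scratch file name.
my (undef, $file) = tempfile(UNLINK => 1);

# No layer is given in the mode, so the pragma's default is used.
open(my $out, '>', $file) or die "Error opening $file: $!";
print {$out} "caf\x{e9}";
close($out) or die "Error closing $file: $!";

open(my $in, '<', $file) or die "Error opening $file: $!";
my $line = <$in>;
close($in);

# Four characters were read, but e-acute takes two bytes on disk.
# Prints "characters read: 4, bytes on disk: 5"
printf "characters read: %d, bytes on disk: %d\n", length($line), (-s $file);
```

The pragma is lexically scoped, so it only affects open calls in the file or block where it appears, not those inside modules you load.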
If you have a perl older than v5.8, things are trickier, but maybe resolvable if you tell us the version.
A couple of other things I noticed by the way:
- Not essential, but it's considered better to use a lexically scoped filehandle (my $fh instead of FILE).
- When you put a newline at the end of the die string, it suppresses the line number information that is normally added to help you find the problem.
- If you put the name of the file that couldn't be opened (or the SQL that failed, or whatever) in the die message it will be easier to debug.
- Don't use sub prototypes in Perl (5), as in sub unicodeConvert($). Don't put the $/@/% etc. in there. A prototype doesn't just check things, it may change the meaning of the call in confusing ways (for example, ($) forces scalar context on the argument). It is only needed to create new "built-in style" operators.
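Two of the points above (the newline in die and the prototype) are easy to check for yourself. This is only a sketch, and the sub names are made up for the demonstration:

```perl
use strict;
use warnings;

# Capture the die messages with eval so the two forms can be compared.
eval { die "Cannot open\n" };
my $with_newline = $@;       # exactly "Cannot open\n", no location added

eval { die "Cannot open" };
my $without_newline = $@;    # "Cannot open at <script> line <N>.\n"

print "with newline:    $with_newline";
print "without newline: $without_newline";

# Hypothetical subs: the same body with and without a ($) prototype.
sub first_arg_proto($) { return $_[0] }
sub first_arg_plain    { return $_[0] }

my @items = ('a', 'b', 'c');

# ($) puts @items in scalar context, so the sub receives the count, 3.
my $proto = first_arg_proto(@items);

# Without a prototype the array is flattened and the sub receives 'a'.
my $plain = first_arg_plain(@items);

print "with prototype: $proto, without: $plain\n";
```

The call sites look identical, yet the two subs see different arguments, which is the kind of surprise the last bullet is warning about.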