I am parsing an Excel file that has Japanese characters in some of its cells.
I am using Spreadsheet::ParseExcel (version 0.15, which I know is older than the current release).
Some of the cells contain characters like these:
<設定B-1コース>
They show up as follows:
print Dumper $oWkc->{_Value};
$VAR1 = "\x{ff1c}\x{8a2d}\x{5b9a}B-\x{ff11}\x{30b3}\x{30fc}\x{30b9}\x{ff1e}";
and
print $oWkc->{Val} . "\n";
[-0
$VAR1 = "\x{ff1c}\x{8a2d}\x{5b9a}B-\x{ff13}\x{30b3}\x{30fc}\x{30b9}\x{ff1e}";
[-0
To print these values in their actual form, I set the STDOUT file handle to ":utf8" and set my terminal to UTF-8 encoding (otherwise I get a "Wide character" warning). I need to pick the cells containing B-1 or B-2, but I am not sure what to set in my script so that these characters are treated the way I see them on STDOUT.
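A minimal sketch of the output setup described above, assuming a UTF-8 terminal. The sample string is copied from the Dumper output rather than read from a spreadsheet, and the variable name $value is only for illustration.

```perl
use strict;
use warnings;

# Encode everything printed to STDOUT as UTF-8. Without an encoding layer,
# Perl warns "Wide character in print" for code points above 0xFF.
binmode(STDOUT, ':encoding(UTF-8)');

# Sample cell value taken from the Dumper output (illustrative only).
my $value = "\x{ff1c}\x{8a2d}\x{5b9a}B-\x{ff11}\x{30b3}\x{30fc}\x{30b9}\x{ff1e}";

print $value, "\n";
```

The layer only changes how the string is written out. Inside the script the value is still a character string containing the fullwidth digit U+FF11, which is why a plain /B-1/ match does not succeed.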
Currently I use regular expressions to convert these wide characters to the corresponding ASCII characters. For example, to match B-1, which is stored as 'B-\x{ff11}', I do:
$oWkc->{_Value} =~ /([AB]-)(\x{ff11}|\x{ff12}|\x{ff13})/;
my $lookup = $1.$2;
$lookup =~ s/\x{ff11}/1/;
$lookup =~ s/\x{ff12}/2/;
$lookup =~ s/\x{ff13}/3/;
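The same workaround can be sketched as a self-contained script. The helper name lookup_key is hypothetical (it is not part of Spreadsheet::ParseExcel), the sample value comes from the Dumper output, and tr/// stands in for the three separate substitutions.

```perl
use strict;
use warnings;

# Hypothetical helper: extract a key such as "B-1" from a cell value
# that stores the digit as a fullwidth character.
sub lookup_key {
    my ($value) = @_;

    # Capture A or B, a hyphen, and one fullwidth digit 1-3 (U+FF11..U+FF13).
    return unless $value =~ /([AB]-)([\x{ff11}-\x{ff13}])/;
    my $key = $1 . $2;

    # Map the fullwidth digits to their ASCII counterparts in one pass.
    $key =~ tr/\x{ff11}-\x{ff13}/1-3/;
    return $key;
}

# Sample cell value taken from the Dumper output (illustrative only).
my $cell = "\x{ff1c}\x{8a2d}\x{5b9a}B-\x{ff11}\x{30b3}\x{30fc}\x{30b9}\x{ff1e}";
my $key  = lookup_key($cell);
print "$key\n";    # prints "B-1"
```

Returning nothing when the pattern does not match avoids reusing stale $1 and $2 from an earlier successful match.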
For reference, the values B-1, A-2, etc. come from another source and currently range over A|B-[1-3].
What is the standard way to deal with these wide characters? I am not able to use encode/decode etc. Can anyone give me some direction? For now I am able to get the work done using the regex.