I have some Perl code that translates new-lines and line-feeds to a normalized form. The input text is Japanese, so that there will be multi-byte characters.
Is it still possible to do this transformation on a byte-by-byte basis (which I think it currently does), or do I have to detect the character set and enable Unicode support? In other words, are the popular encodings (Shift-JIS, EUC-JP, UTF-8, ISO-2022-JP) using bytes as part of their character set that could be mistaken for ASCII control characters?
I need only CR and LF to work.
Update: Added ISO-2022-JP. And that is the one that looks the most troublesome with its funky escape sequences ...