Hello, I have a UTF-encoded file in uppercase and I want to change all words to lowercase.
I have tried:
`$ tr '[:upper:]' '[:lower:]' < input.txt > output.txt`
but that changes only the characters without accents.
Thanks
This is because the default character classes only work on standard ASCII, which does not include any accented characters. If you have a defined set of those characters, the easiest way is to add the mapping from each special uppercase character to its lowercase counterpart manually:
tr 'ÄÖÜ[:upper:]' 'äöü[:lower:]'
If you only have a few accented characters, this is workable.
No, the issue is that `tr` is not Unicode aware.
$ grep -o '[[:upper:]]' <<< JalapeÑo
J
Ñ
$ tr '[:upper:]' '[:lower:]' <<< JalapeÑo
jalapeÑo
The reason to use `[:upper:]`, etc., is to handle characters outside ASCII. Otherwise, you could just use `[A-Z]` and `[a-z]`. That's also why PCRE has a character class called `[:ascii:]`:
$ perl -pe 's/[[:ascii:]]//g' <<< jalapeño
ñ
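To see why a byte-oriented `tr` would miss Ñ, it may help to look at how the character is stored. This is a small sketch, assuming the shell session and the script are UTF-8 encoded. The helper name `byte_count` is hypothetical.

```shell
# Hypothetical helper: count the bytes in its first argument.
byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '   # strip padding some wc versions add
}

byte_count 'N'   # a single byte in ASCII
byte_count 'Ñ'   # two bytes when encoded as UTF-8
```

A tool that translates one byte at a time sees two separate bytes here, neither of which is an uppercase letter on its own, which would explain the unchanged Ñ in the `tr` output above.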
Finally, the simplest way I found is to use awk:
awk '{print tolower($0)}' < input.txt > output.txt