I would like to deal with filenames containing unusual characters, such as the French é.

Everything works fine in the shell:

C:\somedir\>ren -hélice hélice

Now, if I put this line in a .bat file, I get the following result:

C:\somedir\>ren -hÚlice hÚlice

See? The é has been replaced by Ú.

The same is true for command output. If I run dir on some directory in the shell, the output is fine. If I redirect this output to a file, some characters are transformed.

So how can I tell cmd.exe that what appears as an é in my batch file really is an é, and not a Ú or a comma?

Edit: So there is no way, when executing a .bat file, to give a hint about the code page in which it was written?

+5  A: 

You have to save the batch file in the OEM encoding. How to do this depends on your text editor, and the exact code page depends on the system locale. For Western European locales it's usually CP850.
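To see why the é turns into Ú, it can help to look at the bytes involved. The sketch below uses Python rather than batch, purely as an illustration. It assumes the editor saved the file as Windows-1252 and that cmd.exe reads it as CP850; the function names are made up for this example.

```python
# Illustrative sketch only. Assumptions: the batch file was saved as
# Windows-1252 and cmd.exe interprets it with the OEM code page CP850.

def as_seen_by_console(text: str, file_encoding: str = "cp1252",
                       console_encoding: str = "cp850") -> str:
    """Return how `text` looks when bytes written in one code page
    are interpreted in another."""
    return text.encode(file_encoding).decode(console_encoding)


def to_oem_bytes(text: str, oem_encoding: str = "cp850") -> bytes:
    """Encode batch file text in the OEM code page, ready to write to disk."""
    return text.encode(oem_encoding)
```

With these assumptions, `as_seen_by_console("ren -hélice hélice")` gives `ren -hÚlice hÚlice`, which matches the output in the question: é is byte 0xE9 in Windows-1252, and 0xE9 is Ú in CP850. Writing the result of `to_oem_bytes` to the .bat file stores é as byte 0x82 instead, which is what a CP850 console expects.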

Batch files and encoding are really two things that don't particularly like each other. You'll notice that Unicode is also impossible to use there, unfortunately (even though environment variables handle it fine).

Alternatively, you can set the console to use another codepage:

chcp 1252

should do the trick. At least it worked for me here.

When you do output redirection, such as with dir, the same rules apply. The console window's codepage is used. You can use the /u switch to cmd.exe to force Unicode output redirection, which causes the resulting files to be in UTF-16.
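The "comma" mentioned in the question is probably the same mismatch in the other direction: redirected output is written in the OEM code page, and a Windows editor then reads it as Windows-1252. Here is a small Python sketch of both cases, assuming CP850 as the OEM code page and UTF-16 little-endian for `cmd /u` output; the function names are hypothetical.

```python
# Illustrative sketch only. Assumptions: redirected output uses CP850,
# or UTF-16 little-endian when cmd.exe was started with /u.

def decode_redirected(data: bytes, unicode_output: bool = False,
                      oem_encoding: str = "cp850") -> str:
    """Decode bytes captured from a redirected cmd.exe built-in command."""
    if unicode_output:
        return data.decode("utf-16-le")
    return data.decode(oem_encoding)


def misread_as_ansi(data: bytes, ansi_encoding: str = "cp1252") -> str:
    """Show what an editor expecting Windows-1252 displays for OEM bytes."""
    return data.decode(ansi_encoding)
```

In CP850 the é is byte 0x82, and in Windows-1252 that byte is the low quotation mark ‚ (U+201A), which looks like a comma. Decoding with the code page that was active during the redirection gives back the é.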

As for encodings and code pages in cmd.exe in general, also see this question:

EDIT: As for your edit: No, cmd always assumes the batch file to be written in the console default codepage. However, you can easily include a chcp at the start of the batch:

chcp 1252>NUL
ren -hélice hélice

To make this more robust when used directly from the command line, you may want to remember the old code page and restore it afterwards:

@echo off
for /f "tokens=2 delims=:." %%x in ('chcp') do set cp=%%x
chcp 1252>nul
ren -hélice hélice
chcp %cp%>nul
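The `for /f` line is the least obvious part: it splits the output of `chcp` on `:` and `.` and keeps the second token. A rough Python equivalent, for illustration only, is shown here. The function name is made up, and the second sample string in the note is an assumed example of a localized message ending in a period.

```python
import re

# Illustrative sketch only: mimics `for /f "tokens=2 delims=:."` applied
# to the text printed by chcp. Unlike the batch version, it also strips
# the leading space from the token.

def parse_code_page(chcp_output: str) -> str:
    """Split on ':' and '.', drop empty tokens, return the second token."""
    tokens = [t for t in re.split(r"[:.]+", chcp_output.strip()) if t]
    return tokens[1].strip()
```

For `Active code page: 850` this returns `850`, and a message such as `Aktive Codepage: 850.` gives the same result because the trailing period is treated as a delimiter.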
Joey
chcp works in the .bat, even if echoing is wrong. However, if I do chcp 1252 in the console, and then type test.bat, it is still wrong...
shodanex
Of course it is wrong. `type` doesn't know anything about the codepage, so it assumes the one you have currently set. What `chcp` in the batch does is *change* that codepage, hence the differing results. I presented this more as a workaround anyway. The correct solution is to save the batch file in the correct encoding.
Joey
In fact, character "input" and character "output" to the screen are two different things. If I change the font of the console:
chcp 850 followed by type gives me hÛlice
chcp 1252 followed by type gives me hélice
It seems the code page of the default raster font is not changed by chcp.
shodanex
That only barely makes sense to me. But yes, essentially the codepage set with `chcp` determines (a) how built-in commands deal with encodings and (b) what characters can and will be displayed.
Joey