Hello,

On my Linux server I have some files with accented names (e.g. test-éàïù.zip). When I add them to a new ZIP file using the 7zip command-line tool, the charset/encoding information is not saved, and when the archive is opened on a Windows computer the filenames are not displayed correctly. As far as I know, 7zip creates Zip V1.0 archives, not 2.0. Maybe the charset is limited to the MS-DOS charset? How can I specify an encoding, using 7zip or another zip tool, in order to get portable archives?

Thanks :)

+1  A: 

This is a superuser question, BUT...

ZIP uses a default code page of IBM437. UTF-8 can be used instead, but not all zip tools and libraries support it. Some zip tools will write arbitrary code pages, even though the zip spec allows only IBM437 or UTF-8. I think WinRAR is one such tool.
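
As a rough illustration of the UTF-8 option (this is not the DotNetZip tool discussed below, just one possible "other zip tool"), Python's standard zipfile module encodes non-ASCII entry names as UTF-8 and sets bit 11 (0x800) of the general purpose flags, which is how the zip format marks a name as UTF-8. The helper names and the placeholder file content are made up for the example.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: entry name is UTF-8

ACCENTED_NAME = "test-éàïù.txt"


def build_archive(names):
    """Create an in-memory zip with one small placeholder entry per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.writestr(name, b"placeholder content")
    return buf.getvalue()


def utf8_flags(data):
    """Map each entry name to whether its UTF-8 flag is set."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }


archive = build_archive([ACCENTED_NAME, "plain.txt"])
flags = utf8_flags(archive)
for name, is_utf8 in flags.items():
    print(f"{name}: UTF-8 flag {'set' if is_utf8 else 'not set'}")
```

Pure-ASCII names are left unflagged, since they read the same under IBM437 and UTF-8. Whether a given Windows unzip tool honours the flag is a separate question and still needs testing.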

DotNetZip handles encoding. It will do UTF-8 or an arbitrary code page. If you're writing an app, there is a .NET library. If you are running from a script, there are command-line tools. Either way, DotNetZip requires .NET, so you will need Mono to run it on Linux.

Example for the command line:

zipit.exe Olivier.zip -cp 860 test-éàïù.txt

This uses code page 860. I'm not sure that Windows Explorer correctly handles zip files whose entry names use an alternate encoding.
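
To see why the code page matters, here is a small Python sketch of what happens to the name's bytes. It assumes a writer that stores UTF-8 bytes without flagging them and a reader that falls back to IBM437. It uses code page 850 rather than 860 for the "matching" case, and the function names are invented for the example.

```python
# "test-éàïù.txt", written with escapes so the source encoding cannot interfere
NAME = "test-\u00e9\u00e0\u00ef\u00f9.txt"


def stored_bytes(name, codepage):
    """Bytes a zip writer would store for the entry name in this code page."""
    return name.encode(codepage)


def displayed_name(raw, assumed_codepage):
    """Name a zip reader would show if it assumes this code page."""
    return raw.decode(assumed_codepage, errors="replace")


utf8_raw = stored_bytes(NAME, "utf-8")    # each accented letter takes 2 bytes
cp850_raw = stored_bytes(NAME, "cp850")   # each accented letter takes 1 byte

garbled = displayed_name(utf8_raw, "cp437")   # reader guesses wrong
intact = displayed_name(cp850_raw, "cp850")   # reader and writer agree

print(len(utf8_raw), len(cp850_raw))
print(garbled)
print(intact)
```

The name only survives when the writer and the reader agree on the encoding, which is the point of either flagging UTF-8 or picking a code page the target machine is known to use.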

See "How to zip specified folders with Command Line" for more info on the zipit.exe tool.

Cheeso
A: 

Hi, thanks for your replies.

I tried to use DotNetZip under Mono on Linux. It partially works, but not for all files, because Mono seems to have some limitations and bugs in this area.

First, I read in Mono's docs:

This problem arises when you have files in your file system that Mono cannot convert into Unicode. Mono uses the UTF-8 encoding for the filenames stored in your file system by default, because it is the universally accepted standard. Alternatively, you can set the MONO_EXTERNAL_ENCODINGS variable, but this is not recommended. The problem with using MONO_EXTERNAL_ENCODINGS is that even if Mono will be able to parse your filenames, Mono will still store the filenames internally as Unicode. If you try to move, overwrite or do any other manipulation in the file Mono will transform the filename from Unicode to your native encoding and it might fail to find the file.

Indeed, I get some strange behavior on the command line: files can't be added if their names contain spaces. This happens with file names passed as command-line arguments. Folder names with spaces do not trigger any error, and neither do files inside those folders.

root@portable:~/Bureau# ./ZipIt.exe archive.zip -cp 850 "AB CD.txt"
That zip file (archive.zip) already exists.
adding selection 'AB CD.txt' from dir '.'...
Exception: System.ArgumentException: 'AB CD.txt'
  at Ionic.FileSelector._ParseCriterion (System.String s) [0x00000]
  at Ionic.FileSelector..ctor (System.String selectionCriteria) [0x00000]
  at Ionic.Zip.ZipFile._AddOrUpdateSelectedFiles (System.String selectionCriteria, System.String directoryOnDisk, System.String directoryPathInArchive, Boolean recurseDirectories,Boolean wantUpdate) [0x00000]
  at Ionic.Zip.ZipFile.UpdateSelectedFiles (System.String selectionCriteria, System.String directoryOnDisk, System.String directoryPathInArchive, Boolean recurseDirectories) [0x00000]
  at Ionic.Zip.Examples.ZipIt.Main (System.String[] args) [0x00000]
Olivier
About the problem with filespecs that contain spaces: the zipit.exe tool requires a more verbose syntax for those. The file selector is actually pretty powerful. It lets you select files based on timestamps, attributes, names, and sizes. See http://stackoverflow.com/questions/1412051/how-to-zip-specified-folders-with-command-line/1424614#1424614 . So when you have a space in the name, specify "name = 'AB CD.txt'" instead of the bare file name.
Cheeso