I know that / is illegal in Linux, and the following are illegal in Windows (I think): * . " / \ [ ] : ; | = ,

What else am I missing?

I need a comprehensive guide, however, and one that takes into account double-byte characters. Linking to outside resources is fine with me.

I need to first create a directory on the filesystem using a name that may contain forbidden characters, so I plan to replace those characters with underscores. I then need to write this directory and its contents to a zip file (using Java), so any additional advice concerning the names of zip directories would be appreciated.

A: 

The only illegal characters in modern filesystems are the directory separator (/ or \) and the NUL byte (\0).

Ignacio Vazquez-Abrams
And colon (:) in Windows.
James Keesey
NTFS (or at least the Windows shell) forbids several other characters.
Mike
?, *, etc. aren't allowed under Windows.
Leonardo Herrera
+5  A: 

Windows has several restrictions on file names; not only are characters like *, ", ? and others forbidden, there are several reserved names like PRN and CON, and several length restrictions. The full list is on MSDN.
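
For the replace-with-underscores approach from the question, here is a minimal Java sketch. It assumes the forbidden set and reserved device names as I read them in the Microsoft naming documentation (`< > : " / \ | ? *`, control characters below 0x20, and names such as CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9). The class name `WindowsNameSanitizer` and its `sanitize` method are hypothetical, and the sketch does not cover length limits or filesystem-specific rules.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not an official API: a sketch of Windows-oriented name cleanup.
final class WindowsNameSanitizer {
    // Assumption: reserved device names, which stay reserved even with an extension.
    private static final Set<String> RESERVED = Set.of(
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9");

    // Assumption: characters rejected in a single Windows name component.
    private static final String FORBIDDEN = "<>:\"/\\|?*";

    static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Control characters and forbidden punctuation become underscores;
            // everything else, including non-ASCII text, passes through unchanged.
            boolean bad = c < 0x20 || FORBIDDEN.indexOf(c) >= 0;
            sb.append(bad ? '_' : c);
        }
        // Strip trailing dots and spaces, which the Windows shell does not keep.
        int end = sb.length();
        while (end > 0 && (sb.charAt(end - 1) == '.' || sb.charAt(end - 1) == ' ')) {
            end--;
        }
        sb.setLength(end);
        if (sb.length() == 0) {
            return "_"; // nothing usable was left
        }
        String result = sb.toString();
        // Compare the part before the first dot against the reserved names.
        int dot = result.indexOf('.');
        String base = dot < 0 ? result : result.substring(0, dot);
        if (RESERVED.contains(base.toUpperCase(Locale.ROOT))) {
            result = "_" + result;
        }
        return result;
    }
}
```

With this sketch, `WindowsNameSanitizer.sanitize("report: draft?")` gives `report_ draft_`, and `sanitize("CON")` gives `_CON`. Two different inputs can map to the same output, so check for collisions before creating the directory.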

Dour High Arch
Excellent point. If only I remembered what `COPY CON` meant...
Adriano Varoli Piazza
The key phrase from the MSDN link is "[and a]ny other character that the target file system does not allow". There may be different filesystems on Windows. Some might allow Unicode, others might not. In general, the only safe way to validate a name is to try it on the target device.
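
A rough Java sketch of "try it on the target device", assuming `java.nio.file` is available (Java 7+). The class `NameProbe` and its method `canCreateDirectory` are hypothetical names. The probe creates the directory and deletes it again, so it also returns false when the name already exists or the parent is not writable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical helper: asks the real filesystem whether a name is acceptable.
final class NameProbe {
    static boolean canCreateDirectory(Path parent, String name) {
        Path candidate;
        try {
            candidate = parent.resolve(name);
        } catch (InvalidPathException e) {
            return false; // rejected by the platform's path parser
        }
        // The name must be a single component directly under parent,
        // so "a/b", "" and absolute paths are rejected here.
        if (!parent.equals(candidate.getParent())) {
            return false;
        }
        try {
            Files.createDirectory(candidate); // the actual test on the target filesystem
            Files.delete(candidate);          // clean up the probe directory
            return true;
        } catch (IOException e) {
            return false; // invalid name, existing entry, or no permission
        }
    }
}
```

There is a gap between the probe and the later real creation, so the real `createDirectory` call can still fail and needs its own error handling.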
Adrian McCarthy
+2  A: 

There's a nice table covering the file naming rules for various operating systems here (part of the Wikipedia article "Filename").

I. J. Kennedy
Iamamac
A: 

Here you have ALL file/folder naming rules for Windows: http://www.portfoliofaq.com/pfaq/FAQ00352.htm

and here you have ALL of them for Linux: http://www.cyberciti.biz/faq/linuxunix-rules-for-naming-file-and-directory-names/

pit777
These rules are for the Portfolio application and are more restrictive than the actual filesystem rules. For example, all filesystems allow multiple periods in names, so long as they are not the first or last characters, and allow more than 3 characters for filename extensions.
Dour High Arch
+1  A: 

Well, if only for research purposes, then your best bet is to look at this Wikipedia entry on Filenames.

If you want to write a portable function to validate user input and create filenames based on that, the short answer is: don't. Take a look at a portable module like Perl's File::Spec to get a glimpse of all the hoops you have to jump through to accomplish such a "simple" task.

Leonardo Herrera
A: 

Under Linux and other Unix-related systems, there are only two characters that cannot appear in a directory name, and those are NUL '\0' and slash '/'. The '/', of course, can appear in a path name.
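
As a small Java sketch of that rule (the class `UnixNameCheck` and its methods are hypothetical names): the character test covers only NUL and '/', and the extra rejection of the empty name, "." and ".." is my own addition, since those are not usable as new directory names either.

```java
// Hypothetical helper: checks one path component against the Unix rule above.
final class UnixNameCheck {
    static boolean isValidComponent(String name) {
        // Assumption beyond the rule itself: empty, "." and ".." cannot name a new entry.
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return false;
        }
        // The only two forbidden characters: slash and NUL.
        return name.indexOf('/') < 0 && name.indexOf('\0') < 0;
    }

    // Replaces the two forbidden characters with underscores.
    static String sanitize(String name) {
        return name.replace('/', '_').replace('\0', '_');
    }
}
```

Here `isValidComponent("a*b?c:")` is true, because none of those characters are special to the filesystem itself, only to shells and to other operating systems.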

Rumour[1] has it that Stephen Bourne (of 'shell' fame) had a directory containing 254 files, one for each single character (character code) that can appear in a file name, i.e. the 256 byte values minus NUL and '/'. It was used to test the Bourne shell, and routinely wrought havoc on unwary programs such as backup programs.

Other people have covered the Windows rules.

Note that Mac OS X's default file system (HFS+) is case-insensitive, although it preserves case.


[1] I think it was Kernighan & Pike in 'The Practice of Programming' who said as much; it may have been in a related book (possibly even 'The UNIX Programming Environment').

Jonathan Leffler