Consider a Save As dialog with a text field where the user types a file name, then clicks a Save button. The software then validates the file name and saves the file if the name is valid.

On a Unix file system, what rules should be applied in the validation such that:

  • The name will not be difficult to manipulate later in terms of escaping special characters, etc.
  • The rules are not so restrictive that saving a file becomes non-user-friendly.

So basically, what is the minimum set of characters that should be restricted from a Unix file name?

A: 
Gavin Miller
Then what characters should be whitelisted?
Pim Jager
The characters given so far sound good. A hyphen and a period would be good additions as well.
workmad3
@workmad3 - good suggestions; I made the answer wiki, feel free to add.
Gavin Miller
Newlines are a nuisance. Commas are pretty harmless. Colons do no damage on Unix, but they are problematic if the name is copied to Windows, or if the 'file' is a directory that might need to be added to PATH.
Jonathan Leffler
There is some room to argue that any characters classified as 'isalpha()' in the current locale are OK - that allows people to use accented characters in the names. It complicates the story, though.
Jonathan Leffler
I for one will regard anything that prohibits accented characters as user-unfriendly.
hop
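A minimal sketch of the whitelist idea discussed in these comments, in Python. The function name `is_conservative_name` and the exact character set (letters, digits, underscore, hyphen, period) are assumptions for illustration; the thread does not settle on a definitive list. Note that `\w` here is Unicode-aware rather than locale-based like `isalpha()`, so accented letters pass.

```python
import re

# Hypothetical whitelist: Unicode letters and digits, underscore,
# hyphen and period. In a str pattern, \w matches Unicode word
# characters, so accented letters such as "é" are accepted.
_ALLOWED = re.compile(r"[\w.-]+")


def is_conservative_name(name: str) -> bool:
    """Return True if every character of name is on the whitelist."""
    if name in ("", ".", ".."):
        # Empty names and the two special directory entries are never usable.
        return False
    return _ALLOWED.fullmatch(name) is not None
```

With this set, `report-2009.txt` and `résumé.txt` are accepted, while names containing spaces, colons, newlines or slashes are rejected. Whether spaces belong on the list is a policy choice the thread leaves open.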
+4  A: 

The minimum is slash ('/') and NUL ('\0').

mouviciel
The minimum is /, ; and | to avoid the user running arbitrary commands (assuming it's not escaped :))
workmad3
This. No characters besides '/' should be disallowed.
Andrew Medico
And ASCII NUL '\0' since that marks the end of the file name :D
Jonathan Leffler
This is the rigorous answer. The application should be coded to assume that the user was this unconstrained (so when opening files, it should accept any name). It isn't such a good answer for saving (new) files; it is reasonable to put some limits on the file names.
Jonathan Leffler
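The hard limits named in this answer can be sketched in a few lines of Python. The function name `is_legal_unix_name` is hypothetical, and the rejection of the empty string, `.` and `..` is an addition not mentioned in the thread (those names cannot be used for a new file). This checks only what the file system itself refuses, not what is pleasant to work with.

```python
def is_legal_unix_name(name: str) -> bool:
    """Return True if name is usable as a single path component."""
    if name in ("", ".", ".."):
        # Addition beyond the thread: these can never name a new file.
        return False
    # The two characters a Unix file name cannot contain:
    # '/' separates path components, NUL terminates the C string.
    return "/" not in name and "\0" not in name
```

Everything else, including `;`, `|`, quotes and newlines, passes this check, which is why the comments above treat it as a floor rather than a recommendation.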
+2  A: 

Let the user enter whatever name he wants. Artificially restricting the range of characters will only annoy the users and serve no real purpose.

Bombe
sounds good... I'll enter a file called 'blah;rm -rf /' ;)
workmad3
+1 for the comment!
Gavin Miller
Or, better: '$(rm -fr $HOME)' (minus the single quotes) as the file name? That will wreak havoc sooner rather than later. Backticks and $(...) are particularly pernicious as they 'work' when the file name is quoted, unlike most of the other special characters. Embedded quotes are tricky, too.
Jonathan Leffler
Those are all non-issues when saving the filename. fopen() doesn’t care about your filenames. When using a graphical shell (e.g. konqueror) it doesn’t care about your filenames. When you use auto-completion in the shell it doesn’t care about your filenames. So what are your points? :)
Bombe
@Bombe, what one user might want in many cases will alienate other users, regardless of the havoc it plays with your UI development process. Bad idea.
le dorfier
That’s my point: choosing strange names will not wreak havoc with anything—unless your “anything” is badly written. None of the standard tools of UNIX is badly written. Again: what’s your point?
Bombe
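Both sides of this exchange can be illustrated with a small Python sketch. It assumes a POSIX `/bin/sh` with `printf` available; the helper names `echo_without_shell` and `echo_through_shell` are made up for the example. The first helper hands the name to a child process as one argument, so no shell ever parses it. The second builds a shell command line and therefore has to quote the name first.

```python
import shlex
import subprocess
import sys


def echo_without_shell(name: str) -> str:
    """Pass name as a single argv element; no shell is involved."""
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.argv[1])", name]
    done = subprocess.run(child, capture_output=True, text=True, check=True)
    return done.stdout


def echo_through_shell(name: str) -> str:
    """Build a shell command line, quoting name with shlex.quote first."""
    command = "printf %s " + shlex.quote(name)
    done = subprocess.run(command, shell=True, capture_output=True,
                          text=True, check=True)
    return done.stdout


hostile = "$(echo injected); echo 'also injected'"
print(echo_without_shell(hostile) == hostile)
print(echo_through_shell(hostile) == hostile)
```

In both cases the name comes back unchanged. The trouble described in the comments appears only when such a name is pasted into a command line, or into a script, without quoting.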
+2  A: 

Do not forget the dot (.) so that you can hide files and folders. Otherwise, I'd follow the UN*X naming convention (from Wikipedia):

Most UNIX file systems

  • Case handling: case-sensitive, case-preserving
  • Allowed character set: any
  • Reserved characters: / and NUL
  • Max length: 255
  • Notes: A leading . indicates that ls and file managers will not by default show the file

Link to wikipedia article about file names

Tobias Wärre
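A short Python sketch of the length and leading-dot rules from the list above. It assumes the 255 limit is counted in bytes and that names are stored as UTF-8; both are assumptions, since the list only says "255". The names `NAME_MAX`, `fits_name_limit` and `is_hidden` are hypothetical.

```python
NAME_MAX = 255  # assumed to be bytes; the real limit depends on the file system


def fits_name_limit(name: str, limit: int = NAME_MAX) -> bool:
    """Return True if name, encoded as UTF-8, fits within limit bytes."""
    # Accented characters take more than one byte in UTF-8, so the
    # character count alone can understate the stored length.
    return len(name.encode("utf-8")) <= limit


def is_hidden(name: str) -> bool:
    """Return True if ls and file managers would hide name by default."""
    return name.startswith(".")
```

Under these assumptions a name of 128 `é` characters is already too long, because each one occupies two bytes. On a real system the limit for a given directory can be queried with `os.pathconf(directory, "PC_NAME_MAX")` rather than hard-coded.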
A: 

Often forgotten: the colon (:) is not a good idea, since it's commonly used in stuff like $PATH, i.e. the list of directories where executables are found "automatically". This can cause confusion with DOS/Windows directory names, where of course the colon is used in drive names.

unwind
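The `$PATH` problem is easy to reproduce. In this Python sketch the helper names and the directory `/opt/my:tools/bin` are made up for illustration; the point is only that a colon-separated list has no way to escape a colon inside an entry.

```python
def join_path_list(directories: list) -> str:
    """Join directories the way $PATH does, with ':' as the separator."""
    return ":".join(directories)


def split_path_list(value: str) -> list:
    """Split a $PATH-style string back into its entries."""
    return value.split(":")


original = ["/usr/bin", "/opt/my:tools/bin"]  # hypothetical directories
round_trip = split_path_list(join_path_list(original))
print(round_trip)  # ['/usr/bin', '/opt/my', 'tools/bin']
```

The two directories go in and three entries come out, none of which is the directory that contained the colon.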
A: 

Please don't use spaces! They work, but can be a big pain to work with on the command line.

Judge Maygarden
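The command-line pain can be shown with Python's `shlex` module, which splits a string into words roughly the way a POSIX shell does. The file name `my notes.txt` is just an example.

```python
import shlex

name = "my notes.txt"

# Unquoted, the space splits the name into two separate arguments.
unquoted = shlex.split("cat " + name)
print(unquoted)  # ['cat', 'my', 'notes.txt']

# Quoted, the name survives as a single argument.
quoted = shlex.split("cat " + shlex.quote(name))
print(quoted)  # ['cat', 'my notes.txt']
```

Every command that touches such a file needs the quoting step, which is the nuisance this answer is warning about.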