I'm making a cross-platform application that renames files based on data retrieved online. I'd like to sanitize the Strings I took from a web API for the current platform.

I know that different platforms have different file-name requirements, so I was wondering if there's a cross-platform way to do this?

Edit: On Windows platforms you cannot have a question mark '?' in a file name, whereas in Linux, you can. The file names may contain such characters and I would like for the platforms that support those characters to keep them, but otherwise, strip them out.

Also, I would prefer a standard Java solution that doesn't require third-party libraries.

+3  A: 

Take a look at Apache Commons FilenameUtils, which has a lot of functions relating to file names.

For example, it will 'normalise' file names, convert separators to the system separator, split a file name into its components, etc.

Brian Agnew
There's no built-in way to do this using only the standard Java libraries? I'd prefer not to need third-party libraries.
Ben S
AFAIK, there isn't a way to do this using the standard Java libraries. However, Apache Commons (or any of the Apache libraries) is the next best thing.
Thomas Owens
There are *lots* of things the standard libraries don't do that Apache Commons makes very, very easy. I now regard them (Commons Lang and Commons IO) as an essential component to add to any project.
Brian Agnew
A: 

It is not clear from your question, but since you are planning to accept pathnames from a web form (?) you probably ought to block attempts to rename certain things; e.g. "C:\Program Files". This implies that you need to canonicalize the pathnames to eliminate "." and ".." before you make your access checks.
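As a rough illustration of canonicalizing before the access check, here is a minimal sketch using only the standard library. The class name SandboxCheck, the method isInsideSandbox, and the idea of resolving every name against a single sandbox directory are my own assumptions for the example, not something from the question.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper: names and the single-root sandbox policy are assumptions.
final class SandboxCheck {

    /**
     * Returns true if candidate, resolved against sandboxRoot and then
     * canonicalized (so "." and ".." are eliminated), lies strictly
     * inside sandboxRoot.
     */
    static boolean isInsideSandbox(File sandboxRoot, String candidate) throws IOException {
        File root = sandboxRoot.getCanonicalFile();
        File target = new File(root, candidate).getCanonicalFile();
        // Path.startsWith compares whole name elements, so a sibling such as
        // "rename-sandbox-evil" does not count as being inside "rename-sandbox".
        return target.toPath().startsWith(root.toPath()) && !target.equals(root);
    }

    public static void main(String[] args) throws IOException {
        File root = new File(System.getProperty("java.io.tmpdir"), "rename-sandbox");
        System.out.println(isInsideSandbox(root, "show/episode.avi"));
        System.out.println(isInsideSandbox(root, "../outside.txt"));
    }
}
```

Note that getCanonicalFile() can itself throw an IOException if the platform cannot make sense of the name, so the caller should be prepared to treat that as a rejection.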

Given that, I wouldn't attempt to remove illegal characters. Instead, I'd use "new File(str).getCanonicalFile()" to produce the canonical paths, then check that they satisfy your sandboxing restrictions, and finally use "File.exists()", "File.isFile()", etc. to check that the source and destination are kosher and are not the same file system object. I'd deal with illegal characters by attempting the operations and catching the exceptions.
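The "attempt it and catch the exception" part could look something like the sketch below. It assumes Java 7 or later (java.nio.file); RenameAttempt and tryRename are hypothetical names, and returning a plain boolean is just a simplification for the example. The point is that the platform decides what is illegal: on Windows a '?' in the name should be rejected when the path is built, while on Linux the same name is accepted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical helper: lets the current platform decide whether a name is legal.
final class RenameAttempt {

    /**
     * Tries to rename source to newName within the same directory.
     * Returns false if the platform rejects the name, the target
     * already exists, or the move fails for any other I/O reason.
     */
    static boolean tryRename(Path source, String newName) {
        try {
            // Throws InvalidPathException if newName is not a valid path here.
            Path target = source.resolveSibling(newName);
            if (Files.exists(target)) {
                return false; // do not overwrite an existing file
            }
            Files.move(source, target);
            return true;
        } catch (InvalidPathException | IOException e) {
            return false;
        }
    }
}
```

A caller that wants to keep going after a rejection could fall back to a stripped-down version of the name, but which characters to strip is then a policy decision the standard library does not make for you.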

Stephen C