For lack of anything clever, here's what I came up with -- the meat of it, anyway.
Strictly speaking, it's probably the combination of OS and filesystem that determines the invalid character set, but for my purposes a hack based simply on the OS seems to be good enough.
Also, these invalid character sets are empirical, not official. The Windows invalid characters are taken from the error message you get in XP when you try to rename a file on an NTFS volume to something invalid. For Unix/Linux, I think you can get away with pretty much anything except a path separator (please correct me if you know better). For MacOS, whether : or / is the path separator seems to depend on the filesystem -- for my purposes it's safest just to include both. (And hope they're not mounting FAT or NTFS.)
    List<Integer> invalidIndices = new LinkedList<Integer>();

    // Pick the set of characters the current OS rejects in filenames.
    String invalidChars;
    if (OS.isWindows()) {
        invalidChars = "\\/:*?\"<>|";
    } else if (OS.isMacOSX()) {
        invalidChars = "/:";
    } else { // assume Unix/Linux
        invalidChars = "/";
    }

    // Record the index of every OS-invalid or control character.
    char[] chars = filename.toCharArray();
    for (int i = 0; i < chars.length; i++) {
        if ((invalidChars.indexOf(chars[i]) >= 0) // OS-invalid
            || (chars[i] < '\u0020') // ctrls
            || (chars[i] > '\u007e' && chars[i] < '\u00a0') // ctrls
        ) {
            invalidIndices.add(i);
        }
    }
    return invalidIndices;
Note: this is using the SwingX OS utility class to determine the operating system, but if you don't have that, it doesn't do anything magical either -- just parses System.getProperty("os.name").
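If you'd rather not pull in SwingX, something like the following should get you most of the way there. It's only a sketch: the class name FilenameCheck and the helpers isWindows, isMacOSX, invalidCharsFor and invalidIndices are my own stand-ins (not the SwingX API), and the prefix checks assume os.name starts with "Windows" or "Mac OS X" on those platforms. The OS name is passed in as a parameter so the check can be tried against names other than the machine it runs on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FilenameCheck {

    // Hypothetical stand-ins for the SwingX OS checks; they only look at os.name.
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    static boolean isMacOSX(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("mac os x");
    }

    // Same empirical character sets as in the snippet above.
    static String invalidCharsFor(String osName) {
        if (isWindows(osName)) {
            return "\\/:*?\"<>|";
        } else if (isMacOSX(osName)) {
            return "/:";
        }
        return "/"; // assume Unix/Linux
    }

    // Returns the index of every OS-invalid or control character in filename.
    static List<Integer> invalidIndices(String filename, String osName) {
        String invalidChars = invalidCharsFor(osName);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < filename.length(); i++) {
            char c = filename.charAt(i);
            if (invalidChars.indexOf(c) >= 0          // OS-invalid
                    || c < '\u0020'                   // ctrls
                    || (c > '\u007e' && c < '\u00a0') // ctrls
            ) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The result depends on the OS this runs on.
        String osName = System.getProperty("os.name");
        System.out.println(invalidIndices("a:b/c?.txt", osName));
    }
}
```

For example, invalidIndices("a:b/c?.txt", "Windows XP") flags indices 1, 3 and 5, while the same filename with "Linux" flags only index 3 (the slash).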