Does anyone know of any Java libraries I could use to generate canonical paths (basically, remove back-references)?

I need something that will do the following:

Raw Path -> Canonical Path

/../foo/       -> /foo
/foo/          -> /foo
/../../../     -> /
/./foo/./      -> /foo
//foo//bar     -> /foo/bar
//foo/../bar   -> /bar

etc...

At the moment I lazily rely on using:

 new File("/", path).getCanonicalPath();

But this resolves the path against the actual file system, and is synchronised.

   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.io.ExpiringCache.get(ExpiringCache.java:55)
        - waiting to lock <0x93a0d180> (a java.io.ExpiringCache)
        at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:137)
        at java.io.File.getCanonicalPath(File.java:559)

The paths that I am canonicalising do not exist on my file system, so I only need the path-rewriting logic of the method, which should not require any synchronisation. I'm hoping for a well-tested library rather than having to write my own.

+2  A: 

I think you can use the java.net.URI class to do this. For example, if the path contains no characters that need escaping in a URI path component, you can do this:

String normalized = new URI(path).normalize().getPath();

If the path contains (or might contain) characters that need escaping, the multi-argument constructors will escape the path argument, and you can provide null for the other arguments.

Note that URI normalization does not involve looking at the file system as File canonicalization does. But the flip side is that normalization behaves differently to canonicalization when there are symbolic links in the path.
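Here is a minimal sketch of the multi-argument form, assuming the four-argument `URI(scheme, host, path, fragment)` constructor with everything except the path set to null. The names `UriNormalize` and `normalizePath` are made up for illustration. Two caveats: `normalize()` removes a `..` segment only when it is preceded by a non-`..` segment, so a leading `/../` survives, and a URI treats the text after a leading `//` as an authority rather than as part of the path, so the sketch collapses leading slashes first (an assumption about what you want).

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriNormalize {
    // Hypothetical helper, not part of any library.
    static String normalizePath(String path) {
        // Collapse leading slashes so "//foo" is not parsed as an authority.
        String rooted = path.replaceFirst("^/+", "/");
        try {
            // Four-argument constructor: scheme, host, path, fragment.
            // It quotes characters that are illegal in a path, such as spaces.
            URI uri = new URI(null, null, rooted, null);
            // getPath() returns the decoded path of the normalized URI.
            return uri.normalize().getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid path: " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(normalizePath("/foo/./bar/../baz"));
        System.out.println(normalizePath("/my docs/./a"));
        System.out.println(normalizePath("/../foo"));
    }
}
```

No file system access is involved, so nothing here blocks on a shared cache.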

Stephen C
Looks good. It still requires a little tweaking (to remove leading /../'s), but it gets me most of the way there, thanks.
Joel
@Joel: Why do you want to remove leading `/../`? Either they are wrong and you should treat them as an error condition or you specify all paths to be relative to some point and you should support them. But silently removing them sounds like a bad idea.
Joachim Sauer
You are probably right, but I get all kinds of crappy data in, and I'm just cleaning it up to ensure that all paths are rooted at /.
Joel
+1  A: 

You could try an algorithm like this:

String collapsePath(String path) {
    /* Split into directory parts; a leading "/" and any "//" yield empty parts */
    String[] directories = path.split("/");
    String[] newDirectories = new String[directories.length];
    int j = 0;

    for (String directory : directories) {
        /* Completely ignore empty parts and single dots */
        if (directory.isEmpty() || directory.equals("."))
            continue;
        /* A double dot drops the previous directory; at the root it is ignored */
        if (directory.equals("..")) {
            if (j > 0)
                j--;
        } else {
            newDirectories[j++] = directory;
        }
    }

    /* Ah, what I would give for String.join() */
    StringBuilder newPath = new StringBuilder();
    for (int i = 0; i < j; i++)
        newPath.append('/').append(newDirectories[i]);
    /* Nothing left means the path collapsed to the root */
    return j == 0 ? "/" : newPath.toString();
}

It isn't perfect: it runs in time linear in the number of directories, but it does make a copy of the path parts in memory.
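On Java 8 or later, where `String.join` does exist, the same idea can be written with a deque as the stack of kept directories. This is only a sketch: `PathCollapser` and `collapse` are hypothetical names, and it assumes every path should end up rooted at `/`, with `..` at the root silently ignored, as the question's table asks.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PathCollapser {
    // Hypothetical helper: purely lexical, never touches the file system.
    static String collapse(String path) {
        Deque<String> stack = new ArrayDeque<>();
        for (String part : path.split("/")) {
            if (part.isEmpty() || part.equals(".")) {
                continue;           // "//" and "." contribute nothing
            }
            if (part.equals("..")) {
                stack.pollLast();   // drop the parent; does nothing at the root
            } else {
                stack.addLast(part);
            }
        }
        // Joining an empty stack gives "", so the root comes out as "/".
        return "/" + String.join("/", stack);
    }

    public static void main(String[] args) {
        String[] samples = {
            "/../foo/", "/foo/", "/../../../", "/./foo/./", "//foo//bar", "//foo/../bar"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + collapse(sample));
        }
    }
}
```

For the six raw paths in the question, this produces the canonical paths listed there. It has no shared state, so it can be called from any number of threads without locking.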

haldean