What are the best tools or programming techniques for following a complicated nesting of symlinks and completely capturing and reporting on every symlink along the way, including those in the middle of a path? (See below for more info.)

Here's a specific example. Consider the following output from a shell command

 ls -l /Library/Java/Home
 lrwxr-xr-x  1 root  admin  48 Feb 24 12:58 /Library/Java/Home -> /System/Library/Frameworks/JavaVM.framework/Home

The ls command lets you know that /Library/Java/Home is a symlink to another location. However, it doesn't let you know that the thing it's pointing to is also a symlink:

ls -l /System/Library/Frameworks/JavaVM.framework/Home
lrwxr-xr-x  1 root  wheel  24 Feb 24 12:58 /System/Library/Frameworks/JavaVM.framework/Home -> Versions/CurrentJDK/Home

This, in turn, doesn't let you know that part of the path of the pointed-to file is a symlink:

ls -l /System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK
lrwxr-xr-x  1 root  wheel  3 Feb 24 12:58 /System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK -> 1.5

Which, just to complete this tale, is another symlink

ls -l /System/Library/Frameworks/JavaVM.framework/Versions/1.5
lrwxr-xr-x  1 root  wheel  5 Feb 24 12:58 /System/Library/Frameworks/JavaVM.framework/Versions/1.5 -> 1.5.0

Finally pointing at a "real" folder.

Are there any tools that can visualize the full chain of links for you in some way (either graphically or plain old text)? I'm sure one could script this themselves (and if you want to, please do it and share!), but it seems like the kind of thing that would be fraught with "oh, crap, edge case. Oh, crap, ANOTHER edge case". I'm hoping someone's already gone to the bother.

I do freelance/contract work, and everyone uses symlinks slightly differently to install their PHP applications on a web server. Half my job is usually un-nesting this (inevitably) undocumented hierarchy so we know where to put our new code/modules.

A: 

If you just need to know the ultimate referent, PHP has the realpath function, which can do that.

Paul Dixon
Good info, and thank you, but not quite what I'm after. I want something that's going to let me reverse engineer the entire hierarchy.
Alan Storm
+1  A: 

Tcl has a command [file type $filename] that will return "link" if it's a link. It has another command [file link $filename] that will return what the link points to. With those two commands it's possible to take a link and follow the links until you get to an actual file.

Perhaps something like this off the top of my head:

#!/usr/bin/tclsh

proc dereferenceLink {path {tree {}}} {
    if {[file type $path] == "link"} {
        set pointsTo [file link $path]
        if {[lsearch -exact $tree $path] >= 0} {
            lappend tree $path
            return "[join $tree ->] (circular reference)"
        } else {
            lappend tree $path
            return [dereferenceLink $pointsTo $tree]
        }
    } else {
        lappend tree $path
        return [join $tree "->"]
    }
}

puts [dereferenceLink [lindex $argv 0]]

You'll get output that looks like:

foo->bar->baz

If there's a circular link it will look like:

foo->bar->baz->foo (circular reference)

Bryan Oakley
Useful, but still sounds like work :)
Alan Storm
That depends on your definition of "work" :-)
Bryan Oakley
Work meaning "writing the script myself to capture all the edge cases, handle circular logic, capture symlinks that are not the endpoint of a path, and handle other edge cases I'm not considering"
Alan Storm
I think the posted code does more-or-less what you want. But I get your point -- you're asking if there's a standard utility that already exists. However, rolling your own solution is not hard and might make a good learning experience.
Bryan Oakley
Appreciate the advice, but time's at a premium these days and I'd rather spend it on the unsolved problems than the solved problems. FYI, your code wouldn't catch symlinks in the middle of paths (such as CurrentJDK from above), and it craps out when the symlink is a non-absolute path.
Alan Storm
The problems you mention are trivially solved of course (in Tcl and most other modern scripting languages), but I see you want to hold out for something better so I won't pursue it.
Bryan Oakley
How can most modern scripting languages solve the problem of scope creep and unforeseen edge cases? ;)
Alan Storm
+1  A: 

This Python script would do it, if you added a single print inside the loop:

http://mail.python.org/pipermail/python-ideas/2007-December/001254.html
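
For readers who cannot open the link, here is a minimal sketch of that general approach. It is not the linked script. The function name follow_links and the max_hops limit are made up for illustration. It only follows the final component of each path, so it does not notice symlinks in the middle of a path.

```python
import os

def follow_links(path, max_hops=40):
    """Return the chain of paths reached by repeatedly reading the final link.

    Hypothetical helper: only the last path component is inspected, so a
    symlinked directory in the middle of a path goes unreported.
    """
    chain = [path]
    seen = {path}
    while os.path.islink(path) and len(chain) <= max_hops:
        target = os.readlink(path)
        print("%s -> %s" % (path, target))  # the print inside the loop
        # A relative target is interpreted relative to the link's directory.
        # normpath collapses ".." lexically, which is only an approximation
        # when the parent directory is itself a symlink.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        chain.append(path)
        if path in seen:  # circular reference: stop here
            break
        seen.add(path)
    return chain
```

Calling follow_links("/Library/Java/Home") would print one line per hop and return the list of paths visited.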

Graham Lee
Not quite. That script would only give you /Library/Java/Home and /Library/Java/Versions/CurrentJDK/Home. It wouldn't give the information that CurrentJDK is a symlink.
Alan Storm
+1  A: 
Marty Lamb
Useful, and worth a vote, but I'm interested in seeing the full tree of possible symlinks. Even if we modified this to output each path, it would miss that /Library/Java/Versions/CurrentJDK is a symlink.
Alan Storm
Ah, I see what I missed in your examples. I'm not sure how to concisely represent this in a text format. You expressed an interest in the "full tree", but it's a digraph... perhaps graphviz (http://graphviz.org) is worth a look. It can convert a simple text format into a graphical diagram.
Marty Lamb
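
One way the digraph idea could be sketched in Python: resolve the path one component at a time, record an edge for every symlink met along the way (including those in the middle of a path), and print the edges in Graphviz's DOT format. The names symlink_edges and to_dot and the max_links limit are hypothetical, and the output layout is just one possible choice.

```python
import os

def symlink_edges(path, max_links=40):
    """Resolve path component by component.

    Returns (edges, final) where edges is a list of (link, target) pairs
    and final is the fully resolved path. Hypothetical helper.
    """
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    edges = []
    current = os.sep  # the part resolved so far; never contains a symlink
    pending = [p for p in path.split(os.sep) if p]
    while pending:
        part = pending.pop(0)
        if part == ".":
            continue
        if part == "..":
            current = os.path.dirname(current)
            continue
        candidate = os.path.join(current, part)
        if os.path.islink(candidate):
            if len(edges) >= max_links:
                raise OSError("too many symlinks (possible loop): " + candidate)
            target = os.readlink(candidate)
            # Record the target relative to the link's directory.
            edges.append((candidate, os.path.join(current, target)))
            if os.path.isabs(target):
                current = os.sep
            # The target's components must be resolved before the rest.
            pending = [p for p in target.split(os.sep) if p] + pending
        else:
            current = candidate
    return edges, current

def to_dot(edges):
    """Format (link, target) pairs as a Graphviz digraph."""
    def quote(s):
        return '"%s"' % s.replace('"', '\\"')
    lines = ["digraph symlinks {"]
    for link, target in edges:
        lines.append("  %s -> %s;" % (quote(link), quote(target)))
    lines.append("}")
    return "\n".join(lines)
```

Feeding the text from to_dot(symlink_edges(path)[0]) to Graphviz's dot command should draw one arrow per symlink. A circular chain raises OSError once max_links is reached rather than looping forever.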
Yeah, that's part of why I wanted an existing tool. I'm not sure what the best way to represent something like that would be
Alan Storm
A: 

In PHP, you can use is_link and readlink.

Example usage:

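// Returns the chain of paths visited while resolving $path. Every leading
// portion of the path is checked, so symlinks in the middle are found too.
function dereference_link($path) {
  $parts = array();
  foreach (explode(DIRECTORY_SEPARATOR, $path) as $part) {
    $parts[] = $part;
    $partial = implode(DIRECTORY_SEPARATOR, $parts);
    if (is_link($partial)) {
      $target = readlink($partial);
      // A relative target is relative to the directory containing the link.
      if (substr($target, 0, 1) !== DIRECTORY_SEPARATOR) {
        $target = dirname($partial) . DIRECTORY_SEPARATOR . $target;
      }
      // Re-attach the rest of the original path and resolve again.
      $result = dereference_link($target . substr($path, strlen($partial)));
      array_unshift($result, $path);
      return $result;
    }
  }
  return array($path);
}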
troelskn
I know the lazy web doesn't read, but I want something that will report every symlink along the way, and not just report the end point.
Alan Storm
That is exactly what readlink does.
troelskn
No, it doesn't. If I run your script on my above example, I get /Library/Java/Home -> /System/Library/Frameworks/JavaVM.framework/Home, then /System/Library/Frameworks/JavaVM.framework/Home -> Versions/CurrentJDK/Home. I get no information that CurrentJDK is a symlink.
Alan Storm
Ah... I see what you mean now. You could split each path by DIRECTORY_SEPARATOR and check each segment with is_link to get this. I guess we're past "easy" then, but it's doable in a dozen lines of code or so.
troelskn
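
The per-segment check described above might look like this in Python, as a rough sketch. The function name link_prefixes is made up here. It makes a single pass over one path and does not follow the targets it finds, so it would have to be repeated on each target to uncover the whole chain.

```python
import os

def link_prefixes(path):
    """Return (prefix, target) for every leading portion of path that is a symlink.

    Hypothetical helper: one pass only, targets are reported as stored in
    the link and are not followed.
    """
    found = []
    prefix = os.sep if path.startswith(os.sep) else ""
    for part in path.split(os.sep):
        if not part:  # skip the empty pieces produced by leading or doubled separators
            continue
        prefix = os.path.join(prefix, part)
        if os.path.islink(prefix):
            found.append((prefix, os.readlink(prefix)))
    return found
```

Run on /System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home from the question, this should report the CurrentJDK segment as a link to 1.5, but it would not by itself reveal that 1.5 points to 1.5.0.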
Yeah, and then there's another edge case I didn't consider, and then the question of how best to visualize it, etc. etc. This is the kind of thing that seems simple, but is actually deep, and that's why I wanted an existing tool
Alan Storm
It's a good strategy to not reinvent the wheel (esp. on something as basic as filesystem operations), but I think this solution is fairly complete. The visualisation part would be pretty individual anyway.
troelskn