Recently I was asked this during a job interview. I was honest and said I knew how a symbolic link behaves and how to create one, but did not understand the use of a hard link or how it differs from a symbolic one.

+30  A: 

Underneath the file system, files are represented by inodes (or is it multiple inodes? I'm not sure).

A file name in the file system is basically a link to an inode.
A hard link then just creates another file name with a link to the same underlying inode.

When you delete a file it removes one link to the underlying inode. The inode is only deleted (or deletable/over-writable) when all links to the inode have been deleted.

A symbolic link is a link to another name in the file system.

Once a hard link has been made, the link is to the inode. Deleting, renaming, or moving the original file will not affect the hard link, as it links to the underlying inode. Any change to the data on the inode is reflected in all files that refer to that inode.

Note: Hard links are only valid within the same File System. Symbolic links can span file systems as they are simply the name of another file.
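
To make the "two names, one inode" idea concrete, here is a minimal sketch you can paste into a shell. The file names (original, alias) are made up for illustration, and it works in a throwaway directory from mktemp so nothing else is touched.

```shell
# Sketch: two names, one inode (file names are hypothetical).
dir=$(mktemp -d) && cd "$dir" || exit 1

echo "Cat" > original
ln original alias            # hard link: a second name for the same inode

# ls -i prints the inode number in the first column; both names show the same one.
ls -i original alias

# -ef is true when both paths refer to the same device and inode.
if [ original -ef alias ]; then same_inode=yes; else same_inode=no; fi

rm original                  # removes one name; the inode still has a link
survivor=$(cat alias)        # the data is still reachable through the other name
echo "same inode: $same_inode, contents after rm: $survivor"
```

The rm only drops one directory entry. The data stays around because the inode still has one link left.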

Martin York
I'm sure the i-nodes depend on the particular variant of the OS; however, I believe it is usually a single i-node. The i-node has info about the file and info about where the data is stored on disk. Large files will have indirect pointers to additional tables.
terson
You might want to add the useful feature that symbolic links can cross filesystems, hard links cannot (they must refer to a file on the same filesystem).
paxdiablo
A: 

This may help: Hard Link

Christopher Scott
+8  A: 

Symbolic links link to a path name. This can be anywhere in a system's file tree, and doesn't even have to exist when the link is created. The target path can be relative or absolute.

Hard links are additional pointers to an inode, meaning they can exist only on the same volume as the target. Additional hard links to a file are indistinguishable from the "original" name used to reference a file.
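
A small sketch of the "target doesn't have to exist" point, assuming a POSIX-like shell with readlink available. The names dangling and not-yet-here are invented for the example.

```shell
# Sketch: a symlink stores a path, so it can be created before its target exists.
dir=$(mktemp -d) && cd "$dir" || exit 1

ln -s not-yet-here dangling      # allowed: the target is only a stored path
stored=$(readlink dangling)      # readlink prints the stored path without resolving it

# -e follows the symlink, so it is false while the target is missing.
if [ -e dangling ]; then before=resolves; else before=broken; fi

echo "hello" > not-yet-here      # create the target afterwards
if [ -e dangling ]; then after=resolves; else after=broken; fi
via_link=$(cat dangling)         # reading through the link now works
echo "stored=$stored before=$before after=$after"
```

The link itself never changed between the two checks; only the path it names started to resolve.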

Andrew Medico
Also, when you remove the file you link to, a symbolic link is broken, but a hard link remains valid, because it "keeps" the file in the file system.
njsf
I assume *exit* should be *exist*?
Svish
+4  A: 

I would point you to Wikipedia.

http://en.wikipedia.org/wiki/Symbolic_link http://en.wikipedia.org/wiki/Hard_link

A couple of points:

  • Symlinks can cross filesystems; hard links cannot (most of the time)
  • Symlinks can point to directories
  • Hard links point to a file and enable you to refer to the same file with more than one name.
  • As long as there is at least one link, the data is still available.

HTH.

Jauder Ho
In theory (and in some cases even in practice) hard links can point to directories as well (in fact "." is a hard link to the current directory and ".." is a hard link to the parent directory). But they can be dangerous, so most UNIXes don't allow them (or require you to take special steps to create them). Apple uses them for their Time Machine implementation, for example: http://earthlingsoft.net/ssp/blog/2008/03/x5_time_machine
Joachim Sauer
+1  A: 

I'd add to Nick's question: when are hard links useful or necessary? The only application that comes to my mind in which symbolic links wouldn't do the job is providing a copy of a system file in a chrooted environment.

Federico Ramponi
Distributed system w/ mount points in different places on different systems. Of course, this could be designed out of the system by being consistent.
terson
I think @Tanktalus provided a great example.
Nick Stinemates
+9  A: 

Hard links are useful when the original file is getting moved around. For example, moving a file from /bin to /usr/bin or to /usr/local/bin. Any symlink to the file in /bin would be broken by this, but a hardlink, being a link directly to the inode for the file, wouldn't care.

Hard links may take less disk space as they only take up a directory entry, whereas a symlink needs its own inode to store the name it points to.

Hard links also take less time to resolve - symlinks can point to other symlinks that are in symlinked directories. And some of these could be on NFS or other high-latency file systems, and so could result in network traffic to resolve. Hard links, being always on the same file system, are always resolved in a single look-up, and never involve network latency (if it's a hardlink on an NFS filesystem, the NFS server would do the resolution, and it would be invisible to the client system). Sometimes this is important. Not for me, but I can imagine high-performance systems where this might be important.

I also think things like mmap(2) and even open(2) use the same mechanism as hard links to keep a file's inode active, so that even if the file gets unlink(2)ed, the inode remains to allow the process continued access; only once the process closes it does the file really go away. This allows for much safer temporary files, where you can read and write your data without anyone else being able to access it (if you can get the open and unlink to happen atomically, then you really have a safe temporary file; there may be a POSIX API for that which I'm not remembering). Well, that was true before /proc gave everyone the ability to look at your file descriptors, but that's another story.
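
Here is a rough shell sketch of the open-then-unlink behaviour, using file descriptor 3 and a made-up file name (tmpfile). Note that in this sketch the create and the rm are separate steps, so it is not the atomic version, just a way to see that an open descriptor keeps the data alive.

```shell
# Sketch: an open descriptor keeps the inode alive after the name is removed.
dir=$(mktemp -d) && cd "$dir" || exit 1

echo "scratch data" > tmpfile
exec 3< tmpfile          # open the file for reading on descriptor 3
rm tmpfile               # unlink the only name; the open descriptor still holds the inode

if [ -e tmpfile ]; then name_state=present; else name_state=gone; fi
read -r line <&3         # still readable through the descriptor
exec 3<&-                # close it; with no names and no descriptors left, the space can be reclaimed
echo "name is $name_state, descriptor read: $line"
```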

Speaking of which, recovering a file that is open in process A, but unlinked on the file system revolves around using hardlinks to recreate the inode links so the file doesn't go away when the process which has it open closes it or goes away.

Tanktalus
+2  A: 

Hard links are very useful when doing incremental backups. See rsnapshot, for example. The idea is to copy using hard links:

  • copy backup number n to n + 1
  • copy backup n - 1 to n
  • ...
  • copy backup 0 to backup 1
  • update backup 0 with any changed files.

The new backup will not take up any extra space apart from any changes you've made, since all the incremental backups will point to the same set of inodes for files which haven't changed.
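
A toy version of one rotation step, as a sketch only. It assumes GNU cp (the -l option, which makes hard links instead of copying data) and uses made-up directory and file names; it is not how rsnapshot itself is implemented. The important detail is that a changed file has to be replaced with a new file rather than overwritten in place, otherwise the older snapshots would change too.

```shell
# Sketch: hard-link rotation with two snapshots (names are hypothetical).
dir=$(mktemp -d) && cd "$dir" || exit 1

mkdir backup.0
echo "unchanged" > backup.0/a.txt
echo "old" > backup.0/b.txt

# Rotate: recreate the tree with hard links (GNU cp: -a preserves attributes, -l links instead of copying).
cp -al backup.0 backup.1

# Update backup.0 by swapping in a new inode for the changed file.
# Writing in place (echo new > backup.0/b.txt) would change backup.1/b.txt as well.
echo "new" > backup.0/b.txt.tmp
mv backup.0/b.txt.tmp backup.0/b.txt

if [ backup.0/a.txt -ef backup.1/a.txt ]; then unchanged=shared; else unchanged=separate; fi
if [ backup.0/b.txt -ef backup.1/b.txt ]; then changed=shared; else changed=separate; fi
old_copy=$(cat backup.1/b.txt)
echo "a.txt: $unchanged, b.txt: $changed, backup.1/b.txt: $old_copy"
```

Only b.txt costs extra space after the update; a.txt is still a single inode with two names.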

JesperE
+1  A: 

This link provides a good explanation of the relationship between file names, inodes, and file data with respect to hard and soft/symbolic links. It summarizes most of the points made in the other answers.

Ben Lever
Thanks. I really appreciate the link.
Nick Stinemates
A: 

Some nice intuition that might help, using any Linux console.

Create two files:

$ touch blah1; touch blah2

Enter some data into them:

$ echo "Cat" > blah1
$ echo "Dog" > blah2

(Actually, I could have used echo in the first place, as it creates the files if they don't exist... but never mind that.)

And as expected:

$ cat blah1; cat blah2
Cat
Dog

Let's create hard and soft links:

$ ln blah1 blah1-hard
$ ln -s blah2 blah2-soft

Let's see what just happened:

$ ls -l

blah1
blah1-hard
blah2
blah2-soft -> blah2

Changing the name of blah1 does not matter:

$ mv blah1 blah1-new
$ cat blah1-hard
Cat

blah1-hard points to the inode, the contents, of the file - that wasn't changed.

$ mv blah2 blah2-new
$ ls blah2-soft
blah2-soft
$ cat blah2-soft  
cat: blah2-soft: No such file or directory

The contents of the file could not be found because the soft link points to the name, which was changed, and not to the contents. Likewise, if blah1 is deleted, blah1-hard still holds the contents; if blah2 is deleted, blah2-soft is just a link to a non-existent file.

Adam Matan
A: 

Also:

  1. Read performance of hard links is better than that of symbolic links (micro-performance)
  2. Symbolic links can be copied, version-controlled, etc. In other words, they are an actual file. On the other hand, a hard link is something at a slightly lower level, and you will find that, compared to symbolic links, there are fewer tools that provide means for working with hard links as hard links and not as normal files
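
One tool that does treat hard links as hard links is find, through its -samefile test (available in GNU find, and as far as I know in the BSD one too). A small sketch with made-up file names:

```shell
# Sketch: list every name that shares an inode with a given file.
dir=$(mktemp -d) && cd "$dir" || exit 1

echo "data" > report
mkdir archive
ln report archive/report-copy     # second hard link, in another directory
ln -s report report-link          # a symlink, for contrast

# -samefile matches every path that shares an inode with the given file.
# The symlink is not matched: find does not follow symlinks by default,
# and the link has an inode of its own.
names=$(find . -samefile report | sort)
echo "$names"
```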
Amr Mostafa