April 15, 2005

Hard Links vs Soft Links

Someone pointed out that I had neglected the Tech Tips for the last two Fridays so I figured you were due an extra this week...

In Unix there are hard links and soft (symbolic) links. They are quite different:

  • A hard link is another directory entry (another name) that points to the same inode, and so to the same data on disk
  • A soft link is a separate file whose contents are the path to another file

You never remove a file directly on Unix; you remove one of its names, which decrements the file's hard link count. When the count reaches zero, there is no longer any way to reference the file, and the system frees its space once no process still has it open.

Consider the following:

user$ date > file1
user$ ln file1 file2
user$ ln -s file1 file3
user$ ls -li file*
1405281 -rw-r--r-- 2 user staff 29 Apr 21 09:38 file1
1405281 -rw-r--r-- 2 user staff 29 Apr 21 09:38 file2
1405284 lrwxr-xr-x 1 user staff 5 Apr 21 09:38 file3 -> file1
user$ cat file1
Thu Apr 21 09:38:14 EST 2005
user$ cat file2
Thu Apr 21 09:38:14 EST 2005
user$ cat file3
Thu Apr 21 09:38:14 EST 2005
user$ rm file1
user$ ls -li file*
1405281 -rw-r--r-- 1 user staff 29 Apr 21 09:38 file2
1405284 lrwxr-xr-x 1 user staff 5 Apr 21 09:38 file3 -> file1
user$ cat file2
Thu Apr 21 09:38:14 EST 2005
user$ cat file3
cat: file3: No such file or directory

The first command creates a file, which involves allocating some disk space to hold the data. Then we create a hard link and a soft link. Note that the -i option to ls shows the inode number (the index of the on-disk record that holds the file's metadata and the location of its data blocks) and that this is identical for file1 and file2. In other words, they are both references to the same thing. The link count in the third column is 2 for both names, while file3 has its own inode and a count of 1.
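If you want to check the inode claim from a script rather than by eye, something like the following rough sketch should do. The scratch directory and the variable names are made up for illustration; -d is added to ls so that the soft link itself is listed rather than its target on systems where ls would otherwise follow it.

```shell
# Sketch only: scratch directory and names are illustrative.
dir=$(mktemp -d)
date > "$dir/file1"
ln "$dir/file1" "$dir/file2"      # hard link: a second name for the same inode
ln -s file1 "$dir/file3"          # soft link: a separate file holding the path "file1"

# ls -i prints the inode number first; awk keeps just that number.
inode1=$(ls -di "$dir/file1" | awk '{print $1}')
inode2=$(ls -di "$dir/file2" | awk '{print $1}')
inode3=$(ls -di "$dir/file3" | awk '{print $1}')

if [ "$inode1" = "$inode2" ]; then
    echo "file1 and file2 share an inode"
fi
if [ "$inode1" != "$inode3" ]; then
    echo "file3 has an inode of its own"
fi
```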

When we cat the files, the output is identical. For file3, this is because Unix automatically dereferences the soft link (to find out where it goes) and then repeats the operation on the real file. This adds a little overhead to the call (we have to look up two files: the soft link and then the file it points to), which is worth keeping in mind when looking at performance issues.
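To see what the soft link actually holds, as opposed to what it leads to, readlink prints the stored path without following it. This is only a small sketch with made-up scratch paths. The 5 in the size column for file3 in the listing is simply the length of the string "file1".

```shell
# Sketch only: scratch directory and names are illustrative.
dir=$(mktemp -d)
date > "$dir/file1"
ln -s file1 "$dir/file3"

target=$(readlink "$dir/file3")   # the stored path, not the target's contents
echo "file3 stores the path: $target"

# Reading through the link follows that path and returns file1's data.
if [ "$(cat "$dir/file3")" = "$(cat "$dir/file1")" ]; then
    echo "cat file3 gives the same data as cat file1"
fi
```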

When we remove file1, the data still sits unchanged in the filesystem. It can be referenced via file2 but file3 is now useless because it no longer points to anything. Unix has no way of telling that file2 would be a reasonable replacement for file1.
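The link count can be watched going up and down in the same way. In this sketch, links is a hypothetical helper (not a standard command) that pulls the count out of the second column of ls -ld.

```shell
# Sketch only: "links" is a made-up helper, and the paths are illustrative.
dir=$(mktemp -d)
echo "some data" > "$dir/file1"

# With ls -ld, the second column is the hard link count.
links() { ls -ld "$1" | awk '{print $2}'; }

before=$(links "$dir/file1")      # one name so far
ln "$dir/file1" "$dir/file2"
during=$(links "$dir/file2")      # two names for the same data
rm "$dir/file1"
after=$(links "$dir/file2")       # one name again; the data is untouched

echo "link count: $before -> $during -> $after"
cat "$dir/file2"
```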

Note that inode numbers are only unique within a filesystem, and hence hard links do not work across filesystems. There is no way to create a hard link in, say, /var which connects to a file in /usr if the two are on different filesystems.
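One common way to live with this restriction is to try a hard link and fall back to a soft link when ln refuses. The function below is a hypothetical helper, not a standard command, and it is deliberately crude: it treats any ln failure as a reason to fall back, whereas a real script would want to check why ln failed. It also assumes the source is given as an absolute path, because a soft link stores the path exactly as written.

```shell
# Sketch only: link_or_symlink is a made-up helper.
# It tries a hard link first and falls back to a soft link if ln fails,
# as it will when source and destination are on different filesystems.
link_or_symlink() {
    src=$1     # should be an absolute path so the soft link resolves anywhere
    dest=$2
    if ln "$src" "$dest" 2>/dev/null; then
        echo "hard link"
    else
        ln -s "$src" "$dest" && echo "soft link"
    fi
}

dir=$(mktemp -d)
date > "$dir/file1"
kind=$(link_or_symlink "$dir/file1" "$dir/file2")   # same filesystem in this demo
echo "created a $kind"
```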

Soft links cannot cope with the removal or relocation of the target file. Changing the target's name or directory leaves the link dangling, and trying to use it results in a "No such file or directory" message, as with file3 above.
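A script can spot a dangling soft link by combining two tests: -L is true for the link itself, while -e follows the link and fails when nothing is at the other end. The sketch below uses made-up scratch paths, and also shows that the link starts working again once a file reappears at the stored path.

```shell
# Sketch only: scratch directory and names are illustrative.
dir=$(mktemp -d)
date > "$dir/file1"
ln -s file1 "$dir/file3"
mv "$dir/file1" "$dir/renamed"    # the path stored in file3 no longer exists

# -L tests the link itself; -e follows it and fails if the target is gone.
if [ -L "$dir/file3" ] && [ ! -e "$dir/file3" ]; then
    state="dangling"
else
    state="ok"
fi
echo "file3 is $state"

# The link stores only a path, so putting a file back at that path revives it.
mv "$dir/renamed" "$dir/file1"
revived="no"
if [ -e "$dir/file3" ]; then
    revived="yes"
fi
echo "file3 resolves again: $revived"
```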

Note that some operating systems (e.g. Mac OS X) and some filesystems (AFS, HFS, HFS+) support another type of link called an alias. An alias is midway between a soft and a hard link. Effectively it is like a soft link to an inode. This is useful because, unlike a hard link, it can cross filesystem boundaries and, unlike a soft link, it can cope with the file path changing.

Posted by Ozguru at April 15, 2005 06:00 AM