Re: How to delete a file??
Jeff Read wrote:
>
> "%full name%" wrote:
>
> > What's the difference between "unlinking" and "removing" a file?
>
> I could be wrong but I think removal is a special case of unlink, in
> which there is only one link remaining. The remove() function, in libc,
> really is just a wrapper for unlink() (when applied to a file).
This is a case where understanding some of the internals of UNIX
helps to understand what's going on.
In classical UNIX, there is a 'file' and an 'inode' (usually read
as "index node"). The 'inode' stores all the information about the
file such as its length, date-last-written, ownership, etc. A directory
entry is essentially just the name of the file and the number
of the inode. In early UNIX v6, a directory was quite literally
just an ordinary file containing 16 byte records with a 14 byte
filename and a 2 byte inode number. You could rename a file by
editing the directory with a simple text editor! (You could also
REALLY screw up your disk!) This is why early UNIX systems
limited you to 14 character filenames and at most 65536 files
on each disk partition.
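To make the 16 byte record concrete, here is a rough C sketch of what such a directory entry could look like. The struct name, the helper functions and the field order are my own assumptions for illustration (this is not the original kernel source); the only things taken from the description above are the sizes: 2 bytes of inode number plus 14 bytes of name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define V6_NAME_LEN 14

/* Illustrative layout of one 16-byte directory record (names are made up). */
struct v6_dirent {
    uint16_t inode;              /* 2-byte inode number */
    char     name[V6_NAME_LEN];  /* NUL-padded; no terminator if all 14 bytes are used */
};

/* Compile-time check that the record really is 16 bytes. */
_Static_assert(sizeof(struct v6_dirent) == 16, "directory record must be 16 bytes");

/* "Renaming" is nothing more than overwriting the 14 name bytes in place.
   Names longer than 14 characters are silently truncated. */
void v6_rename(struct v6_dirent *e, const char *new_name)
{
    size_t n;

    assert(e != NULL && new_name != NULL);
    n = strlen(new_name);
    if (n > V6_NAME_LEN)
        n = V6_NAME_LEN;
    memset(e->name, 0, V6_NAME_LEN);
    memcpy(e->name, new_name, n);
}

/* Copy the name out into a NUL-terminated buffer of 15 bytes. */
void v6_get_name(const struct v6_dirent *e, char out[V6_NAME_LEN + 1])
{
    assert(e != NULL && out != NULL);
    memcpy(out, e->name, V6_NAME_LEN);
    out[V6_NAME_LEN] = '\0';
}
```

Note that the rename never touches the inode number, which is exactly why editing the directory by hand "worked": the name is just a label attached to an inode.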
Hence, when you 'link' to a file using 'ln' (not 'ln -s'), what
you do is to create a directory entry that points to the same
inode as the existing file's directory entry.
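If you want to see this for yourself, something like the following should do it. It's only a sketch: the function name, the report struct and the /tmp path are things I made up for the example. It creates a file, adds a second name with link() (the system call behind 'ln'), and records the link count and inode number that stat() reports at each step.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* What was observed at each step; all names here are illustrative. */
struct link_report {
    long nlink_before;    /* link count while the file has one name */
    long nlink_linked;    /* link count after link() added a second name */
    long nlink_unlinked;  /* link count after the first name was unlinked */
    int  same_inode;      /* 1 if both names reported the same inode */
};

/* Runs the experiment in a fresh temporary directory (assumes /tmp is
   writable). Returns 0 on success, -1 if a system call failed; for
   brevity, nothing is cleaned up on the error paths. */
int hard_link_demo(struct link_report *out)
{
    char dir[] = "/tmp/hardlinkXXXXXX";
    char a[64], b[64];
    struct stat sa, sb;
    int fd, n;

    assert(out != NULL);
    if (mkdtemp(dir) == NULL) return -1;
    n = snprintf(a, sizeof a, "%s/original", dir);
    assert(n > 0 && (size_t)n < sizeof a);
    n = snprintf(b, sizeof b, "%s/alias", dir);
    assert(n > 0 && (size_t)n < sizeof b);

    fd = open(a, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return -1;
    close(fd);

    if (stat(a, &sa) != 0) return -1;
    out->nlink_before = (long)sa.st_nlink;

    if (link(a, b) != 0) return -1;        /* what 'ln original alias' does */
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0) return -1;
    out->nlink_linked = (long)sa.st_nlink;
    out->same_inode = (sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev);

    if (unlink(a) != 0) return -1;         /* remove only the first name */
    if (stat(b, &sb) != 0) return -1;      /* the file is still reachable */
    out->nlink_unlinked = (long)sb.st_nlink;

    if (unlink(b) != 0) return -1;         /* last name gone: file is freed */
    return rmdir(dir);
}
```

On an ordinary local Linux filesystem I'd expect the three counts to come out as 1, 2 and 1, with both names reporting the same inode.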
Hence, in classical UNIX, you don't "delete" a file - you simply
remove the directory entry. The 'inode' keeps an internal count
of the number of directory entries that are currently pointing
at it - and when that count hits zero, there are no more directory
entries pointing at the file - so UNIX deletes the file and its
inode. (If a program has a file open, then that increases the
inode's reference counter too - so if you delete a file that a
running program is using, it won't actually be erased until the
program closes the file - but since you unlinked the last directory
entry that points to it, the file will *seem* to have been deleted).
This sort of attention to detail is needed in a multitasking OS and
explains how UNIX/Linux is so reliable compared to Windoze which
had no such protections.
Anyway, that's the reason that the original file delete routine
isn't called "delete": there is no way for a program to guarantee
that a file actually gets deleted - all it can do is to 'unlink' it. If
the file isn't referenced by a running program - or some other
directory entry, then unlink will (in effect) actually delete it.
Of course since 99% of files are not linked from multiple places,
and we don't usually go round deleting files that are in use by
a running program (eeekkk!) doing an 'unlink' will typically delete
the file.
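The "file in use by a running program" case is easy to try out. Here's a minimal sketch (function name, struct and /tmp path are invented for the example): it opens a file, unlinks its only name, and then reads the data back through the descriptor that is still open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

struct open_unlink_report {
    int  name_gone;    /* 1 if the path no longer resolves after unlink() */
    long nlink_after;  /* link count seen through the still-open descriptor */
    char data[32];     /* what could still be read back after the unlink */
};

/* Returns 0 on success, -1 if a system call failed. Assumes /tmp is writable. */
int unlink_while_open_demo(struct open_unlink_report *out)
{
    char path[] = "/tmp/unlinkdemoXXXXXX";
    static const char msg[] = "still here";
    struct stat st;
    ssize_t n;
    int fd;

    assert(out != NULL);
    memset(out, 0, sizeof *out);

    fd = mkstemp(path);                  /* create and open a unique file */
    if (fd < 0) return -1;
    if (write(fd, msg, sizeof msg - 1) != (ssize_t)(sizeof msg - 1)) goto fail;

    if (unlink(path) != 0) goto fail;    /* the only directory entry is gone */
    errno = 0;
    out->name_gone = (access(path, F_OK) != 0 && errno == ENOENT);

    if (fstat(fd, &st) != 0) goto fail;  /* the inode is still alive */
    out->nlink_after = (long)st.st_nlink;

    if (lseek(fd, 0, SEEK_SET) < 0) goto fail;
    n = read(fd, out->data, sizeof out->data - 1);
    if (n < 0) goto fail;
    out->data[n] = '\0';

    close(fd);   /* only now can the kernel free the inode and its blocks */
    return 0;

fail:
    close(fd);
    unlink(path);  /* harmless if the name is already gone */
    return -1;
}
```

After the unlink() the name is gone as far as 'ls' is concerned, yet the data can still be read; the space is only reclaimed at the final close().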
Deleting a directory is a MUCH more complex operation, because
a directory is ALWAYS linked to from multiple places:
  * There is a link to the directory from the parent directory.
  * If the directory contains sub-directories then they have
    ".." entries (which are basically just links to their parent).
  * Every directory contains a link to itself (called ".").
It's important for the integrity of the file system that all those
links are correctly maintained - so unlink'ing a directory is
prohibited (Linux refuses outright; some older systems allowed it
for the superuser only). That's why we have 'rmdir'.
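A quick way to check what your own system does is to call unlink() on a directory and look at errno. The sketch below (function name and /tmp path are my own invention) does that on a scratch directory and then removes it the proper way with rmdir(). As far as I know Linux reports EISDIR here, while POSIX specifies EPERM, so code that cares should accept either.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Tries unlink() on a freshly made directory, then removes it with rmdir().
   Stores the errno from the unlink() attempt in *unlink_errno (0 if the
   unlink unexpectedly succeeded). Returns 0 if the directory ended up
   removed, -1 otherwise. Assumes /tmp is writable. */
int unlink_dir_demo(int *unlink_errno)
{
    char dir[] = "/tmp/rmdirdemoXXXXXX";

    assert(unlink_errno != NULL);
    if (mkdtemp(dir) == NULL) return -1;

    errno = 0;
    if (unlink(dir) == 0) {      /* only some old systems, as superuser */
        *unlink_errno = 0;
        return 0;
    }
    *unlink_errno = errno;       /* EISDIR on Linux, EPERM per POSIX */

    return rmdir(dir);           /* the supported way to drop a directory */
}
```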
The 'remove' call is just candy - a classically trained UNIX programmer
will probably not even be aware of its existence.
Under some UNIX'en (eg Linux with libc5), it's just a synonym for
unlink - created (presumably) by someone who thought 'unlink' was
an illogical name for a delete function.
Under glibc/libc6, it's a dumb wrapper that calls 'unlink' for files
and 'rmdir' for directories.
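Note that 'remove' is part of ISO C, so for plain files it's perfectly
portable; what ISO C doesn't promise is what it does to a directory.
In UNIX code I'd still call 'unlink' and 'rmdir' directly, so it's
obvious which one you mean.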
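For comparison, this is roughly what the wrapper behaviour looks like from the caller's side on a POSIX system such as Linux with glibc. The function name and the /tmp path are made up for the example; remove() itself is the standard call declared in <stdio.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a scratch directory with one file in it, then deletes both
   with remove(). Returns 0 if both calls succeeded and the directory
   is really gone, -1 otherwise. Assumes /tmp is writable. */
int remove_demo(void)
{
    char dir[] = "/tmp/removedemoXXXXXX";
    char file[64];
    int fd, n;

    if (mkdtemp(dir) == NULL) return -1;
    n = snprintf(file, sizeof file, "%s/plain", dir);
    assert(n > 0 && (size_t)n < sizeof file);

    fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    close(fd);

    if (remove(file) != 0) return -1;  /* acts like unlink(file) */
    if (remove(dir) != 0) return -1;   /* acts like rmdir(dir) on POSIX systems */

    return access(dir, F_OK) == 0 ? -1 : 0;
}
```

Like rmdir(), the second remove() only works because the directory is empty by then.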
One final note - symbolic links (created with 'ln -s') don't work
like classic UNIX links - there is an actual file and a separate
'link file' that just holds the path name of the actual file. The
link file doesn't count towards the inode's link count, so if you
delete the actual file, then it'll really disappear forever,
leaving a 'dangling' link that points nowhere.
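The dangling-link behaviour can be sketched the same way. Again the function name, the struct and the /tmp path are invented for illustration. The code makes a file and a symbolic link to it, deletes the file, and then asks about the link twice: lstat() looks at the link itself, stat() tries to follow it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

struct symlink_report {
    long target_nlink;  /* link count of the real file once the symlink exists */
    int  link_exists;   /* 1 if the link itself survives the target's removal */
    int  follow_errno;  /* errno from stat() following the dangling link */
};

/* Returns 0 on success, -1 if a system call failed. Assumes /tmp is
   writable; nothing is cleaned up on the error paths. */
int dangling_symlink_demo(struct symlink_report *out)
{
    char dir[] = "/tmp/symlinkdemoXXXXXX";
    char target[64], lnk[64];
    struct stat st;
    int fd, n;

    assert(out != NULL);
    if (mkdtemp(dir) == NULL) return -1;
    n = snprintf(target, sizeof target, "%s/actual", dir);
    assert(n > 0 && (size_t)n < sizeof target);
    n = snprintf(lnk, sizeof lnk, "%s/pointer", dir);
    assert(n > 0 && (size_t)n < sizeof lnk);

    fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return -1;
    close(fd);

    if (symlink(target, lnk) != 0) return -1;  /* like 'ln -s actual pointer' */
    if (stat(target, &st) != 0) return -1;
    out->target_nlink = (long)st.st_nlink;     /* a symlink adds no hard link */

    if (unlink(target) != 0) return -1;        /* the real file is now gone */

    out->link_exists = (lstat(lnk, &st) == 0 && S_ISLNK(st.st_mode));
    errno = 0;
    out->follow_errno = (stat(lnk, &st) == 0) ? 0 : errno;

    if (unlink(lnk) != 0) return -1;           /* remove the dangling link */
    return rmdir(dir);
}
```

The link count of the real file stays at 1 throughout, which is the whole difference from a hard link: nothing is keeping the file alive once its one directory entry goes.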
--
Steve Baker http://web2.airmail.net/sjbaker1
sjbaker1@airmail.net (home) http://www.woodsoup.org/~sbaker
sjbaker@hti.com (work)