Page 1 of 1

EXT2 - What's up with unlink()?

Posted: Tue Apr 22, 2025 8:33 am
by CaydendW
I normally wouldn't ask a question about this sort of thing here but I'm at my wits end and I need a guru somewhere to make it make sense.

I'm in the middle of redoing my EXT2 driver and everything is going great. I've got the CRU in CRUD done. Now, I have come across EXT2's unlink. This, is not super hard but there is one little snag that I am catching.

What does one do when you remove the directory entry? Old me solved this by just absorbing the directory entry into the preceeding one (If possible) and increasing its size. Job done.

Except, say that if you run a C program like this:

Code: Select all

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>
int main() {
  char buf[1000] = {0};
  for (int i = 0; i < 5000; i++) {
    sprintf(buf, "/mnt/tmp/big/dir%04d", i);
    mkdir(buf, 0666);
  }
  for (int i = 0; i < 5000; i++) {
    sprintf(buf, "/mnt/tmp/big/dir%04d", i);
    rmdir(buf);
  }
  return 0;
}
Where there's an EXT2 partition mounted on /mnt. You'll add a tonne of directory entries to /mnt/tmp/big. The driver will allocate blocks for it so all these new directory entries can be accomodated. Then they're all removed. Under my system, this would entail not actually freeing up any blocks and just getting rid of the dentries, leaving a waste of blocks on /mnt/tmp/big! Basically, leaving 2 dentries (For . and ..) to occupy all those blocks allocated for the directories that just got removed.

This obviously would not do! So I went to check what OpenBSD (My beloved) does. Here is a relevant comment (And function decl if someone wishes to check me):

Code: Select all

/*
 * Remove a directory entry after a call to namei, using
 * the parameters which it left in nameidata. The entry
 * dp->i_offset contains the offset into the directory of the
 * entry to be eliminated.  The dp->i_count field contains the
 * size of the previous record in the directory.  If this
 * is 0, the first entry is being deleted, so we need only
 * zero the inode number to mark the entry as free.  If the
 * entry is not the first in the directory, we must reclaim
 * the space of the now empty record by adding the record size
 * to the size of the previous entry.
 */
int
ext2fs_dirremove(struct vnode *dvp, struct componentname *cnp)
The same thing I did... Ok, what about Linux?

Code: Select all

/*
 * ext2_delete_entry deletes a directory entry by merging it with the
 * previous entry. Page is up-to-date.
 */
int ext2_delete_entry(struct ext2_dir_entry_2 *dir, struct folio *folio)
This is just odd. Everyone seems to do it the naïve way? Maybe the filesystem gets cleaned up by some or other function? To test this, I mounted an EXT2 partition on a disk image using losetup and mount and ran my above C code. Leaving me with the following directory in /tmp/mnt/big:

Code: Select all

[cayden@lambda: ~]$ ls /mnt/tmp/big/
total 128K
drwxr-xr-x 2 cayden cayden 120K Apr 22 15:53 .
drwxrwxrwt 3 cayden cayden 4.0K Apr 22 15:51 ..
What gives? No one seems to actually clean the directory out. The directory is 30x bigger than it needs to be. I even ran fsck.ext2fs and that did nothing. Granted, the problem does seem difficult to solve but someone out there must have solved it. Right?

No matter how much search-engine-ing I did, I could not find a single mention of this issue. So, what's the deal? Does every developer just reckon that directories won't get filled enough for it to matter. Is there some secret tool that deals with this that I don't know about? What do you folks do in your OSes?

Any help will be greatly appreciated because I really don't get what's going on here.

Re: EXT2 - What's up with unlink()?

Posted: Tue Apr 22, 2025 1:30 pm
by nullplan
If I understand you correctly, you are asking when a directory shrinks. The answer is never. To the best of my knowledge, directories only ever increase in size and are never shrunk. Doing so is not exactly simple, since you'd need to verify each FS block of the directory is empty before unlinking the block from the directory, and there is no constant-time algorithm to do that. You'd have to scan the directory blocks to do this.

This isn't just ext2 either; FAT and OS-9's RBF work the same way. The only way to reduce a directory in size is to remove it, except the root directory cannot be removed of course, so there the only option is reformatting.

My guess is this problem was never solved because it is only significant on tiny media, like floppy disks. Anything bigger and you just don't care about an extra couple of kilobytes lost on the disk somewhere. And with floppy disks, you have the option of reformatting and copying back all the data, since the medium is so tiny.

Re: EXT2 - What's up with unlink()?

Posted: Tue Apr 22, 2025 1:51 pm
by iansjack
A few points.

Ideally, directories wouldn’t contain enough files for this to be a serious problem.

Directories are always going to occupy a minimum of one file system block (typically 4K) anyway.

This space isn’t really wasted; it’s going to be used again when new files are added. And if the directory got huge once logic dictates that it will probably get huge again; each directory will have a typical use pattern.

What you lose on the swings you gain on the roundabouts. A few kilobytes are wasted (who cares?), but efficiency is gained. You don’t need to waste time rearranging blocks, and next time the directory grows large its blocks will already have been allocated.

I think this is a (non-existent) solution looking for a (non-existent) problem.

Re: EXT2 - What's up with unlink()?

Posted: Mon Apr 28, 2025 1:35 pm
by iProgramInCpp
Surely you can truncate a directory though right? Like, if some 4KB blocks are filled with zeroes, just trim them out and reduce the size of the directory entry? Can't possibly that hard. I might look into it when implementing ext2 for my new operating system.

But having the possibility of the root directory being incredibly big for no reason and having no way to reclaim the used disk space can't possibly be good. Has ext3/4 solved this problem?

Re: EXT2 - What's up with unlink()?

Posted: Mon Apr 28, 2025 2:04 pm
by nullplan
iProgramInCpp wrote: Mon Apr 28, 2025 1:35 pm Like, if some 4KB blocks are filled with zeroes, just trim them out and reduce the size of the directory entry?
In ext2, directories are lists of directory blocks, each containing a list of directory entries. You could conceivably notice an empty block at some point and unlink it. However, doing so invalidates all saved offsets for later entries. Also, actually doing that becomes highly non-trivial once the directory is big enough to have exhausted the direct pointer entries. For good reason, UNIX operating systems have so far not had a way to add or remove data from the middle of a file.
iProgramInCpp wrote: Mon Apr 28, 2025 1:35 pm But having the possibility of the root directory being incredibly big for no reason and having no way to reclaim the used disk space can't possibly be good. Has ext3/4 solved this problem?
It cannot become "incredibly" big by today's disk size standards. We are talking Kilobytes here, stretching into the Megabytes in extreme cases. Today's disks are measured in Terabytes. Remember that on ext2, each file name cannot be larger than 256 bytes, so each directory entry cannot be larger than 264-ish bytes. So you'd need 3971 maximally-sized file names to fill up a megabyte of directory entries.

Ext3 has not solved this problem. At all. It has this directory hash thing, which actually makes removing things from the middle more of a problem. Ext4 might have solved one part of the problem with extent trees, obviating the need to shuffle the block list around, but the directory hash still exists.

Personally, I wouldn't attempt to invent a file system feature that so far, all the mainstream OSes are spurning. Seems like a highway to data loss.

Re: EXT2 - What's up with unlink()?

Posted: Tue Apr 29, 2025 1:46 am
by iansjack
If your root directory is getting "incredibly big" you have more serious problems than the filesystem used. On today's disks even a gigabyte is not a significant amount. Think how many directory entries could be stored in a gigabyte - somewhere in the region of 4 million or more. Your system would be at a standstill long before you reached that size.

Re: EXT2 - What's up with unlink()?

Posted: Tue Apr 29, 2025 7:36 am
by iProgramInCpp
nullplan wrote: Mon Apr 28, 2025 2:04 pm
iProgramInCpp wrote: Mon Apr 28, 2025 1:35 pm Like, if some 4KB blocks are filled with zeroes, just trim them out and reduce the size of the directory entry?
In ext2, directories are lists of directory blocks, each containing a list of directory entries. You could conceivably notice an empty block at some point and unlink it.
I was specifically talking about items at the *end* of a directory but I'm not sure I made that clear. Of course, you'd need to move every little file name (or the ones at the end) to the empty area to reclaim large swathes of storage space *within* the directory.