Voix learning to read filesystems ;)
... yes ... it's SLOW as hell ... and no, it can't deal with subdirectories yet ...
I'll have to fix both of those before I post any floppy images..
But it's working! Had to rewrite my VFS to make it deal with inode-less crap like FAT.
Of course it properly reads files of any size (belov.py was there to troubleshoot issues with that), but I thought I'd rather take a slightly more demonstrative screenshot.
- Attachments
- voix-2007-04-11.png (7.66 KiB)
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
- Brynet-Inc
- Member
- Posts: 2426
- Joined: Tue Oct 17, 2006 9:29 pm
- Libera.chat IRC: brynet
- Location: Canada
- Contact:
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
adventure into the depths of hell
"... yes ... it's SLOW as hell ... and no, it can't deal with subdirectories yet ..."
I do not mind something being slow, but it cannot deal with subdirectories? I want to see it deal with subdirectories! I want subdirectories nested so deep the computer runs out of memory trying to handle them...
Let me delve into the code. Who needs a pretty disk image anyway? We want to adventure into the depths of hell, which would include what you were thinking when you wrote it and posted it without subdirectories. I wanna know!
[edit]
Ok. I had to come back and edit this because I forgot to tell you, "Good Job" at 'at least' getting it to display and print files in the root directory.
Re: adventure into the depths of hell
Kevin McGuire wrote: I do not mind something being slow, but it can not deal with subdirectories? I want to see it deal with subdirectories! I want subdirectories nested so deep the computer runs out of memory trying to handle them...
The reason it doesn't deal with subdirectories is simply that I don't like duplicating code: I hacked together the root-directory reader to get it working, and haven't yet refactored it into nice parts.
The way I write code is that I first design the general interface, then "hack" together (in the sense that it's often not structured properly) something that implements the interface, to see whether the interface is viable and how things work. Once it works, I refactor the code into cleaner bits and pieces to make it more generally reusable.
In this case the biggest difference between the root directory and subdirectories is simply that subdirectories are proper files, while the root directory lives at a fixed location. Since I needed the root directory to reasonably debug reading files, I wrote it as a special case. So what I need to do now is make a copy of the root directory code, modify it to go through file reading (instead of directly to the device), and then wrap that with a special-case file for the root directory. The '.' and '..' entries in the root directory need special-case handling too, and that's still missing, as can be seen from the shot, but that's more or less trivial.
Kevin McGuire wrote: Let me delve into the code. Who needs a pretty disk image anyway. We want to adventure into depths of hell which would include what you were thinking when you wrote it and posted it with out sub directories. I wanna know!
I'll at least fix the directory trouble first, and at least restructure read() to plan ahead, so that it doesn't need to switch between reading the FAT and reading the file between each and every block (which is the main reason it's so damn slow atm).
My plan anyway is to make the source public in the near future. After that I might show older versions as well... I've got local subversion repo anyway.
Ok, you'll get the current root reader here before I fix that stuff, since it probably helps you understand what I mean:
static int msdosfs_readroot(vfs_dir * dir, vfs_dentry * de) {
    msdosfs_fs * fs = (msdosfs_fs*) dir->node->fs;

    // every root entry has already been returned
    if(dir->offset == fs->boot.max_root_dirent) { return 0; }

    // these are in sectors, and should be cached in msdosfs_fs
    unsigned fat_offset = fs->boot.reserved_count;
    unsigned root_offset = fat_offset
        + fs->boot.fat_sectors * fs->boot.fat_count;

    // the whole root directory is read on every call
    msdos_dirent root[fs->boot.max_root_dirent];

    if(vfs_seek((vfs_object*)fs->dev, fs->boot.bytes_sector * root_offset)
        < 0) {
        printk("msdosfs_readroot: seek failed\n");
        return -1;
    }

    if(vfs_read((vfs_object*)fs->dev, (char*) &root, sizeof(root))
        < (signed) sizeof(root)) {
        printk("msdosfs_readroot: read failed\n");
        return -1;
    }

    int i;
    for(i = dir->offset; i < fs->boot.max_root_dirent; i++) {
        if(!root[i].name[0]) return 0; // entry free / end-of-dir

        // deleted entry; the cast keeps the comparison valid
        // even if name[] is a plain (possibly signed) char
        if((unsigned char) root[i].name[0] == 0xE5) { continue; }

        // skip volume label entries (used for long filenames)
        if(root[i].attrib & 0x8) continue;

        dir->offset = i + 1;
        de->node = msdosfs_node((vfs_fs*) fs, &root[i]);

        // copy the 8-character base name, stopping at the space padding
        int n = 0, e = 0;
        while(n < 8 && root[i].name[n] != ' ') {
            de->name[n] = root[i].name[n];
            ++n;
        }

        // append '.' and the extension, if there is one
        while(e < 3 && root[i].ext[e] != ' ') {
            if(!e) {
                de->name[n] = '.';
                ++n;
            }
            de->name[n] = root[i].ext[e];
            ++e;
            ++n;
        }

        // iterate over the name and convert uppercase ASCII to lowercase
        for(e = 0; e < n; e++) {
            if(de->name[e] >= 'A' && de->name[e] <= 'Z') {
                de->name[e] += 'a' - 'A';
            }
        }

        de->name[n] = 0;
        return 1;
    }

    return 0;
}
See? Not the most beautiful or generic code around.
Brynet-Inc wrote: May I ask why you chose to implement support for FAT first? Floppy disks can use other file systems too. But it's still a good achievement, congratulations.
I'd like to say that, at least for me, FAT is the best documented and best known filesystem, and maybe that's the reason everyone chooses it at the earliest development stages.
Also, if there were good and known documentation on the XFS filesystem, I'd gladly implement it before any other filesystem...
XFS, extN, ReiserFS, anyone with plenty of documentation, please??? We all need this very much!
My reasons are mostly practical: I've got two computers in active use. One of them is running Windows, and one of them is running Linux. The one running Windows has my only working floppy drive. Of course, normally I'd just build images on the other computer and write them to a real floppy using rawwrite, but it's nice to be able to use the floppy directly if necessary.
FAT images are also the easiest to deal with, because I can copy files to the disk (or, well, the image) using mtools, without having to mount the image first.
Besides, FAT is sufficiently braindead that I believe any virtual filesystem code that is able to deal with FAT is likely to be able to deal with almost any other nonsense. As such, supporting FAT early means I get to deal with the VFS issues now, when there's little other code that needs to be changed.
Finally, if I take a floppy with me, I can mess with the contents (like the GRUB config) on any computer with a floppy drive and some sort of OS (even if that's just MS-DOS), meaning any potential test machine, without having to carry a Linux floppy/CD as well.
Well, right now it reads subdirectories as well. File reading is now decently fast when reading in big chunks (it first plans which blocks it needs, so it doesn't have to consult the FAT after every block). Directory reading is still awful, though: there is no block cache yet and every entry is read separately, so it keeps reading the FAT, then one entry, then the FAT, then one entry, and so on...
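A minimal sketch of what "planning" a read can look like on FAT12, assuming the FAT has already been loaded into memory. The names fat12_entry and fat12_plan are hypothetical and not taken from the Voix source; the point is only that the cluster chain is walked once up front, so the data blocks can then be read without going back to the FAT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the 12-bit FAT entry for `cluster` from an in-memory FAT copy. */
static unsigned fat12_entry(const uint8_t *fat, unsigned cluster) {
    size_t off = cluster + cluster / 2;              /* 1.5 bytes per entry */
    unsigned pair = (unsigned) fat[off] | ((unsigned) fat[off + 1] << 8);
    return (cluster & 1) ? (pair >> 4) : (pair & 0x0FFF);
}

/* Walk the chain starting at `first` and record up to `max` clusters
 * in `plan`. Returns the number of clusters recorded. */
static size_t fat12_plan(const uint8_t *fat, unsigned first,
                         unsigned *plan, size_t max) {
    size_t n = 0;
    unsigned c = first;
    assert(fat && plan);
    /* Data clusters start at 2; 0xFF0..0xFFF are reserved, bad, or
     * end-of-chain markers, so any of them stops the walk. */
    while (n < max && c >= 2 && c < 0xFF0) {
        plan[n++] = c;
        c = fat12_entry(fat, c);
    }
    return n;
}
```

With the plan in hand, consecutive cluster numbers can also be merged into one larger device read, which is where most of the speedup would come from on a floppy.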
Because of the readdir()+stat() nonsense I copied from POSIX, it actually keeps doing useless work, since every time you stat a file (to get attributes like size and whether it's a dir) it searches through the directory, one entry at a time... and every time it reads one entry for this search, it goes reading the FAT from the disk, then reads one block, and then for the next entry the FAT again, and probably the very same block again, and so on.
The root directory and files (when you read in decent chunks) don't suffer from this, as my floppy driver is clever enough: it uses multi-track reads, two full tracks at a time, and remembers which set of tracks is currently in the DMA buffer, skipping the read if the relevant data is already there.
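The track-buffer idea can be sketched roughly like this. It is an illustration only: it assumes a 1.44MB geometry and that "two full tracks" means both heads of one cylinder, and hw_read_cylinder is a hypothetical stand-in for the real controller transfer (here it just fills the buffer and counts calls).

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE        512
#define SECTORS_PER_TRACK  18                  /* 1.44MB floppy */
#define HEADS              2
#define SECTORS_PER_CYL    (SECTORS_PER_TRACK * HEADS)

static unsigned char dma_buffer[SECTORS_PER_CYL * SECTOR_SIZE];
static int buffered_cyl = -1;                  /* -1: buffer holds nothing */
static unsigned hw_reads = 0;                  /* number of simulated transfers */

/* Hypothetical stand-in for the real multi-track read. */
static void hw_read_cylinder(unsigned cyl, unsigned char *dst) {
    memset(dst, (int) (cyl & 0xFF), SECTORS_PER_CYL * SECTOR_SIZE);
    hw_reads++;
}

/* Copy one sector out of the buffer, touching the "hardware" only
 * when the requested cylinder is not the one already buffered. */
static void floppy_read_sector(unsigned lba, unsigned char *out) {
    unsigned cyl = lba / SECTORS_PER_CYL;
    unsigned idx = lba % SECTORS_PER_CYL;
    assert(out);
    if ((int) cyl != buffered_cyl) {
        hw_read_cylinder(cyl, dma_buffer);
        buffered_cyl = (int) cyl;
    }
    memcpy(out, dma_buffer + idx * SECTOR_SIZE, SECTOR_SIZE);
}
```

Sequential reads inside one cylinder then cost a single transfer, while anything that bounces between the FAT and a far-away data block defeats the buffer, which matches the slowness described above.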
So another nice result of writing FAT first, instead of a "better" filesystem, is that I catch stuff like API scalability issues (the whole POSIX-style readdir()+stat() nonsense).
So mm... what is missing is a proper block cache. Maybe a dentry cache as well, although I think the readdir()+stat() nonsense would be better solved by having readdir() just return the stat() information too (since the kernel has that info available in readdir() anyway). Also, I still have to figure out how to get rid of the duplicate code for root-directory reading (making it use the normal directory reading code instead), but because a directory's size can't be known without finding all the blocks of the directory, that's surprisingly tricky to do in a sane way.
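To make the readdir()-returns-stat idea concrete, here is a rough sketch. A raw 32-byte FAT directory entry already carries the attribute byte (offset 11) and the file size (offset 28, little-endian), so one pass over the entry can fill in everything a later stat() would ask for. The dirent_info struct and dirent_from_raw are hypothetical names for illustration, not the actual Voix interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical combined readdir()+stat() result. */
typedef struct {
    char     name[13];    /* "NAME.EXT" plus terminating NUL */
    uint32_t size;        /* file size in bytes */
    int      is_dir;      /* nonzero for directories */
} dirent_info;

/* Fill `out` from one raw 32-byte FAT directory entry. */
static void dirent_from_raw(const uint8_t raw[32], dirent_info *out) {
    int n = 0, i;
    assert(raw && out);
    for (i = 0; i < 8 && raw[i] != ' '; i++)
        out->name[n++] = (char) raw[i];
    if (raw[8] != ' ') {                       /* extension present */
        out->name[n++] = '.';
        for (i = 8; i < 11 && raw[i] != ' '; i++)
            out->name[n++] = (char) raw[i];
    }
    out->name[n] = '\0';
    out->is_dir = (raw[11] & 0x10) != 0;       /* directory attribute bit */
    out->size = (uint32_t) raw[28]
              | ((uint32_t) raw[29] << 8)
              | ((uint32_t) raw[30] << 16)
              | ((uint32_t) raw[31] << 24);
}
```

A directory listing built on this needs no per-file directory search at all, which removes the repeated FAT lookups that stat() currently triggers.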
Ok, now we've got an image ready (with some sort of block cache); it can be found at http://www.cs.hut.fi/~tvoipio/files/voix/fd-363.img.gz
As usual, /welcome contains (updated) random notes.
There's no umount/mount yet though, so if you want to test reading floppies other than the one provided, the only way is to copy the kernel and init (the shell) to another floppy, and boot using that one instead.
That said, it should be able to deal with any valid FAT12. Floppy driver can't deal with non-1.44MB disks yet though.
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
I tested it with qemu -fda <image> -m 32.
The block cache appears to be working. I noticed a significant performance improvement after the kernel read the data from the floppy and then attempted to read it again using cat. The directory listing and change directory command appeared to work as intended.
The command cat /dev/floppy appeared to also work as intended and displayed the contents of the floppy block device on screen. However I encountered the error:
floppy_do_sector: status = error
... <snip> ...
floppy_do_sector: 20 retries exhausted
blockdev_get_block: warning: read on 'floppy' failed.
read() failed: -1
I could take QEMU down to -m 12, -m 16, -m 24 in memory and it appeared to still work correctly.
Kevin McGuire wrote: The command cat /dev/floppy appeared to also work as intended and displayed the contents of the floppy block device on screen. However I encountered the error:
floppy_do_sector: status = error
... <snip> ...
floppy_do_sector: 20 retries exhausted
blockdev_get_block: warning: read on 'floppy' failed.
read() failed: -1
What did you do (or what were you doing) when you got that? I mean, it seems it got a read error from the drive, which shouldn't happen on QEMU at all, so was that on QEMU or on real hardware?
The floppy driver doesn't deal with stuff like changing floppies or removing them or stuff like that yet (in any sane way, at least) so if you tried to do that kind of stuff, then an error like the one above would be expected..
Yeah ok, I noticed. Going to fix it as the next thing I'll do.
I also know what the issue is: after I changed block devices to go through the cache, I forgot to add support for device size. There's now generic code for reading block devices, which breaks the read/write request into blocks and then reads/writes the blocks through the cache; the cache in turn goes to the actual device, requesting blocks to be read/written (to the cache).
In my old code, the floppy driver reported end-of-file itself, since it was responding to the read/write requests directly. Now it simply gets a request for a block. I forgot to do two things: the floppy driver should check that the requested block number is indeed valid, and there should be a way for the driver to report the size of a device, so that the generic block device read/write code can check that reads/writes don't go beyond the end of the device.
So ... thanks for catching two bugs.
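For illustration, the two missing checks could look something like this. It is only a sketch under assumptions: blockdev_info, blockdev_clamp and blockdev_block_valid are made-up names, and the real fix may be structured quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device information reported by the driver. */
typedef struct {
    size_t block_size;     /* bytes per block */
    size_t block_count;    /* number of blocks on the device */
} blockdev_info;

/* Generic layer: clamp a byte-range request to the end of the device.
 * Returns how many bytes may be transferred; 0 means end-of-device. */
static size_t blockdev_clamp(const blockdev_info *dev,
                             size_t offset, size_t len) {
    size_t dev_size;
    assert(dev);
    dev_size = dev->block_size * dev->block_count;
    if (offset >= dev_size) return 0;
    if (len > dev_size - offset) len = dev_size - offset;
    return len;
}

/* Driver side: refuse block numbers that do not exist on the medium. */
static int blockdev_block_valid(const blockdev_info *dev, size_t block) {
    assert(dev);
    return block < dev->block_count;
}
```

For a 1.44MB floppy the driver would report 2880 blocks of 512 bytes, so a cat of the whole device ends with a clean end-of-file instead of retrying a read past the last sector.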
Ok fixed. Added about 5 short lines, changed one.
As an extra bonus, I actually bothered to make directory listings show the block device's size.
I won't bother uploading an image just for that, because right now I'm working to fix some other more serious stuff as well, in newer code, and I'd have to backport to get a coherent image.