FAT can't read big files.

Posted: Fri Jun 08, 2018 7:34 am
by GhelloWorld
Hello everyone,

As I said in my other topic, I am still a beginner at developing operating systems. I have been working on a simple OS for about 2 months now and I am very happy with what it can currently do. The next big thing I wanted for my OS was a simple filesystem; the most common one seemed to be FAT32, so I chose that. Right now it can only list the first 16 items of the root directory and read the contents of a file given its filename (8.3 filenames only). Reading a file works as expected until the file is larger than one cluster. For some reason the first couple of bytes are then gone from the data. Here is an example:

I have a file named "MEGA.TXT" which contains only the text "MEGA, " repeated many times. When I read the file I get the following output:
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
etc.
Until this happens:
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, Finding next cluster
Next cluster: 213
EGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA, MEGA
etc.
Every time it starts reading from the next cluster the first couple bytes are gone.

My fat reading code: https://github.com/Remco123/OSDev-Simpl ... t.cpp#L161
And the ata code: https://github.com/Remco123/OSDev-Simpl ... a.cpp#L102
and obviously the repo: https://github.com/Remco123/OSDev-Simple/

Any help would be much appreciated :lol:

Re: FAT can't read big files.

Posted: Fri Jun 08, 2018 8:07 am
by iansjack
Have you tried single-stepping it under a debugger to see what is happening?

What steps have you taken so far to isolate the problem?

Re: FAT can't read big files.

Posted: Fri Jun 08, 2018 8:36 am
by BenLunt
GhelloWorld wrote: Every time it starts reading from the next cluster the first couple bytes are gone.
If you have

Code:

'MEGA, '
which is 6 chars, repeated throughout the file, you do realize that not every cluster (assuming 512-byte clusters) will start with the "M"? 512 divided by 6 leaves a remainder of 2.

The second cluster (assuming 512-byte clusters) will start with

Code:

'GA, MEGA, MEGA, '
Ben
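
To make the arithmetic concrete, here is a small sketch that computes which part of the repeating text a reader would see at a given file offset. It assumes 512-byte clusters and a file that contains nothing but "MEGA, " repeated; `patternAtOffset` is a hypothetical helper written for this illustration, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the first `count` characters a reader would see when it starts at
// byte `fileOffset` of a file that consists only of `pattern` repeated.
// Hypothetical helper, for illustration only.
std::string patternAtOffset(const std::string& pattern,
                            std::size_t fileOffset,
                            std::size_t count)
{
    assert(!pattern.empty());
    std::string out;
    for (std::size_t i = 0; i < count; i++)
        out += pattern[(fileOffset + i) % pattern.size()];
    return out;
}
```

Under those assumptions, `patternAtOffset("MEGA, ", 512, 4)` gives "GA, " for the second cluster. Because every cluster starts at a multiple of 512 and 512 mod 6 = 2, the starting positions within the pattern cycle through 0, 2 and 4, so a cluster can begin with "MEGA", "GA, " or ", ME", but not with "EGA".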

Re: FAT can't read big files.

Posted: Sat Jun 09, 2018 9:09 am
by GhelloWorld
iansjack wrote:Have you tried single-stepping it under a debugger to see what is happening?
The only debug option I currently have is writing directly to the screen. Of course it would be pretty easy to create a simple serial logger, but that would just give me the same output. So no, I have not tried single-stepping yet, but that is definitely something I should consider implementing.
iansjack wrote:What steps have you taken so far to isolate the problem?
So far I have tried loading a BMP image from the disk as well as the text file I mentioned before, and I have confirmed that files smaller than 1 cluster are loaded correctly. Just to be sure, I have also tried different emulators (QEMU and VirtualBox).
BenLunt wrote:
GhelloWorld wrote: Every time it starts reading from the next cluster the first couple bytes are gone.
If you have

Code:

'MEGA, '
which is 6 chars, repeated throughout the file, you do realize that not every cluster (assuming 512-byte clusters) will start with the "M"? 512 divided by 6 leaves a remainder of 2.

The second cluster (assuming 512-byte clusters) will start with

Code:

'GA, MEGA, MEGA, '
Ben
I do understand that the next cluster will indeed not start with "MEGA" again, but this code should just append the buffer read from the cluster to the total buffer.

Code:

                    hd->Read28(fileSector+sectorOffset, tempbuffer, 512);
                    
                    tempbuffer[SIZE > 512 ? 512 : SIZE] = '\0';

                    for(int i = 0; i < 512; i++)
                    {
                        buffer[bytesRead + i] = tempbuffer[i];
                    }

                    printf((char*)tempbuffer);

                    bytesRead += (SIZE > 512 ? 512 : SIZE);
So the final buffer should just have all the contents of the file, right? Or am I missing something?
I will see what happens when I read a file with an even number of characters.

Re: FAT can't read big files.

Posted: Sat Jun 09, 2018 11:36 am
by BenLunt
Any one of a number of things could be in error.

I would check:
1) Make sure that the file is actually written to the disk correctly beforehand. "Physically" dump the first few clusters of the file from the disk to see that they are actually there to begin with. Your code might be correct, and it might be the (emulated) disk that is in error.
2) Print the cluster number within your loop above, making sure that it is the correct next cluster number.
3) Does your Read28() function read 512 bytes or 512 sectors/clusters? Usually, you need to keep file system sizes and block sizes separate. For example, reading from a hard drive (or any other media) should use a count of blocks, since a count of blocks is the only unit of data transfer the media understands. The file system may have a different size of "block", case in point, the cluster size. In my opinion, your Read28() should expect a count of blocks to read, not bytes.
4) If you follow 3) above, your Read28() can read the whole cluster/file at once.
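
For point 2, it can help to work out by hand where the FAT entry for a cluster should be and compare that with what the loop prints. Below is a minimal sketch of a FAT32 lookup. The helper names, the fixed 512-byte sector size and the in-memory copy of the FAT are all assumptions made for this illustration; none of them come from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sector size; a real driver should take this from the BPB.
constexpr std::uint32_t kBytesPerSector = 512;

// Where the FAT32 entry for a cluster lives on disk.
struct FatEntryLocation {
    std::uint32_t sector;          // absolute sector holding the entry
    std::uint32_t offsetInSector;  // byte offset of the entry in that sector
};

// Each FAT32 entry is 4 bytes, so entry N starts at byte N * 4 of the FAT.
FatEntryLocation locateFatEntry(std::uint32_t fatStartSector, std::uint32_t cluster)
{
    std::uint32_t byteOffset = cluster * 4;
    return { fatStartSector + byteOffset / kBytesPerSector,
             byteOffset % kBytesPerSector };
}

// Reads the entry for `cluster` from an in-memory copy of the FAT
// (hypothetical: `fat` holds the raw FAT bytes starting at its first sector).
std::uint32_t nextCluster(const std::vector<std::uint8_t>& fat, std::uint32_t cluster)
{
    std::size_t i = static_cast<std::size_t>(cluster) * 4;
    assert(i + 4 <= fat.size());
    std::uint32_t entry = static_cast<std::uint32_t>(fat[i])
                        | static_cast<std::uint32_t>(fat[i + 1]) << 8
                        | static_cast<std::uint32_t>(fat[i + 2]) << 16
                        | static_cast<std::uint32_t>(fat[i + 3]) << 24;
    return entry & 0x0FFFFFFF;  // the top 4 bits are reserved in FAT32
}

// Values 0x0FFFFFF8 and above mark the last cluster of a file.
bool isEndOfChain(std::uint32_t entry)
{
    return entry >= 0x0FFFFFF8;
}
```

As a worked example, for cluster 213 and a FAT that starts at sector 32 (an assumed value), the entry would sit at byte 340 of sector 33.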

The next comment(s) don't pertain to your current issue, but you might want to keep this in mind for future development.
Keep media, file system, and current buffer/task sizes separate. i.e.:
1) The Media Read (Read28() in your case) should have no idea, nor care at all, what size a cluster is. It only cares about block sizes.
2) The File System Read should have no idea, nor care at all, what size a media block is. It only cares about cluster size.
3) The current task/library read should have no idea, nor care at all, what size a cluster or media block is, it should only care about bytes.

Therefore:
1) The current task/library (usually fread()) uses a count of bytes, which then calls the file system driver:
- 1a) One should also note that the fread() library might cache a set amount of bytes, only reading a set amount of bytes at a time, but this is for a later project.
2) The file system converts the passed count of bytes to a count of clusters, which then calls the media driver:
- 2a) ditto 1a.
3) The media driver converts the passed count of clusters to a count of blocks, which then reads from the media.
- 3a) ditto 2a
(Note, somewhere your conversion routine(s) will need to know both cluster size and block size to be able to convert correctly. It is up to you where this is done. I do it within the file system driver, just before calling the media device driver.)

Note that if the count of bytes to read is less than the current block (for media devices) or cluster (for file system drivers), it is up to that driver to extract only the requested amount of bytes. For example, (excluding any fread() caches), if the fread() requests 128 bytes, the file system still reads a cluster or a count of clusters to read at least 128 bytes, which then calls the media device driver which converts that amount to blocks, reading a block or count of blocks to satisfy the count of clusters.
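
The layering described above could be sketched roughly as follows. This is only an illustration under several assumptions: `RamDisk`, `SimpleFs` and `readBytes` are made-up names, the "disk" is a byte vector in memory, and the file is assumed to occupy consecutive clusters (a real FAT driver would follow the cluster chain instead). The cluster-to-block conversion is done in the file system layer, just before calling the media layer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Media layer: knows only about blocks.
struct RamDisk {
    static constexpr std::size_t kBlockSize = 512;
    std::vector<std::uint8_t> data;  // stands in for the real device

    // Reads `count` whole blocks starting at block `lba` into `out`.
    void readBlocks(std::uint32_t lba, std::uint32_t count, std::uint8_t* out) const
    {
        assert((static_cast<std::size_t>(lba) + count) * kBlockSize <= data.size());
        std::memcpy(out, data.data() + lba * kBlockSize, count * kBlockSize);
    }
};

// File system layer: thinks in clusters and converts them to blocks.
struct SimpleFs {
    const RamDisk& disk;
    std::uint32_t blocksPerCluster;  // e.g. 8 for 4 KiB clusters
    std::uint32_t dataStartBlock;    // first block of cluster 0 in this toy layout

    std::size_t clusterSize() const { return blocksPerCluster * RamDisk::kBlockSize; }

    void readCluster(std::uint32_t cluster, std::uint8_t* out) const
    {
        disk.readBlocks(dataStartBlock + cluster * blocksPerCluster, blocksPerCluster, out);
    }
};

// Library layer: thinks only in bytes. Reads `byteCount` bytes starting at
// `byteOffset` of a file whose clusters are assumed to be consecutive,
// beginning at `firstCluster`. Returns the number of bytes copied.
std::size_t readBytes(const SimpleFs& fs, std::uint32_t firstCluster,
                      std::size_t byteOffset, std::size_t byteCount, std::uint8_t* out)
{
    std::vector<std::uint8_t> cluster(fs.clusterSize());
    std::size_t done = 0;
    while (done < byteCount) {
        std::size_t pos = byteOffset + done;
        std::uint32_t index = static_cast<std::uint32_t>(pos / fs.clusterSize());
        std::size_t within = pos % fs.clusterSize();
        // Whole clusters are always read; only the requested part is extracted.
        std::size_t chunk = std::min(fs.clusterSize() - within, byteCount - done);
        fs.readCluster(firstCluster + index, cluster.data());
        std::memcpy(out + done, cluster.data() + within, chunk);
        done += chunk;
    }
    return done;
}
```

A request for 128 bytes still causes at least one full cluster to be read, and that cluster read is turned into a read of whole blocks, which is the behaviour the paragraph above describes.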

As for fread() caches, you can now see why the library might want to use a cache if the file were read one byte at a time (such as with fgetc()). The library cache initialization code could query the file system driver for a cluster size, setting the library's cache to that size. Then again, the file system driver, if it wishes to have a cache, could query the block device for a block size and cache that amount of data.
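
A rough sketch of such a library-level cache is shown below. `CachedReader` and its fill callback are hypothetical names invented for this example; the callback stands in for whatever call the library would make into the file system driver, and the cache size would be whatever the driver reports as its cluster size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Serves single-byte reads from a cache that is refilled one chunk at a time.
class CachedReader {
public:
    // Fills `dst` with up to `capacity` bytes starting at `fileOffset` and
    // returns how many bytes were written (0 at end of file).
    using FillFn = std::function<std::size_t(std::size_t fileOffset,
                                             std::uint8_t* dst,
                                             std::size_t capacity)>;

    CachedReader(std::size_t cacheSize, FillFn fill)
        : cache_(cacheSize), fill_(std::move(fill))
    {
        assert(cacheSize > 0);
    }

    // Returns the next byte of the file, or -1 at end of file.
    int getByte()
    {
        if (pos_ == valid_) {  // cache used up: ask the lower layer for more
            valid_ = fill_(offset_, cache_.data(), cache_.size());
            pos_ = 0;
            offset_ += valid_;
            if (valid_ == 0)
                return -1;
        }
        return cache_[pos_++];
    }

private:
    std::vector<std::uint8_t> cache_;
    FillFn fill_;
    std::size_t offset_ = 0;  // file offset of the next refill
    std::size_t pos_ = 0;     // next unread byte in the cache
    std::size_t valid_ = 0;   // number of valid bytes in the cache
};
```

With a 512-byte cache, reading a 1300-byte file one byte at a time reaches the lower layer only four times (three refills with data plus the one that reports end of file) instead of 1300 times.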

Anyway, enough for future development notes. However, please keep this in mind since early development dictates what future development might look like.

Ben
- http://www.fysnet.net/osdesign_book_series.htm

Re: FAT can't read big files.

Posted: Sat Jun 09, 2018 11:59 am
by iansjack
I hope that you are running this under an emulator or VM. (To try to run on just real hardware at this stage is crazy.) In which case, what's the problem with running under a debugger?

Re: FAT can't read big files.

Posted: Sun Jun 10, 2018 4:38 am
by simeonz
If I am not mistaken, the bytesRead in your FAT code appears to be uninitialized.
On a side note, your printf implementation may not currently handle format strings, but in general, you should not pass arbitrary data buffers (like tempbuffer) as the format string argument.
Other than that, I am not sure what the problem is, but I can make two observations along the lines of BenLunt's residue remark. First, your first cluster read (assuming that the output you provided is the end of the first cluster) ends perfectly aligned on 6 bytes, which shouldn't be happening. Also, the next cluster begins on a file position 1 modulo 6 (meaning file pos = 6n+1), which is impossible for any cluster from the file. I would also temporarily comment out all redundant checks and branches in the ATA code, just in case, since you are reading full sectors from the FS layer anyway.
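
As a minimal sketch of the first two remarks, the copy step could look something like the code below. It is not a diagnosis of the actual bug. `appendSector` is a hypothetical helper, `fileSize` is assumed to come from the directory entry, and `bytesRead` is assumed to be initialized to 0 by the caller before the first sector is read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kSectorSize = 512;

// Appends the useful part of one freshly read sector to the file buffer.
// `bytesRead` is the running total and must start at 0.
// Returns the number of bytes appended.
std::size_t appendSector(std::uint8_t* buffer, std::size_t& bytesRead,
                         std::size_t fileSize, const std::uint8_t* sector)
{
    assert(bytesRead <= fileSize);
    std::size_t remaining = fileSize - bytesRead;
    // Copy a full sector, except for the last one, which may be partial.
    std::size_t chunk = remaining < kSectorSize ? remaining : kSectorSize;
    std::memcpy(buffer + bytesRead, sector, chunk);
    bytesRead += chunk;
    return chunk;
}
```

Because the length is tracked explicitly, nothing needs to be written into the sector buffer to terminate it. If tempbuffer is exactly 512 bytes, the original `tempbuffer[512] = '\0'` would write one byte past its end. For printing, passing the data together with its length (or through a fixed format string, once printf supports one) avoids treating file contents as a format string.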

Re: FAT can't read big files.

Posted: Mon Jun 11, 2018 11:11 am
by GhelloWorld
Thank you guys so far for all your responses.
So far I have tried a couple of the suggestions you gave me, but the issue still persists. Just to be sure, I have also reformatted the drive with another Linux distro to see if that was the issue, but it is not (I used Ubuntu first and then Tiny Core). So I think that the emulated disk is not the problem. The disk now holds only one file, named EVEN.TXT, which contains the text "TESTABC," repeated a bunch of times to check whether it works with an even number of characters, but as expected the result is still the same. I am also 99% sure that the Read28 function reads 512 bytes, but I have only been programming in C++ for 3 months now, so I could be wrong.
iansjack wrote:I hope that you are running this under an emulator or VM. (To try to run on just real hardware at this stage is crazy.) In which case, what's the problem with running under a debugger?
Yes, I am running this in an emulator, but I don't really see why a debugger would help me in this particular case. It would be very useful if the OS crashed, for example, but again this is not the case. I have a very simple printf function that does not handle format strings and I feel like that should be enough to debug this issue.
simeonz wrote:If I am not mistaken, the bytesRead in your FAT code appears to be uninitialized.
On a side note, your printf implementation may not currently handle format strings, but in general, you should not pass arbitrary data buffers (like tempbuffer) as the format string argument.
Other than that, I am not sure what the problem is, but I can make two observations along the lines of BenLunt's residue remark. First, your first cluster read (assuming that the output you provided is the end of the first cluster) ends perfectly aligned on 6 bytes, which shouldn't be happening. Also, the next cluster begins on a file position 1 modulo 6 (meaning file pos = 6n+1), which is impossible for any cluster from the file. I would also temporarily comment out all redundant checks and branches in the ATA code, just in case, since you are reading full sectors from the FS layer anyway.
Thanks for noticing that the bytesRead was not initialized, I fixed that but it did not solve the problem unfortunately. Removing all the checks in the ATA code also did nothing. Do you have any clue why the clusters are acting so strange?

I will try to think of any other possible causes for this issue, but right now I am desperate. Perhaps making a filesystem was a bit too hard for a beginner in osdev :D

Re: FAT can't read big files.

Posted: Mon Jun 11, 2018 12:03 pm
by iansjack
GhelloWorld wrote:Yes, I am running this in an emulator, but I don't really see why a debugger would help me in this particular case. It would be very useful if the OS crashed, for example, but again this is not the case.
Single-stepping through the code under a debugger - or setting breakpoints at critical points - would allow you to inspect the state of variables, buffers, registers, etc., to see what is happening and determine at which point things are not going as expected. I would have thought this was a useful step towards tracking down your error.

The cause of errors is often obvious when you see exactly where the code is failing. It's certainly a whole lot better than guessing.

Re: FAT can't read big files.

Posted: Mon Jun 11, 2018 1:36 pm
by zaval
It might be that you have errors in your Console class functions. Maybe you've messed up some global variables (like YOffset, XOffset) or something. But I was too lazy to search there. Try to printf something similar but without the FAT reading, like:

Code:


int SomeVariable = 10;

for(int i = 0; i < 2; i++){
    // Same output pattern as the FAT code, but with no disk access involved.
    printf("MEGA, MEGA, MEGA, MEGA, ");
    printf("finding next cluster\n");
    printf(Convert::itoa(SomeVariable));
    printf("\n");
}

Re: FAT can't read big files.

Posted: Mon Jun 11, 2018 4:45 pm
by BenLunt
GhelloWorld wrote:Yes, I am running this in an emulator, but I don't really see why a debugger would help me in this particular case...
A good debugger will allow you to step through the code looking for errors. I have created a small (very simple) tutorial specifically for this reply. Have a look at http://www.fysnet.net/bochsdbg/index.htm.

This debugger allows you to step through your code, looking at register states, memory states, etc. You can use it to watch your buffer, watch your read code, etc.

Hope this helps. If you have any questions, please let me know. I might make a new topic just for this debugger question.

Ben

Re: FAT can't read big files.

Posted: Tue Jun 12, 2018 1:02 am
by iansjack
One case where a good debugger is invaluable is when code is overwriting a variable or the stack. The error that this triggers is often in a part of the code totally separate from that where the corruption occurs. In a large code base it can be almost impossible to track down this sort of error just by inspecting the code. But, using gdb, set a watch on the area of memory that is being corrupted; that way the execution breaks at the point where the memory is changed. Knowing where the real error occurs makes it far easier to trace the cause.

Re: FAT can't read big files.

Posted: Thu Jun 14, 2018 3:12 am
by GhelloWorld
iansjack wrote:
GhelloWorld wrote:Yes, I am running this in an emulator, but I don't really see why a debugger would help me in this particular case. It would be very useful if the OS crashed, for example, but again this is not the case.
Single-stepping through the code under a debugger - or setting breakpoints at critical points - would allow you to inspect the state of variables, buffers, registers, etc., to see what is happening and determine at which point things are not going as expected. I would have thought this was a useful step towards tracking down your error.

The cause of errors is often obvious when you see exactly where the code is failing. It's certainly a whole lot better than guessing.
I think I misunderstood a bit how debugging a kernel works. I thought that I needed to implement a whole gdb stub or write a debugger myself; I had no idea that Bochs, for example, has one built in. Because Bochs seemed the easiest one to use, I tried to get it working on my Linux machine, and after some trouble I finally managed it. I think I would prefer the GUI debugger, but that doesn't seem to work on my machine. So I tried compiling it for Windows, but then I would need to download the whole of VS2013 just to compile Bochs, and that seemed a bit stupid. Right now I have peter-bochs installed and that seems to work pretty nicely. The only thing is that I really need to start learning assembly, because all the output is in assembly.
BenLunt wrote: Have a look at http://www.fysnet.net/bochsdbg/index.htm
Thank you Ben for posting that tutorial, I think it will be very useful for me and other beginners. But as I have said before, for debugging I really need to learn assembly.
About the FAT issue: I think I am out of options right now. When I am further along in development I will try implementing FAT again and see if the issue still happens; if it does, I will let you guys know. I want to thank you all for helping me with this. You have been very helpful, for example by recommending a debugger.
I almost forgot: zaval, I have tested the printf function, and it works exactly like it should.

Re: FAT can't read big files.

Posted: Thu Jun 14, 2018 3:26 am
by iansjack
You can use gdb in conjunction with qemu for kernel debugging. This requires nothing extra of your code, just that you start qemu with the appropriate options. gdb allows you to work with assembler or C/C++. So you can step through C code, display (and alter) C variables - including structures, set breakpoints in C code and set watches (which break into the program when a variable changes) on C variables. It is an order of magnitude easier to work with C code this way than with an assembly-only debugger.

https://wiki.osdev.org/Kernel_Debugging ... _with_QEMU

Re: FAT can't read big files.

Posted: Thu Jun 14, 2018 7:29 am
by zaval
GhelloWorld wrote:I almost forgot: zaval, I have tested the printf function, and it works exactly like it should.
But have you rechecked your Console code? It might work perfectly when the "MEGA, " string is short, but produce errors when it gets long. Try with a much longer string.