Why we address RAM directly but use file system for HDD?
I tried to be brief in my topic title, sacrificing clarity.
To clarify: what is the good reason that in most of our programs we address memory directly, while silently accepting that permanent storage like the hard drive is accessed through the abstraction layer known as the file system?
I am after a good reason. I wonder if there is one, though, or is it only for historical reasons? After all, RAM and the HDD are not very different beasts; in fact, modern operating systems even offer APIs to use either as the other: we have swap/paging, and we have memory-mapped files.
Is there a good reason not to [more or less safely] address bytes on the disk, as we do with RAM? Is it because an area on the disk not belonging to us is a complete taboo to access?
I am just wondering here...
Of course, since I mention "addressing memory", perhaps we can assume C/C++ applications on a POSIX system.
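For reference, a minimal sketch of the memory-mapped file route mentioned in the question, assuming a POSIX system. The helper name `poke_file_byte` is hypothetical and only there for illustration. It shows a byte on disk being changed with an ordinary pointer store, while the file system still decides which bytes belong to the file.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the file at `path` and store `value` at byte
 * `offset` through a plain pointer. Returns 0 on success, -1 on failure. */
int poke_file_byte(const char *path, size_t offset, unsigned char value)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0 || offset >= (size_t)st.st_size) {
        close(fd);
        return -1;
    }
    size_t length = (size_t)st.st_size;

    unsigned char *bytes = mmap(NULL, length, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (bytes == MAP_FAILED)
        return -1;

    bytes[offset] = value;                  /* an ordinary memory store */
    int rc = msync(bytes, length, MS_SYNC); /* ask for it to reach the disk */
    munmap(bytes, length);
    return rc;
}
```

Something like `poke_file_byte("data.bin", 2, 'X')` would overwrite the third byte of an existing file. Note that the name lookup and the bounds check still come from the file system; only the access itself looks like RAM.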
Re: Why we address RAM directly but use file system for HDD?
I'm not really sure of the answer, but the concept you describe, where RAM and disk storage are treated alike, is one that has been used. The most notable example is the very successful OS/400, which IBM's iSeries computers run. It's called single-level storage.
It would certainly be an interesting idea to implement in a hobby OS. It would make a change from the many POSIX clones.
Re: Why we address RAM directly but use file system for HDD?
> we address RAM directly
That's not really true. The first layer (can I call it an abstraction?) is physical memory: the chipset can use something like memory bank interleaving or MMIO, but you see only a linear address space. The second is virtual memory (different kinds of segmentation and paging). The third is the data structures and classes of the programming language.
The purpose of these layers is almost the same as that of a file system: divide resources between tasks, enforce access control and hide (abstract) some implementation details.
Re: Why we address RAM directly but use file system for HDD?
Loosely speaking, physical memory is a class of storage at about the same layer as disk storage, and you can put a file system on top of it: that's called a RAM disk.
Re: Why we address RAM directly but use file system for HDD?
I guess the problem then is not whether to have or not to have abstractions like a file system. Obviously, the MMU is an abstraction layer as well, albeit implemented in hardware and mandated by the operating system. Still, the MMU only manages security of memory access - it does not introduce "files" with their own names, creation and modification dates etc, which a file system typically does.
In that sense, a pure disk access system (call it a DMU, for Disk Management Unit, by analogy with the MMU; "disk" for lack of a better short term for slow permanent storage) should only concern itself with safely multiplexing the storage medium. That includes not letting separate entities garble, and optionally peek into, each other's data. A file system is much more than that!
This is what exokernels do, but I am wondering how we got into this situation in the first place. Is there a GOOD reason to impose a file system across the entire software stack? A prison from which one cannot escape? Yes, I admit 99% of all applications out there would not even notice the absence of a file system, provided they had a way to store data on disk, but the 1% that would benefit would benefit greatly. This is suggested, among other things, by research into MIT's own exokernel implementation running the Cheetah webserver.
Still, without a file system, would we really suffer that much? I mean, we do crawl along accessing memory as chunks of one big array, don't we? I guess I can introduce a sub question - do we strip file systems into simpler constructs, or do we rather impose a file-like system onto RAM as well, doing away with array-like access? Or both for both?
Re: Why we address RAM directly but use file system for HDD?
amn wrote:
> I guess the problem then is not whether to have or not to have abstractions like a file system. Obviously, the MMU is an abstraction layer as well, albeit implemented in hardware and mandated by the operating system. Still, the MMU only manages security of memory access - it does not introduce "files" with their own names, creation and modification dates etc, which a file system typically does.
That's the job of the OS. It usually associates a memory context with a running process, which has a name, a creation date, etc. All of this metadata is stored somewhere in RAM, just as file system metadata is stored on disk.
> A file system is much more than that!
So what is a file system?
> Still, without a file system, would we really suffer that much? I mean, we do crawl along accessing memory as chunks of one big array, don't we? I guess I can introduce a sub question - do we strip file systems into simpler constructs, or do we rather impose a file-like system onto RAM as well, doing away with array-like access? Or both for both?
Can you do without malloc()? I guess so. Would it be a good idea? Probably not. It's the same for using raw disks without a file system.
-
- Member
- Posts: 170
- Joined: Wed Jul 18, 2007 5:51 am
Re: Why we address RAM directly but use file system for HDD?
It is necessary to have a file system on disk if files are larger than the memory address space.
This happens all the time on 32bit CPUs.
It would be less of an issue on 64bit CPUs, but then there are some awfully huge databases that may be more than 2^63 bytes in size.
Re: Why we address RAM directly but use file system for HDD?
I have thought of another reason why it is necessary to have a file system on disk.
Journalling.
Disks need to be able to roll back to a known good state.... I guess this could be done at the code level but it makes writing code a lot more complicated.
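To make "roll back to a known good state" concrete, here is a toy sketch of one way to do it: an undo log that saves the old contents of a block before overwriting it. Everything here is an illustrative assumption (the names `journaled_disk`, `jd_write`, `jd_commit` and `jd_rollback`, the tiny block sizes, and the in-memory array standing in for the disk); it is not how any particular file system implements journalling.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE    16
#define BLOCK_COUNT   8
#define JOURNAL_SLOTS 4

/* One saved "before" image of a block. */
struct undo_record {
    int block;
    unsigned char old[BLOCK_SIZE];
};

/* Toy disk: an in-memory array stands in for the real medium. */
struct journaled_disk {
    unsigned char blocks[BLOCK_COUNT][BLOCK_SIZE];
    struct undo_record journal[JOURNAL_SLOTS];
    int pending; /* number of uncommitted writes */
};

/* Save the old block contents, then overwrite. Returns 0 or -1. */
int jd_write(struct journaled_disk *d, int block, const unsigned char *data)
{
    assert(d != NULL && data != NULL);
    if (block < 0 || block >= BLOCK_COUNT || d->pending == JOURNAL_SLOTS)
        return -1;
    struct undo_record *r = &d->journal[d->pending++];
    r->block = block;
    memcpy(r->old, d->blocks[block], BLOCK_SIZE);
    memcpy(d->blocks[block], data, BLOCK_SIZE);
    return 0;
}

/* The current state becomes the new known good state. */
void jd_commit(struct journaled_disk *d)
{
    d->pending = 0;
}

/* Undo uncommitted writes, newest first, restoring the last good state. */
void jd_rollback(struct journaled_disk *d)
{
    while (d->pending > 0) {
        struct undo_record *r = &d->journal[--d->pending];
        memcpy(d->blocks[r->block], r->old, BLOCK_SIZE);
    }
}
```

Nothing in this sketch needs file names or directories, so the same bookkeeping could in principle live in application code, at the cost of every application carrying it.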
Re: Why we address RAM directly but use file system for HDD?
I don't think it has anything to do with the size of the address space (after all, even with a file system we need to use pointers in some form that are large enough to encompass the whole disk) or even journalling (you can easily envisage journalling without a file system). It's a simple matter of imposing some sort of order on the objects that we wish to use, particularly persistent objects. The most convenient way to refer to persistent objects is by name rather than by address, and we further structure those names with hierarchies of directories. Otherwise, managing all those objects would be nigh impossible. That's what we call a file system, although the term also covers the underlying algorithms and on-disk structures used to manage it.
OS/400 does the opposite of what has been discussed here (treating disks as raw data); rather it treats everything as named objects. But the programmer doesn't care whether those objects are stored in RAM or on DASD. They are just stored somewhere in the single level of storage. The OS takes care of all that.
Re: Why we address RAM directly but use file system for HDD?
tom9876543 wrote:
> It is necessary to have a file system on disk if files are larger than the memory address space.
This doesn't make any sense. Whether or not to have a file system is completely orthogonal to the size of the disk, let alone the memory.
> but then there are some awfully huge databases that may be more than 2^63 bytes in size.
I wonder where you get the storage for these databases from. And even if you had it, common file systems don't support file sizes up to 2^63 bytes.
tom9876543 wrote:
> I have thought of another reason why it is necessary to have a file system on disk.
> Journalling.
Journaling is mostly an optimisation, not a feature per se.
Re: Why we address RAM directly but use file system for HDD?
Alright, today you can have an 8 gigabyte MP4 movie file.
In the theoretical "no file system" world.....
Please explain how a 32 bit CPU would CREATE this "movie", if there was no file system.
Then explain how the CREATE application would share the new "movie" with the PLAYBACK application and the EMAIL application.
Then explain how the CREATE application would manage creating 50 or 5000 different movies.
Thanks.
- Kazinsal
- Member
- Posts: 559
- Joined: Wed Jul 13, 2011 7:38 pm
- Libera.chat IRC: Kazinsal
- Location: Vancouver
- Contact:
Re: Why we address RAM directly but use file system for HDD?
tom9876543 wrote:
> Please explain how a 32 bit CPU would CREATE this "movie", if there was no file system.
PAE? Paging to disk?
tom9876543 wrote:
> Then explain how the CREATE application would share the new "movie" with the PLAYBACK application and the EMAIL application.
Same way any other OS that doesn't happen to have 8GB of free memory does it. Load a bit at a time, use it, rinse and repeat.
tom9876543 wrote:
> Then explain how the CREATE application would manage creating 50 or 5000 different movies.
If your create tool and/or kernel can't handle the logic to do something more than once, you've written a really poor create tool and/or kernel.
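"Load a bit at a time, use it, rinse and repeat" could look roughly like the sketch below, which walks an object of any size through a fixed 64 KiB buffer. The helper name `sum_bytes_in_chunks` is hypothetical, and `_FILE_OFFSET_BITS` is assumed to be the glibc-style switch that gives a 64-bit `off_t` on a 32-bit CPU. The point is that the position on disk is a 64-bit offset rather than a pointer, so the size of the address space never enters into it.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64 /* assumption: glibc-style 64-bit off_t on 32-bit CPUs */

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: add up every byte of the object at `path` while
 * holding at most 64 KiB of it in memory. Returns 0 on success, -1 on error. */
int sum_bytes_in_chunks(const char *path, uint64_t *sum_out)
{
    assert(path != NULL && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    unsigned char window[64 * 1024]; /* the only part ever held in RAM */
    uint64_t sum = 0;
    off_t pos = 0;                   /* a position on disk, not a pointer */

    for (;;) {
        ssize_t got = pread(fd, window, sizeof window, pos);
        if (got < 0) {
            close(fd);
            return -1;
        }
        if (got == 0)
            break;                   /* end of the object */
        for (ssize_t i = 0; i < got; i++)
            sum += window[i];
        pos += got;
    }

    close(fd);
    *sum_out = sum;
    return 0;
}
```

On a typical Linux system the same function should also accept a block device path, since `open` and `pread` work on those as well, though that is untested here.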
Re: Why we address RAM directly but use file system for HDD?
tom9876543 wrote:
> Alright, today you can have an 8 gigabyte MP4 movie file.
> In the theoretical "no file system" world.....
> Please explain how a 32 bit CPU would CREATE this "movie", if there was no file system.
> Then explain how the CREATE application would share the new "movie" with the PLAYBACK application and the EMAIL application.
> Then explain how the CREATE application would manage creating 50 or 5000 different movies.
You can create an 8 GB partition with no file system on it. Copy your movie to it and you'll see what it's like. (And if you think that using partitions is cheating because a partition table is almost like a minimal file system, you can just as well take a whole disk.)
If you're lucky, even your video player and email program will cope with whole block devices; I've never tried this.
Re: Why we address RAM directly but use file system for HDD?
Blacklight wrote:
> tom9876543 wrote:
> > Please explain how a 32 bit CPU would CREATE this "movie", if there was no file system.
> PAE? Paging to disk?
> tom9876543 wrote:
> > Then explain how the CREATE application would share the new "movie" with the PLAYBACK application and the EMAIL application.
> Same way any other OS that doesn't happen to have 8GB of free memory does it. Load a bit at a time, use it, rinse and repeat.
> tom9876543 wrote:
> > Then explain how the CREATE application would manage creating 50 or 5000 different movies.
> If your create tool and/or kernel can't handle the logic to do something more than once, you've written a really poor create tool and/or kernel.
1)
Let's assume it's not an x86 processor and there is a 4GB virtual address space.
Paging to disk: can you explain how the APPLICATION knows which sectors to use on the disk? The disk would have to be ORGANISED into a file system?
2)
A person runs the CREATE application and closes it.
The person then runs the EMAIL application.
How do they share data?
3)
OK, so the CREATE application has 5000 different movies.... good luck going back to it a week later and finding the one you want to work on.
The CREATE application would have to have its own little database similar to the Mac Finder.... and then ALL applications would need their own "finder", which is a significant amount of coding effort, and they would all be inconsistent.
Re: Why we address RAM directly but use file system for HDD?
tom9876543 wrote:
> Paging to disk - can you explain how does the APPLICATION know which sectors to use on the disk? The disk would have to be ORGANISED into a file system?
We're treating the HDD as we do RAM here. The kernel has some sort of table internally to keep track of what's used. This would be stored somewhere on the disk for later use and reuse. It's a necessity for any sane organization of storage, RAM or HDD or CD or what have you.
tom9876543 wrote:
> Person runs the CREATE application and closes it.
> Person then runs the EMAIL application.
> How do they share data?
Same way as you do with a file system, just instead of a file name, use the linear address of a sector. It's not a pretty system, but it works. Alternatively, if you don't close CREATE first, it can pass the address directly to EMAIL.
tom9876543 wrote:
> OK so the CREATE application has 5000 different movies.... good luck going back to it a week later and finding the one you want to work on.
> The CREATE application would have to have its own little database similar to MAC finder.... and then ALL applications would need their own "finder" which is a significant amount of coding effort and they would all be inconsistent.
So... a filename database?
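A rough sketch of the kind of table being argued about: a record of which sectors are in use, plus a name for each allocated run so it can be found again later. All names (`disk_table`, `table_alloc`, `table_lookup`), the 512-byte sector size and the fixed 16-entry limit are illustrative assumptions, and the table lives in memory here rather than on disk. Once the name column is added, this is arguably already a minimal file system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define MAX_OBJECTS 16

/* One named run of consecutive sectors. */
struct extent {
    char name[32];
    uint64_t first_sector;
    uint64_t sector_count;
};

/* Toy allocation table; sectors are handed out front to back, never freed. */
struct disk_table {
    uint64_t total_sectors;
    uint64_t next_free;
    int used;
    struct extent entries[MAX_OBJECTS];
};

void table_init(struct disk_table *t, uint64_t total_sectors)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    t->total_sectors = total_sectors;
}

/* Returns the first sector of the named object, or -1 if there is none. */
int64_t table_lookup(const struct disk_table *t, const char *name)
{
    for (int i = 0; i < t->used; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return (int64_t)t->entries[i].first_sector;
    return -1;
}

/* Reserves enough sectors for `bytes` under `name`.
 * Returns the first sector, or -1 if the request cannot be met. */
int64_t table_alloc(struct disk_table *t, const char *name, uint64_t bytes)
{
    uint64_t sectors = (bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
    if (sectors == 0 || sectors > t->total_sectors - t->next_free)
        return -1;
    if (t->used == MAX_OBJECTS || strlen(name) >= sizeof t->entries[0].name)
        return -1;
    if (table_lookup(t, name) >= 0)
        return -1; /* name already taken */

    struct extent *e = &t->entries[t->used++];
    strcpy(e->name, name);
    e->first_sector = t->next_free;
    e->sector_count = sectors;
    t->next_free += sectors;
    return (int64_t)e->first_sector;
}
```

With this, CREATE could call `table_alloc(&t, "movie-0001", size)` and EMAIL could later call `table_lookup(&t, "movie-0001")` to get the sector address, which is the sharing-by-name scenario from the thread.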