Why we address RAM directly but use file system for HDD?

tom9876543
Member
Posts: 170
Joined: Wed Jul 18, 2007 5:51 am

Re: Why we address RAM directly but use file system for HDD?

Post by tom9876543 »

Blacklight wrote:
tom9876543 wrote: Paging to disk - can you explain how the APPLICATION knows which sectors to use on the disk? The disk would have to be ORGANISED into a file system?
We're treating a HDD as we do RAM here. The kernel has a table of some sort internally to keep track of what's used. This would be stored somewhere on the disk for later use and reuse. It's a necessity for any sane organization of storage, RAM or HDD or CD or whatever have you.
tom9876543 wrote:Person runs the CREATE application and closes it.
Person then runs the EMAIL application.
How do they share data?
Same way as you do with a filesystem, just instead of a file name, use a linear address of a sector. It's not a pretty system, but it works. Alternately, if you don't close create first, then it can pass the address directly to email.
tom9876543 wrote:OK so the CREATE application has 5000 different movies.... good luck going back to it a week later and finding the one you want to work on.
The CREATE application would have to have its own little database similar to MAC finder.... and then ALL applications would need their own "finder" which is a significant amount of coding effort and they would all be inconsistent.
So... a filename database?
1) Treating the HDD the same as RAM? Are you sure about that? That suggests to me that the process (application) uses ONLY memory pointers to access "memory" and has no concept of disk. Now what happens when the pointer is 32 bits and the file is larger than 4 GB?

2) That would NOT work. How does the EMAIL application even know what movies the CREATE application has created? Apparently the CREATE application will have its own internal list of movies. The CREATE application has closed so the EMAIL application has no idea about the internals of the CREATE application.

3) You seem to agree it is stupid for each APPLICATION to have its own MAC finder / filename database. I think iTunes is a good example of an application having its own database and not really requiring a file system. But is it practical for EVERY application to have its own different interface to manage documents?
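
To put rough numbers on point 1, here is a small sketch. The 5 GiB file size is an assumed example, not a figure from the thread.

```python
POINTER_BITS = 32
ADDRESSABLE = 2 ** POINTER_BITS      # bytes reachable through one flat 32-bit pointer
FILE_SIZE = 5 * 2 ** 30              # assumed example: a 5 GiB movie

# Everything past the 4 GiB mark has no 32-bit address at all.
unreachable = max(0, FILE_SIZE - ADDRESSABLE)

# An offset into the file needs more bits than the pointer provides.
bits_needed = (FILE_SIZE - 1).bit_length()
```

With these numbers, 1 GiB of the file cannot be named by any 32-bit pointer, and a byte offset into it needs 33 bits.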
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,

In both cases you've typically got something to identify the data, and the data itself. Humans naturally prefer to use names as identifiers (e.g. "hello.txt" or "char *myTextBuffer"). Computers naturally prefer to use numbers as identifiers. To make both people and computers happy you have "name to number" conversion.

For file systems the "name to number" conversion means using a file name to find a location on disk. For programming languages the "name to number" conversion means using a symbol (e.g. variable name) to find a virtual address.

The main difference is when this conversion occurs. For file systems the conversion typically occurs when the file is opened; and for programming languages the conversion typically happens when you compile the source code.
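
As a rough illustration of that timing difference, consider the sketch below. `DIRECTORY`, `SYMBOLS`, `open_file`, `compile_reference`, and all the numbers are made up for this example. A file name is looked up every time the file is opened, while a symbol is looked up once, ahead of time, and only the number survives.

```python
# Hypothetical "directory": file name -> starting block on disk.
DIRECTORY = {"hello.txt": 2048, "notes.txt": 4096}

def open_file(name):
    """Name-to-number conversion done at run time, on every open."""
    if name not in DIRECTORY:
        raise FileNotFoundError(name)
    return DIRECTORY[name]

# Hypothetical "symbol table": variable name -> virtual address.
SYMBOLS = {"myTextBuffer": 0x601040}

def compile_reference(symbol):
    """Name-to-number conversion done once, at 'compile time'.

    The returned function has the address baked in and never
    consults SYMBOLS again.
    """
    address = SYMBOLS[symbol]
    return lambda: address

load_buffer = compile_reference("myTextBuffer")

# Dropping the symbol after "compilation" does not affect compiled code...
del SYMBOLS["myTextBuffer"]
resolved_address = load_buffer()

# ...but dropping a directory entry breaks every later open by that name.
first_block = open_file("hello.txt")
del DIRECTORY["hello.txt"]
```

After the last line, `load_buffer()` still works, whereas `open_file("hello.txt")` would raise `FileNotFoundError`.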

Of course programming tends to use a lot of indirection - e.g. "the data that is pointed to by the data at myName", where only the pointer has a name and the data the pointer points to doesn't. In theory file systems could do this too (e.g. a symbolic link with a name that points to a second file that has no name) but I doubt anybody has ever actually wanted to bother with unnamed files.

The other difference is security. Some data is "owned" by only one process and no security is needed, and some data you want potentially many processes to be able to access and some sort of security is needed. Where some sort of security is needed you need some sort of permission check where the data can't be accessed if permission is denied (and if you want you could replace "open(fileName)" with "check_if_I_have_permission(fileName)").
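
A minimal sketch of a permission check folded into the name lookup, assuming a made-up access list keyed by user id. `ACL`, `LOCATION`, and `open_checked` are illustrative names, not any real OS interface.

```python
import errno

# Hypothetical per-file access lists: file name -> set of permitted user ids.
ACL = {"payroll.db": {0, 1000}, "readme.txt": {0, 1000, 1001}}
# Hypothetical on-disk locations: file name -> starting block.
LOCATION = {"payroll.db": 8192, "readme.txt": 16}

def open_checked(name, uid):
    """Resolve a name to a location only if `uid` is allowed to access it."""
    if name not in LOCATION:
        raise FileNotFoundError(errno.ENOENT, "no such file", name)
    if uid not in ACL[name]:
        raise PermissionError(errno.EACCES, "permission denied", name)
    return LOCATION[name]
```

Data owned by a single process needs nothing like this: the address is already known and nobody else can reach it.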

Basically you end up with 2 completely different things for completely different purposes - something where names/identifiers can be converted into numbers in advance, having data without a name can make sense and no security is needed; and something where the names/identifiers can't be converted into numbers in advance, having data without a name doesn't make much sense and security is needed.

Note that this has nothing to do with disk vs. RAM - you can have swap space, and you can have file systems in RAM. However, different types of hardware have different characteristics - RAM is typically faster than disk, RAM is typically much smaller than disk and RAM is typically volatile. Because of these different characteristics, it makes sense to use faster/smaller/volatile RAM for software and slower/larger/non-volatile storage for file systems.

Some time in the next 5 years I'm expecting non-volatile RAM to become an economically viable option. If/when this happens it may make sense to shift (at least some) file systems into RAM (e.g. files that are accessed often). Even if all data could be in non-volatile RAM (which is unlikely due to size differences) you'd still want a file system due to the different usage (name lookup and permission checks on "open()"). Offtopic note: economically viable non-volatile RAM would be an interesting thing for OS research that could radically change the way (some) OSs are designed; but in the end I suspect that it'll only be used for "hibernate" and persistent VFS caches and there won't be any significant change to OS design.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
linguofreak
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Post by linguofreak »

Brendan wrote:Offtopic note: economically viable non-volatile RAM would be an interesting thing for OS research that could radically change the way (some) OSs are designed; but in the end I suspect that it'll only be used for "hibernate" and persistent VFS caches and there won't be any significant change to OS design.
Other than economics, aren't there also problems with write speeds for most modern non-volatile RAM being considerably slower than read speeds?
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
linguofreak wrote:
Brendan wrote:Offtopic note: economically viable non-volatile RAM would be an interesting thing for OS research that could radically change the way (some) OSs are designed; but in the end I suspect that it'll only be used for "hibernate" and persistent VFS caches and there won't be any significant change to OS design.
Other than economics, aren't there also problems with write speeds for most modern non-volatile RAM being considerably slower than read speeds?
As far as I can tell, Everspin's ST-MRAM is as fast as normal DRAM. Their problem seems to be a lack of motherboard and OS support, leading to a lack of demand, leading to a lack of volume, leading to higher prices and lower densities (90nm process).

Basically it's already a viable alternative to volatile RAM in the technical sense, it's just not a viable alternative in the economic sense.


Cheers,

Brendan
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Post by rdos »

I have a user-mode interface for raw disc read/write operations. For one thing, it is used to change, add, or delete partition tables when installing RDOS on a machine. I also use it for data storage: I have a disc-backed storage class that consists of a list of same-size items, typically C structures. I use fixed sectors (counted from the end of the disc) for this. For this scheme to work, I typically leave a number of sectors at the end of the disc unpartitioned. I feel this is a lot safer, as the data is not lost if the file system becomes corrupt. I plan to implement long-term storage (MID regulations) using this method.
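
A scheme like this could be sketched as follows. This is not RDOS code: the record layout, the `RawRecordStore` class, and the sector counts are all assumptions for illustration, and an in-memory buffer stands in for the raw disc.

```python
import io
import struct

SECTOR_SIZE = 512
RECORD = struct.Struct("<I16sd")   # hypothetical item: id, 16-byte tag, value
RECORDS_PER_SECTOR = SECTOR_SIZE // RECORD.size

class RawRecordStore:
    """Fixed-size records kept in the last `reserved` sectors of a disk."""

    def __init__(self, disk, total_sectors, reserved):
        self.disk = disk
        self.first_sector = total_sectors - reserved
        self.capacity = reserved * RECORDS_PER_SECTOR

    def _offset(self, index):
        # Records never straddle a sector boundary.
        if not 0 <= index < self.capacity:
            raise IndexError(index)
        sector, slot = divmod(index, RECORDS_PER_SECTOR)
        return (self.first_sector + sector) * SECTOR_SIZE + slot * RECORD.size

    def write(self, index, ident, tag, value):
        self.disk.seek(self._offset(index))
        self.disk.write(RECORD.pack(ident, tag, value))

    def read(self, index):
        self.disk.seek(self._offset(index))
        ident, tag, value = RECORD.unpack(self.disk.read(RECORD.size))
        return ident, tag.rstrip(b"\0"), value

# A 64-sector in-memory "disc" with the last 4 sectors left unpartitioned.
disk = io.BytesIO(bytes(64 * SECTOR_SIZE))
store = RawRecordStore(disk, total_sectors=64, reserved=4)
store.write(20, 7, b"meter-a", 12.5)
```

Because every item has the same size, an index maps to a disc offset with plain arithmetic, so no file system metadata is involved in finding a record.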
amn
Posts: 23
Joined: Fri Oct 03, 2008 10:14 am

Post by amn »

Kevin wrote: That's the job of the OS. It usually associates a memory context with a running process, that has a name, creation date, etc. All of this metadata is stored somewhere in RAM, like filesystem metadata is stored on disk.
As far as I know, there is no "name", "creation date" or anything like that associated with a process heap. Perhaps on the more exotic systems, but not on vanilla Linux systems. Such metadata, even though necessary and present to some degree, is an order of magnitude more rudimentary than what a file system has. There is certainly no transaction support or versioning facility for memory heaps, as is often the case with modern file systems.
So what is a file system?
That was a huge part of my whole point - unlike simpler memory allocation and management, a file system, in addition to providing resource security and protection, unfortunately(?) imposes a layer of abstraction for the sake of simplicity. Resource multiplexing and protection are absolutely essential in any multi-user multi-tasking OS, but what we get on top with file systems is not strictly necessary. It's more of a tradition, really. We don't need journaling on disk any more than we need it in memory (which is not to say we don't need it, in fact I think we do).
Can you do without malloc()? I guess so. Would it be a good idea? Probably not. It's the same for using raw disks without a file system.
I didn't say we don't need `malloc()`. My argument was about not needing `open` to require a filename with a path for on-disk data. Allocation is necessary in a multi-user multi-tasking OS; it lets the resource be shared among concurrent clients. The added layer of abstraction dealing with files, however, is not a necessity; it's a luxury, a commodity, a privilege, all depending on your application and yourself. In any case, when I spoke of memory vs disk access, I did not want to focus on allocation, but on access. I mean, why don't we write directly to bytes and bits we allocate on the disk? And why don't we [typically] use filenames, transactions, and journaling when dealing with RAM? :-)
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany

Post by Kevin »

amn wrote:As far as I know, there is no "name", "creation date" or anything like that associated with process heap.
Not with the memory context (it's more than just the heap) itself, but with the process to which the memory context belongs.
Such metadata, even though necessary and present to some degree, is an order of magnitude more rudimentary than what a file system has. There is certainly no transaction support, or versioning facilities for memory heaps, as is often the case with modern filesystems.
Metadata exists in both cases, so it's not a fundamental difference. You could add the missing metadata from the file system to a memory context with no major problems, and vice versa.

And of course some kind of transactions is required on RAM. Basically anything you do for thread synchronisation belongs to that category, starting with simple locks. RCU comes very close to the traditional understanding of transactions. And databases implement the real thing anyway, which obviously includes data in RAM.

One difference in that respect, however, is that RAM is volatile by nature. You don't have to worry about making data in write-back caches persistent (and doing that in the right order), because it simply won't become persistent in RAM. This is certainly one of the points that make file systems complex.
We don't need journaling on disk any more than we need it in memory (which is not to say we don't need it, in fact I think we do).
Journaling is mostly related to the problems with making things persistent. You don't have that problem in memory. What's your use case there? (It's not really required on disks either, but it's a common way to implement things)
My argument was about not needing `open` requiring a filename with a path for on-disk data.
Okay. This is not what defines a file system in my book. Having a directory tree, where each directory has entries that have a human-readable name, is convenient, but not the defining property of a file system. If you took ext2 and removed the directory entries from it, so that you would have to identify files by their inode number, that would be quite inconvenient, but it would still be a file system.

The purpose of a file system is managing which blocks are allocated to which file (where files don't have a fixed size). This is very much like what malloc() is doing for memory.
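
A minimal sketch of that idea, assuming a made-up `TinyBlockFS` class: files are identified only by an inode-style number, there are no directory entries, and the only job is tracking which blocks belong to which file.

```python
class TinyBlockFS:
    """Tracks which blocks belong to which file; files are known only by number."""

    def __init__(self, total_blocks):
        self.free = list(range(total_blocks))   # free block numbers
        self.files = {}                         # inode number -> list of blocks
        self.next_inode = 1

    def create(self):
        """Create an empty file and return its inode number."""
        inode = self.next_inode
        self.next_inode += 1
        self.files[inode] = []
        return inode

    def grow(self, inode, nblocks):
        """Append `nblocks` blocks to a file; files have no fixed size."""
        if nblocks > len(self.free):
            raise OSError("no space left on device")
        for _ in range(nblocks):
            self.files[inode].append(self.free.pop(0))

    def delete(self, inode):
        """Return all of a file's blocks to the free list."""
        self.free.extend(self.files.pop(inode))

fs = TinyBlockFS(total_blocks=8)
a = fs.create()
b = fs.create()
fs.grow(a, 2)
fs.grow(b, 1)
fs.grow(a, 1)   # file `a` is now non-contiguous: blocks 0, 1 and 3
```

Swap "block" for "byte range" and "inode number" for "pointer" and this is close to what a simple heap allocator keeps track of.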
And why don't we [typically] use filenames, transactions, and journaling when dealing with RAM? :-)
We do use names for anything that the user is expected to deal with. Never seen a C struct that had a field char* name? Names are usually not used for purely internal objects, just as file systems typically don't allow access to their metadata by file name.

Transactions, like I said above, are definitely used on RAM; journaling probably as well as one option of implementing transactions.
Developer of tyndur - community OS of Lowlevel (German)
dozniak
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Post by dozniak »

Learn to read.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Post by Owen »

Brendan wrote:As far as I can tell, Everspin's ST-MRAM is as fast as normal DRAM. Their problem seems to be a lack of motherboard and OS support, leading to a lack of demand, leading to a lack of volume, leading to higher prices and lower densities (90nm process).
DRAM is never produced on leading process nodes (insufficient margin in a highly commoditised market); some manufacturers are only just rolling out their 30nm DRAM process (which places them 2 years behind high end bulk semiconductor, e.g. CPUs)

In other words: the gains may not be as great as predicted (Memory on a high end bulk semiconductor process will be expensive no matter which way you slice it)
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
Owen wrote:
Brendan wrote:As far as I can tell, Everspin's ST-MRAM is as fast as normal DRAM. Their problem seems to be a lack of motherboard and OS support, leading to a lack of demand, leading to a lack of volume, leading to higher prices and lower densities (90nm process).
DRAM is never produced on leading process nodes (insufficient margin in a highly commoditised market); some manufacturers are only just rolling out their 30nm DRAM process (which places them 2 years behind high end bulk semiconductor, e.g. CPUs)

In other words: the gains may not be as great as predicted (Memory on a high end bulk semiconductor process will be expensive no matter which way you slice it)
If some DRAM manufacturers are just rolling out 30 nm; then would it be safe to assume that Everspin could switch from a 90 nm process to a 65 nm or 45 nm process (if they had enough volume)?

I'm only going by the articles, which seem to say "die sizes are roughly the same as DRAM for the same capacity chip" and "timings are also comparable to DRAM" and "takes less power to switch and to run" (no refresh).


Cheers,

Brendan
amn
Posts: 23
Joined: Fri Oct 03, 2008 10:14 am

Post by amn »

Guys, please don't hijack the thread with discussions on DRAM. I wish you great success discussing it somewhere else though ;-) Humble regards!
trinopoty
Member
Posts: 87
Joined: Wed Feb 09, 2011 2:21 am
Location: Raipur, India

Post by trinopoty »

Back to topic.
Even if you store data as if it were RAM, you need some way to know where things are. What if a program stores data at LBA 0x2000 and another program later modifies the data, thinking it is its own? Now, if the program that originally wrote the data tries to read it, it gets the wrong data.
You need to keep track of data on disks as well as in RAM. We keep track of data on disk using File Systems. We keep track of data in RAM using various data structures. "File System" is just a term for the mechanism used to keep track of data on disk. Even if you abandon all file systems, you will develop/need a way to keep track of data on disk and everyone else will call it File System. Databases are similar to file systems but they remove some of the unnecessary stuff from common file systems; like creation date, etc, etc. If a database was to be written to a disk or partition, and a driver is installed in the OS to read the database, it will function just like any other file system.

Regards,
Trinopoty
Always give a difficult task to a lazy person. He will find an easy way to do it.
amn
Posts: 23
Joined: Fri Oct 03, 2008 10:14 am

Post by amn »

trinopoty wrote: Back to topic.
What if a program stores data at LBA 0x2000 and another program later modifies the data, thinking it is its own?
I have never seen or heard of anyone being advised to read or write unallocated data. The way it has traditionally been done is that one asks the allocator for memory and gets a pointer back, knowing the extent of the area. The process memory allocator keeps track of what is what in the heap, and another process cannot safely read or write that data - a segmentation fault will be raised in hardware, courtesy of x86 protected mode. If two processes want to share memory they use some form of shared memory API. Now all this of course pertains to the more traditional POSIX systems.

The scenario you described is no more or less applicable to the system I described earlier (where disk is accessed using the same abstraction that memory uses).

Am I missing something?
thepowersgang
Member
Posts: 734
Joined: Tue Dec 25, 2007 6:03 am
Libera.chat IRC: thePowersGang
Location: Perth, Western Australia

Post by thepowersgang »

Probably the best answer to this is "We don't access RAM directly"
There's always some form of high level addressing scheme to memory (of any type). On disk, it's usually a "file system" (a collection of folders containing other folders or files). In memory, it's layers of allocators and the final addresses are stored at certain symbols (e.g. a linked list with a global head pointer).

You can choose to give over an entire disk to an application (use it as a large, fixed-size file), just the same as you can have a file system in RAM. It's a different tool for a different job.

A side note is that each "sorting" method is designed for the medium. File systems are (usually) designed around disks, which have non-zero seek times; RAM layouts don't have to care about that. RAM locations need to be understood and quickly navigated by machine, while disk locations are usually meant to be human readable.
Kernel Development, It's the brain surgery of programming.
Acess2 OS (c) | Tifflin OS (rust) | mrustc - Rust compiler
Currently Working on: mrustc
rdos
Member
Posts: 3276
Joined: Wed Oct 01, 2008 1:55 pm

Post by rdos »

thepowersgang wrote:Probably the best answer to this is "We don't access RAM directly"
There's always some form of high level addressing scheme to memory (of any type). On disk, it's usually a "file system" (a collection of folders containing other folders or files). In memory, it's layers of allocators and the final addresses are stored at certain symbols (e.g. a linked list with a global head pointer).
There is always (?) a physical disc sector layer in the OS, which is used by file systems. If the OS allows applications access to this layer, it is quite possible to treat disc in the same way as RAM. The only difference is that the application needs to call some syscall (or rely on modified bits of pages) in order to write back contents to disc.
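
As a hedged illustration of that last point, Python's `mmap` module can map a file so that it is modified with ordinary slice assignment, with `flush()` as the explicit write-back call. A temporary file stands in for the disc here; mapping a real block device would need appropriate privileges and is not shown.

```python
import mmap
import os
import tempfile

SECTOR_SIZE = 512
SECTORS = 8

# A temporary file stands in for the raw disc device.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, bytes(SECTORS * SECTOR_SIZE))
    with mmap.mmap(fd, SECTORS * SECTOR_SIZE) as disc:
        # Sector 3 is addressed like ordinary memory: plain slice assignment.
        start = 3 * SECTOR_SIZE
        disc[start:start + 5] = b"hello"
        # The explicit write-back step mentioned above.
        disc.flush()
    # Reading through the normal file interface sees the same bytes.
    os.lseek(fd, start, os.SEEK_SET)
    data = os.read(fd, 5)
finally:
    os.close(fd)
    os.remove(path)
```

The application never issues a read or write call for the sector itself; it only touches memory and then asks for the contents to be written back.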