Hi,
Colonel Kernel wrote:Brendan wrote:Of course I should probably mention that nobody really uses extended attributes, because (for example) users have better things to do than make up thousands of descriptions for thousands of pictures that their wife doesn't know about. Basically extended attributes sound like a great idea until you try to figure out where the data for these extended attributes is meant to come from.
I like your examples!
In practice, people are more patient than you think. Even so, it helps when the source of metadata can be automated somehow. For example, CDDB can be used to automatically download metadata for music. I'm sure a similar database service exists for movies. I believe that a real "killer app" would be an AI that is able to take a few example photos and auto-tag the rest by learning what people and places look like, and what clues in a picture suggest an approximate date (e.g. snow, leaves, holiday decorations, sunny beaches, etc.). It wouldn't have to be exact, just good enough to save people from doing most of the grunt work themselves.
If the metadata can be obtained automatically, then it could be fetched on demand and not stored anywhere. In that case the extended attributes would merely cache the metadata (e.g. to reduce the time it takes to obtain it) rather than adding anything new.
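To make that concrete, here's roughly what a cached lookup could look like (a minimal sketch assuming a Linux-style xattr API; fetch_metadata_from_service() is a made-up stand-in for something like a CDDB lookup):

Code:
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper: fetch the metadata from some online service. */
extern ssize_t fetch_metadata_from_service(const char *path,
                                           char *buf, size_t size);

ssize_t get_description(const char *path, char *buf, size_t size)
{
    /* Try the cached copy in the extended attribute first. */
    ssize_t len = getxattr(path, "user.description", buf, size);
    if (len >= 0)
        return len;                     /* cache hit */

    /* Cache miss - fetch the metadata and cache it for next time. */
    len = fetch_metadata_from_service(path, buf, size);
    if (len >= 0)
        setxattr(path, "user.description", buf, (size_t)len, 0);
    return len;
}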
I'm also wondering how you'd avoid problems with context. For example, a friend takes some photos with a phone and sets the metadata to "My family on holiday", and then sends these pictures to you (including the metadata). Now you've got pictures that say "My family" when it's not your family at all. There's also a major internationalization issue here - e.g. metadata in one language isn't going to be useful to someone who doesn't understand that language.
mbluett wrote:Brendan wrote:Usually people who say they want to use a database instead of a file system only really want to change the terminology.
Your assumption that I want a "normal filesystem" would be incorrect. I have a very comprehensive understanding of filesystems and how they work. I also understand how they can be viewed as a database. However, in comparison to a true database there are some significant differences.
I am actually proposing the use of a "real" database and a database engine to drive it. Essentially there would be one file (or potentially several files that make up the various table structures comprising the database). Inside this database would exist all kinds of different information, not necessarily stored in the form of conventional files.
I'm proposing that a plain old boring file system (e.g. FAT) *is* a kind of database, and that the code that supports a plain/boring file system is a kind of database engine.
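To make the comparison concrete, here's a FAT directory entry written as a C struct - a fixed-size 32-byte record with typed fields, which is exactly what a row in a database table is (field layout follows the FAT specification; the packed attribute is GCC-specific):

Code:
#include <stdint.h>

/* A FAT directory entry: a fixed-size 32-byte record, i.e. a "row". */
struct fat_dir_entry {
    uint8_t  name[11];          /* 8.3 file name      - a string field  */
    uint8_t  attr;              /* attribute flags    - a bitmask field */
    uint8_t  nt_reserved;
    uint8_t  create_time_tenth;
    uint16_t create_time;       /* creation time/date - date fields     */
    uint16_t create_date;
    uint16_t last_access_date;
    uint16_t first_cluster_hi;
    uint16_t write_time;
    uint16_t write_date;
    uint16_t first_cluster_lo;  /* "pointer" to the file's data         */
    uint32_t file_size;         /* file size in bytes - an integer field */
} __attribute__((packed));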
I'm looking for things that make your database different to a normal filesystem, and reasons why it's better.
mbluett wrote:Databases are made up of table structures that contain fields of specific types and that typically have specific attributes. The database engines of today typically use SQL to retrieve from and store information to a database.
You could add an "SQL style query" front-end to any normal file system - it changes how people find the files, not the way those files are stored.
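As a rough sketch of what I mean (the query syntax and parser are imaginary - the point is that the "engine" behind something like SELECT path FROM files WHERE name LIKE '%errata%' can be an ordinary directory walk):

Code:
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <string.h>
#include <ftw.h>

/* The WHERE clause, hard-coded for the sketch; a real front-end
 * would build this predicate by parsing the query string. */
static const char *pattern = "errata";

static int match(const char *path, const struct stat *sb,
                 int type, struct FTW *ftwbuf)
{
    if (type == FTW_F && strstr(path + ftwbuf->base, pattern))
        printf("%s\n", path);   /* a "row" in the result set */
    return 0;                   /* keep walking */
}

int main(void)
{
    /* "SELECT path FROM files WHERE name LIKE '%errata%'" becomes
     * a recursive walk over the existing hierarchical file system. */
    return nftw("/info", match, 16, FTW_PHYS);
}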
mbluett wrote:1. To get rid of the notion of having to understand how to traverse a hierarchical filesystem. Files stored in folders, stored in folders, etc. The concept is usually easy to grasp. The problem is that the general public have great difficulty determining where to find a file. In addition, they typically can't remember what the file was called even if they wanted to do a search. So, they must search within files to find what they are looking for.
I'm the opposite - a hierarchical filesystem helps me find exactly what I'm after. For example, for OS development information I've got a large directory called "info" that contains hundreds of files, but this directory has many subdirectories (e.g. "video", "network", "CPU", etc). Using the hierarchical filesystem, if I want to find some errata for Pentium 4 CPUs I can find it fast because it's in the "/info/CPU/Intel/Pentium4/errata" directory. Without a hierarchical filesystem I'd be forced to use searches, which can be slow but, more importantly, aren't very precise - for example, maybe I use the word "netburst" in my SQL query and find nothing because I forgot that the metadata calls these files "Pentium 4" instead.
Of course adding an "SQL style query" front-end to a normal file system (and using indexing for the metadata to speed it up) would provide the best of both methods - people who aren't smart enough to store their files in an organized directory structure could open the "search" dialog box, try a few different searches until they get the query right, and then try to find the file they actually want in the list of files the query shows them (and smart people can just go directly to the file they want in their organized directory structure).
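The indexing part doesn't need to be exotic either - e.g. a sorted keyword table mapping metadata terms to file paths (a toy sketch with made-up paths; a real indexer would build this by scanning extended attributes and persist it on disk):

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One index entry: a metadata keyword and the file it points at. */
struct entry {
    const char *keyword;
    const char *path;
};

/* A toy index with made-up contents. */
static struct entry index_table[] = {
    { "errata",    "/info/CPU/Intel/Pentium4/errata/p4_errata.txt" },
    { "netburst",  "/info/CPU/Intel/Pentium4/overview.txt" },
    { "pentium 4", "/info/CPU/Intel/Pentium4/overview.txt" },
};

static int cmp(const void *a, const void *b)
{
    return strcmp(((const struct entry *)a)->keyword,
                  ((const struct entry *)b)->keyword);
}

int main(void)
{
    size_t n = sizeof(index_table) / sizeof(index_table[0]);
    struct entry key = { "netburst", NULL };

    /* Sort once when the index is built... */
    qsort(index_table, n, sizeof(struct entry), cmp);

    /* ...then each query is a fast binary search instead of a scan. */
    struct entry *hit = bsearch(&key, index_table, n,
                                sizeof(struct entry), cmp);
    printf("%s\n", hit ? hit->path : "(no match)");
    return 0;
}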
mbluett wrote: A query is passed to the database engine (usually in the form of SQL); the engine then implements the query and returns the result. Granted, as a database grows in size the searches can take longer. However, think of the searches you do via Google: the responses are very fast, and the Google database is probably huge.
Google is very fast because they're using over 450,000 servers in parallel. You enter a query and the front-end sends that query to lots of computers; each of these computers searches a tiny fraction of the database and returns its results, then the results from all of the computers are combined. Unfortunately, most people don't have that many computers (and even if they did, they wouldn't want to keep them running all the time just so they can find files).
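The trick is scatter-gather parallelism. Here's a toy sketch of the idea using threads instead of servers - each worker searches its own fraction of the "database" and the partial results are combined at the end:

Code:
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NWORKERS 4

/* A stand-in "database": each worker searches its own shard. */
static const char *shards[NWORKERS][2] = {
    { "kernel notes",     "video timings" },
    { "pentium 4 errata", "network cards" },
    { "fat layout",       "acpi tables" },
    { "usb controllers",  "netburst pipeline" },
};

static const char *query = "errata";

static void *search_shard(void *arg)
{
    int shard = *(int *)arg;
    for (int i = 0; i < 2; i++)
        if (strstr(shards[shard][i], query))
            return (void *)shards[shard][i];    /* partial result */
    return NULL;
}

int main(void)
{
    pthread_t workers[NWORKERS];
    int ids[NWORKERS];

    /* Scatter: every worker searches a fraction of the data... */
    for (int i = 0; i < NWORKERS; i++) {
        ids[i] = i;
        pthread_create(&workers[i], NULL, search_shard, &ids[i]);
    }

    /* ...gather: combine the partial results. */
    for (int i = 0; i < NWORKERS; i++) {
        void *hit;
        pthread_join(workers[i], &hit);
        if (hit)
            printf("found: %s\n", (const char *)hit);
    }
    return 0;
}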
mbluett wrote:2. To create more flexibility in being able to make simple additions to the descriptive information that can be stored with files in today's OSs. For example, if I wanted to add a descriptive field to indicate which application created a file it would be a trivial matter with a database. However, with a conventional filesystem, it would involve re-compiling and testing the new changes. And what about backward compatibility issues?
This can be done with the extended attributes that most OSs already support. You don't need to recompile anything just to add a new attribute (although on Linux almost everyone would need to enable support for extended attributes and recompile the kernel first, because this sort of thing is so useful that almost nobody bothers enabling it). Windows/NTFS always supports it.
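For example, on Linux (with xattr support enabled) tagging a file with the application that created it is a single call - nothing recompiled, no change to the file system format (the attribute name "user.application" is just something I made up, which is exactly the problem described next):

Code:
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
    /* Attach a brand-new attribute to an existing file; nothing
     * needs recompiling and the on-disk format is unchanged.
     * The name "user.application" is made up for this example. */
    const char *value = "my_application_name";
    if (setxattr("report.doc", "user.application",
                 value, strlen(value), 0) != 0) {
        perror("setxattr");
        return 1;
    }
    return 0;
}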
The problem with existing systems is that most applications don't support extended attributes, there are often no tools to index and/or search for files based on information in the extended attributes, and there are no standards to say how extended attributes should be used. For example, I could write an application for Windows or Linux that creates an "application = my_application_name" extended attribute, other applications might create a "program = my_application_name" extended attribute, and some applications might set a "my_application_name = yes" attribute; and nobody will know what to search for. Of course the same can happen with your database - e.g. my application creates a field in the database called "application" and sets this field to "my_application_name" for its files, while other people's applications create different fields for the same purpose, and end users have no idea which database field(s) they need to search in their query.
To avoid that problem you'd need some sort of specification that describes "standard fields" and their intended purpose (so all software uses the database fields in a compatible way); but if you're going to do that, you might as well just add the "standard fields" information to the directory entries (e.g. filename, file size, file permission flags, file owner, file description, keyword list for the file, application that created the file, application that last modified the file, etc).
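To illustrate, a directory entry with those "standard fields" built in might look something like this (a hypothetical layout, not any real file system's format):

Code:
#include <stdint.h>

/* Hypothetical directory entry with the "standard fields" built in. */
struct dir_entry {
    char     name[256];          /* file name                         */
    uint64_t size;               /* file size in bytes                */
    uint32_t permissions;        /* file permission flags             */
    uint32_t owner;              /* file owner                        */
    char     description[128];   /* human-readable file description   */
    char     keywords[128];      /* comma-separated keyword list      */
    char     created_by[64];     /* application that created the file */
    char     modified_by[64];    /* application that last modified it */
};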
Cheers,
Brendan