Unconventional design: no processes and files?

unpredictable
Posts: 3
Joined: Wed Oct 10, 2007 11:39 pm
Location: Bangalore, India

Unconventional design: no processes and files?

Post by unpredictable »

Hi all,

I'm quite new to OS development as such, but I've been reading the osdev wiki and other sources for about five months now. Before beginning the real implementation of my OS, I really want to first work out a good design for all its aspects.

Some thoughts keep occurring to me, like: what if we had an OS without processes/threads and files? Instead of all that, we'd have some other mechanism to get the job done. A BAD replacement would be to have only objects, some of which hold data and others functions, executing in turn when required to fulfil the purpose. Though this is a very bad alternative.

Anyway, the point is, I want opinions on whether having processes, pipes, files, etc. (common to all *nix platforms and pretty much Windows too) is a hard necessity given currently available hardware (like the Intel 80386). Or is it that all OSes follow the same design because of the de facto standard set by Unix?

It is very difficult to even imagine something other than files for storing data like MP3s or even C code. But still, I wish we could debate whether there are alternatives to files too: some other way of storing information, and some other way of running the whole system.

For further reading, see my own wiki on all of it: http://myos.scribblewiki.com/Brainstorms

Comments?
_UnPrEdictAbLe_
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Post by AJ »

Hi,

You have obviously been thinking hard about osdev - designing before implementing is a good plan; a lot of people (myself included) have fallen in the past due to lack of design.

On the topic of processes/threads: as it stands, I think we are pretty much stuck with them on the x86 at the moment. Have a look at this topic, which you may find interesting. The current tasking system on the x86 (TSSs, whether you use a multiple or single TSS system) is built in at a fairly fundamental level from the privilege-control point of view.

As far as files are concerned, do you mean a sort of Database File System? Again - have a look at this topic. It seems to me that this is simply another level of abstraction built into the kernel (not saying that's necessarily a Bad Thing). The reason most OSes seem to use the concept of files internally is that it works. If you try to get rid of it without a well-designed alternative, you are in danger of re-inventing the wheel - inventing a 'File System' in everything but name.

The other reason so many hobby OSes follow the *nix way of doing files is that it shortens the time from the start of the project to going self-hosting, since you can simply port an existing compiler to your OS (with or without an additional abstraction layer).

Sorry for my stick-in-the-mud opinions on this(!), but I think that at the moment, file systems and processes/threads are the best we can have for doing the job on current hardware.

Cheers,
Adam
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

unpredictable wrote:Anyway, the point is, I want opinions on whether having processes, pipes, files, etc. (common to all *nix platforms and pretty much Windows too) is a hard necessity given currently available hardware (like the Intel 80386). Or is it that all OSes follow the same design because of the de facto standard set by Unix?
I wouldn't say so. Unix has pretty medieval roots, and its paradigms come from those.

Basically, what a processor does is execute code, and the OS manages what code needs to run. In multitasking we usually have several tasks working simultaneously; Unix simply calls each such task a process and is done with it.
However, you can also consider the processor as a unit connected to memory that stores instructions. At this lower level you can control the address spaces provided by the MMU and load an arbitrary number of 'tasks' into each one. Doing that efficiently means dropping processes in the Unix sense, since a Unix process owns its own address space; instead you provide address spaces and execution threads as separate primitives (see the sketch below).
Somewhere down this forum is a topic on continuation-based systems, which store not the current state of a task but what is left to be done. AJ already posted a link.
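A minimal sketch of that split - all names are hypothetical, not from any real kernel:

Code: Select all

#include <stdint.h>

/* Address spaces and threads as independent kernel objects:
 * any number of threads can share one address space. */
struct address_space {
    uintptr_t page_directory;  /* root of this space's page tables */
    int       refcount;        /* threads currently living here */
};

struct thread {
    struct address_space *space;  /* many threads may share one space */
    uintptr_t             stack_top;
    uintptr_t             ip;     /* saved instruction pointer */
    struct thread        *next;   /* scheduler run-queue link */
};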

Files are a metaphor for the boxes of paper in an office. They are a useful abstraction when dealing with data, but hardware devices are hardly pieces of paper, although *nix systems pretend they are.
An exokernel would not provide files but disk blocks. Many systems would still build the file abstraction on top of that, but nothing forces you to. The reason files are so hard to do without is that they are already close to a minimal set: a bunch of data plus the information attached to it.
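Roughly, the interface an exokernel might expose instead - a hypothetical sketch, not any particular exokernel's API:

Code: Select all

#include <stdint.h>

#define BLOCK_SIZE 4096

/* The kernel hands out raw disk blocks; "files" are a library concern. */
int blk_alloc(uint64_t *lba_out);              /* claim a free block */
int blk_free(uint64_t lba);                    /* release it */
int blk_read(uint64_t lba, void *buf);         /* read BLOCK_SIZE bytes */
int blk_write(uint64_t lba, const void *buf);  /* write BLOCK_SIZE bytes */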
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
mystran
Member
Posts: 670
Joined: Thu Mar 08, 2007 11:08 am

Post by mystran »

I'm struggling to build a fully event-driven system, where almost nothing will ever block. That includes things like connect() or file I/O, which are normally thought of as "near instant".

In a sense it's a "return to roots" design... a program running on bare hardware never has to stop and wait for something unless the programmer decides there is nothing else to do... in my OS, I want to leave that choice to the applications.

If you think about it, that's not as trivial as it sounds. It means, for example, that you can't complete an open() call synchronously, because it might have to go to disk (or worse, the network) to resolve the filename you gave it. So instead of returning the descriptor (or error code) as the return value of the request, the same info has to be sent as an asynchronous event afterwards.
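To sketch the idea (names invented for illustration, not my real API):

Code: Select all

#include <stdint.h>

typedef uint64_t req_id_t;

struct event {
    req_id_t id;      /* which request this event completes */
    int      result;  /* descriptor on success, negative error code */
};

/* Queues a name-resolution request and returns immediately;
 * the descriptor (or error) arrives later as an event. */
req_id_t open_async(const char *path, int flags);

/* Fetch the next completion event (blocking, or a polling variant). */
int next_event(struct event *out);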

The hope is to create a system where it's possible to write software that can always cancel what it's currently doing if the user so desires, instead of having to wait for some arbitrary timeout - and, on the other hand, to eliminate as many arbitrary timeouts as possible.
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

mystran wrote:It means, for example, that you can't complete an open() call synchronously, because it might have to go to disk (or worse, the network) to resolve the filename you gave it. So instead of returning the descriptor (or error code) as the return value of the request, the same info has to be sent as an asynchronous event afterwards.
I've been thinking along the same lines. But if you are, say, an MP3 player, and you've just been given the command to play an MP3 file, there's little use in not waiting for the result of the open call. Making open always asynchronous (and not wrapping it in some higher-level abstraction that makes it synchronous) doesn't make programming life easier, I think.
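For instance, the kind of synchronous wrapper I mean, built on the hypothetical open_async()/next_event() sketched above:

Code: Select all

#include <stdint.h>

typedef uint64_t req_id_t;
struct event { req_id_t id; int result; };
req_id_t open_async(const char *path, int flags);  /* hypothetical, see above */
int next_event(struct event *out);                 /* hypothetical, see above */

/* Issue the request, then simply wait for its completion event. */
int open_sync(const char *path, int flags)
{
    struct event ev;
    req_id_t id = open_async(path, flags);

    do {
        if (next_event(&ev) < 0)
            return -1;           /* event delivery failed */
        /* a real program would queue unrelated events, not drop them */
    } while (ev.id != id);

    return ev.result;            /* descriptor or negative error code */
}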


JAL
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Re: Unconventional design: no processes and files?

Post by jal »

unpredictable wrote:A BAD replacement would be to have only objects, some of which hold data and others functions, executing in turn when required to fulfil the purpose. Though this is a very bad alternative.
An object is nothing more than an abstraction as well; there's nothing inherently bad about it. Since users do not like being presented with random bits, you must have some form of abstraction to present to them. If you are clear about that, you can design your FS either as a direct implementation of that abstraction, or as a simple block allocator with your abstraction built on top.


JAL
unpredictable
Posts: 3
Joined: Wed Oct 10, 2007 11:39 pm
Location: Bangalore, India

Post by unpredictable »

I agree, AJ, that the conventional way will cut a lot of the time needed to make things work, and will also help the developer get support from other programmers, since it'll be familiar to them. It'll also make it very convenient to port applications and get a lot of fun into the OS quickly.

But here I am not trying to act like a hobby osdever chasing a my-os-also-runs-firefox feeling. Because people know that a hobby *nix-based OS works, everyone can make one. Nor am I trying to be like Microsoft or Red Hat, who stick to their user base by never doing anything entirely different.

Yeah, Combuster, it seems FILES are a very basic concept that can't be done away with. But here I am asking: why do we want to deal with the concept of data and metadata at all? Why not something else?

mystran comes really close to my thinking in the realm of making things event-driven. Well, yeah, almost all hardware is event-driven, so why is software not? Why do we want to sequence code when actually every instruction occurs on a clock tick? Having a non-blocking OS is a really good concept, but I am looking for something even more abstract and not tied to the KNOWN concepts.

Yeah, an object is an abstraction over processes, that's true. That's why I said it's a bad replacement. I don't yet have an idea what the entities in this new design would be.

Also, another good concept is related to 'chroot's on *nix platforms. Extending the concept, we could have all applications running in separate roots (which could even be virtual roots that don't exist in the filesystem).
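Something like this, using the standard chroot(2)/chdir(2) calls (the jail path is up to the caller; a real design would also drop privileges afterwards):

Code: Select all

#include <stdio.h>
#include <unistd.h>

/* Confine the calling application to its own private root.
 * Requires root privileges to call chroot(). */
int enter_private_root(const char *jail)
{
    if (chroot(jail) == -1) { perror("chroot"); return -1; }
    if (chdir("/") == -1)   { perror("chdir");  return -1; }
    return 0;  /* from here on, only files under `jail` are visible */
}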

To give you an example: why should firefox be able to see my MP3 files or important documents? And what has mplayer to do with my mailbox data? If applications support a limited number of file formats, and only a few formats are shared among applications, we could store files in such a way that each application sees only what it wants/processes.

Basically, I wanted to say that we might keep the 'file' concept, but the 'filesystem' concept - with everything visible to everyone (given read permission) - is an idea that can be tweaked. I still don't know whether this lets us do away with 'directories'.

For the 'process' concept I still haven't got a replacement; it seems irreplaceable for now. But whatever thoughts keep coming to me, I'll keep sharing with you guys.

Cheers!
_UnPrEdictAbLe_
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

unpredictable wrote:But here I am asking: why do we want to deal with the concept of data and metadata at all? Why not something else?
I really don't understand - "data" is a name for a stream of numbers. Users want to put numbers in and get those same numbers out again. The entire point of a computer is data manipulation - what could you possibly replace it with?

It seems to me that you're trying to abstract beyond all recognition, and imho that can only end badly. If you came up with some *examples*, I may think differently. :)
stevenup7002
Member
Posts: 60
Joined: Tue Sep 20, 2005 11:00 pm
Location: Ireland

Post by stevenup7002 »

It's good to see that you're thinking hard about osdev, but this is practically the way most computers have run since 1939. I just think we're not ready for what you've described; it sounds a bit like something we'll see with quantum computing, who knows. It's a great idea, though - just hard to imagine.
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

unpredictable wrote:Basically, I wanted to say that we might keep the 'file' concept, but the 'filesystem' concept - with everything visible to everyone (given read permission) - is an idea that can be tweaked. I still don't know whether this lets us do away with 'directories'.
Sure - directories are only for hierarchical storage, based on a tree. In the OS I'm currently designing, files do exist as collections of data, but instead of directories I have 'name spaces': each 'file' can be in one or more name spaces, and processes are limited to certain name spaces depending on various criteria. I totally agree with you that it is bogus for a media player to be able to access your word-processor documents.
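A minimal sketch of the idea (the names and the bit-mask representation are invented for illustration; the real design differs in detail):

Code: Select all

#include <stdbool.h>
#include <stdint.h>

/* Up to 64 name spaces, tracked as a bit mask. */
struct file {
    uint64_t spaces;   /* bit i set => file is in name space i */
    /* ... data, metadata ... */
};

struct process {
    uint64_t spaces;   /* name spaces this process was granted */
};

/* A process sees a file iff they share at least one name space. */
static bool can_see(const struct process *p, const struct file *f)
{
    return (p->spaces & f->spaces) != 0;
}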


JAL
elfenix
Member
Posts: 50
Joined: Sun Dec 02, 2007 1:24 pm
Libera.chat IRC: elfenix
Location: United States

Post by elfenix »

Ok, I'm trying to understand the railing against "processes" and "threads" as an abstraction...

Outside of the x86 world, designs like the one you are discussing exist everywhere. Back in the days of DOS, Windows 3.1 did not have preemptive multitasking - all your apps were event-driven, and while they waited for events the OS would regain control and let someone else run. It worked about as well as could be expected at the time, and then Microsoft moved on, for good reason. This sort of design has its place, and is still used a lot in the embedded world.

Given that multi-core CPUs are becoming standard now, you have no choice as an operating-system designer BUT to have some form of threads/processes - even if it is just one dedicated thread/process per CPU. Add into the mix that a lot of the time you spend on the CPU is 'waiting' for hardware to do stuff. What do you do while your MP3 player waits on disk I/O? You switch to the GUI to update the display? *Wham* - you just invented non-preemptive multitasking. Most kernels can operate in this mode - for example, Linux! Just call sched_setscheduler for each 'process' you start and set the scheduling policy to SCHED_FIFO... (Be warned, you will have problems if you try to multitask this way.)
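A minimal sketch of that trick on Linux (needs root or CAP_SYS_NICE; the priority value is arbitrary):

Code: Select all

#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* Equal-priority SCHED_FIFO tasks run until they block or yield:
     * cooperative multitasking among themselves. */
    struct sched_param param = { .sched_priority = 1 };

    if (sched_setscheduler(0 /* this process */, SCHED_FIFO, &param) == -1) {
        perror("sched_setscheduler");
        return 1;
    }

    /* ... do work, then hand over the CPU voluntarily when idle ... */
    sched_yield();
    return 0;
}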

Unless you want to perform only one task on a computer, you're going to have multitasking. Arguing against the 'threads/processes' abstraction in such an environment is like saying you want to eat a vegetarian beef steak... You can argue against preemptive multitasking, but then you're rather ignoring the pre-1995 time frame of single-core CPUs. If you start looking at a massively parallel system (say, 128 cores), the idea of non-preemptive threading starts to become interesting again...
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Post by bewing »

I like files and processes, so I will play devil's advocate:

All Turing Universal Computers do the same thing. They take data, and either logically or physically manipulate it using "sets of rules", to produce a new set of data. The sets of rules are called programs; data and programs end up stored in some form of media.

Also, abstractions always add inefficiency. If you have a "good" abstraction, the computational inefficiency it introduces is "worth it" from the programmer's or user's point of view. For example, programming languages have proven themselves more valuable than the inefficiency they add over the minimal representation of a program that would do the task.

There are some specific abstractions of files that are also worth their overhead: word-processing documents, spreadsheets. But I have never seen an abstraction covering all files that is worth the inefficiency. Users and programmers understand files. I have never seen an abstraction of files that people understand even better.

And my OS goes in exactly the opposite direction from mystran's. In my OS, it is expected that almost every single task on the entire machine is blocked almost all the time. Timeslices are given only to the very few that actually have something to do.

And as far as "hiding" word-processing docs from an MP3 player goes - it really isn't that hard to implement a protection scheme based on the file type and some kind of ID code for the program (one the program cannot change). The OS can restrict any program to seeing only those files that the user/superuser has allowed that program to detect. It is a nearly identical concept to access protection based on a userID.
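A sketch of such a check (the IDs and table entries are made up):

Code: Select all

#include <stdbool.h>

struct grant { int prog_id; int file_type; };

/* Filled in by the user/superuser: which program may detect
 * which file type. The entry below is purely illustrative. */
static const struct grant allowed[] = {
    { 7, 3 },   /* e.g. program 7 (mp3 player) -> type 3 (audio) */
};

/* Nearly identical to a userID check, keyed on program ID instead. */
static bool may_detect(int prog_id, int file_type)
{
    for (unsigned i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
        if (allowed[i].prog_id == prog_id &&
            allowed[i].file_type == file_type)
            return true;
    return false;
}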
elfenix
Member
Posts: 50
Joined: Sun Dec 02, 2007 1:24 pm
Libera.chat IRC: elfenix
Location: United States

Post by elfenix »

bewing wrote:I have never seen an abstraction of files that people understand even better.
Oh really?

I think you have.

I think you are using it right now.

See, when you get on the web you don't think "I'm going to grab the file on server osdev.org" - no, you just click a bookmark or a link, or type in the URL - which isn't a 'file' but a name that pulls up a graphical interface to a set of files.

The vast majority of files on your computer, you never actually see as files. All the icons on your desktop for instance... Not really files...

To a regular user, would the technical fact that a picture named "Baby's First Step" is a file matter? Or would it only matter that they can access the picture named "Baby's First Step"? For all they know, the entire photo album on their computer is stored as one glob of data...

Now, on a technical level, you need some way to associate that glob of data on the disk with the name "Baby's First Step". But how that data is presented to the user? Or even a programmer? Well, that's a very different story.
bewing
Member
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Post by bewing »

You have a good argument, but I think I disagree with its essence. I'm not sure how many n00b technophobes you have ever introduced to computers... but in my experience many of them pick up the idea of "files" a lot faster than they pick up the idea of "nested hyperlinks". Of course, it takes a lot of time and patience to get to either one. :shock:
In fact, they tend to over-generalize "objects" as being files.


And on a windoze desktop, all those little icons are .lnk files. :wink:
jal
Member
Posts: 1385
Joined: Wed Oct 31, 2007 9:09 am

Post by jal »

Fate wrote:Unless you want to perform only one task on a computer, you're going to have multitasking. Arguing against the 'threads/processes' abstraction in such an environment is like saying you want to eat a vegetarian beef steak...
A process or a thread is an abstraction, both technically and functionally: technically, because you assign each thread/process a 'state'; functionally, because they perform separate logical tasks. However, although the technical abstraction has its merits (the Turing machine, having a set of registers, etc.), the functional abstraction needn't. For example, if you have a set of components, each able to perform certain tasks, you can functionally see them as a set of tasks executed in parallel (and on a multiprocessor system they may very well be), isolated from the others except where they interact. The fact that the underlying technology is a continuous save/restore of state, with timeslices and the like, is functionally not interesting. Also, the difference between a thread and a process (i.e. threads share their memory, processes are isolated from one another) is a technical matter, and from a functional point of view not interesting. One could implement all threads as processes (as the original Unix did), without any state sharing, but this would add giant overhead: instead of checking some global variable, each object's state would have to be queried.


JAL