Filesystem for Brendan's OS

Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re:Filesystem for Brendan's OS

Post by Brendan »

Hi,
Candy wrote:You might use the simple&awkward method of shrinking the actual filesystem on the USB stick by X kbytes and stealing that part for your random bits. Then, abuse those bits for checking it's the same stick :).

Alternatively, you can store the bootup code on it, effectively making the computer unusable without it... or for fun, store the password hash stuff on it. That does make the user vulnerable to losing the key though...
There's a place within the boot image (actually within the 16-bit setup code, the 2nd stage boot) where the OS keeps default regional settings (so the OS can be used without the regional database if you like English), the boot menu state and a few other things. This area already contains the cluster name, so I'll probably add the cluster password and security level to it. The cluster name, cluster password and security level would be combined to form the cluster key.
Candy wrote:
Brendan wrote: For the medium and high settings I'll go the additional step of gathering/combining a collection of values from the hardware and using that as part of encrypting the key - I know it won't prevent anyone determined, but it can't hurt (it doesn't matter if the cluster's password needs to be re-entered if the hardware changes).
Can you explain to me how this gives an advantage to security?
Sure - it prevents curious teenagers from opening a boot image and reading the password with a text editor :).
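As a rough illustration of binding the stored password to the hardware, the sketch below hashes a list of hardware-derived values into a fingerprint and uses it to scramble the stored bytes. The value strings, the function names and the XOR keystream are all assumptions made for this example (the thread does not say which values or cipher would be used), and XOR with a hash-derived keystream only keeps the text out of a text editor, nothing more.

```python
import hashlib

def hardware_fingerprint(values):
    """Combine hardware-derived strings into one 32-byte digest."""
    h = hashlib.sha256()
    for value in values:
        data = value.encode("utf-8")
        # Length-prefix each value so ["ab", "c"] and ["a", "bc"] differ.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()

def keystream(key, length):
    """Expand the fingerprint into `length` bytes using SHAKE-256."""
    return hashlib.shake_256(key).digest(length)

def obscure(secret, hw_values):
    """Scramble `secret` (bytes) with a keystream tied to the hardware."""
    stream = keystream(hardware_fingerprint(hw_values), len(secret))
    return bytes(a ^ b for a, b in zip(secret, stream))

# XOR is its own inverse, so the same function reveals the secret again.
reveal = obscure
```

If any of the collected values changes (a replaced NIC, say), `reveal` returns garbage, which matches the behaviour described above: the cluster password simply has to be re-entered.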
Candy wrote:It effectively locks the user to this one computer. You just said something about being able to move the computer elsewhere in the cluster without problem, what if the computer itself dies, in a small part (say, the NIC) which is then replaced? Will the OS no longer boot?
The encrypted information in the boot image (cluster name, cluster password and security level) is used by the OS to form a "cluster key". The cluster key is used to decrypt the native file systems (and probably for encrypting/decrypting networking data between nodes), but has nothing to do with each user and doesn't lock the user to a specific computer. Every computer in the cluster has its own copy of the data used to create the cluster key (so the cluster key is identical for each node, but is never transferred over the network).
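One way to picture "combined to form the cluster key" is a password-based key derivation. The sketch below is only an assumption about how that could look: the use of PBKDF2, the salt layout, the iteration count and the function name are not from the thread.

```python
import hashlib

def derive_cluster_key(name, password, security_level, iterations=100_000):
    """Derive a 32-byte cluster key from the three stored values."""
    # Assumption: the security level and cluster name together act as the
    # salt, so the same password on a different cluster or at a different
    # level gives an unrelated key.
    salt = (security_level + "\0" + name).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"),
                               salt, iterations, dklen=32)

# Two nodes holding the same three values arrive at the same key
# without the key ever crossing the network.
node_a = derive_cluster_key("lab-cluster", "s3cret", "medium")
node_b = derive_cluster_key("lab-cluster", "s3cret", "medium")
```

Because the derivation is deterministic, nothing but the three inputs needs to be present on each node.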

Given that all computers within the cluster use the same cluster key, the configuration of the network can be completely changed (e.g. a laptop can be plugged into a LAN in America or into a LAN in China without any difference), and storage devices containing native file systems can be relocated anywhere within the cluster (e.g. a hard drive can be moved to any other computer in the cluster).

If hardware is changed, a device dies or a new computer is added to the cluster, then a user will need to re-enter the cluster's password. Storing this password on the computer (usually) avoids the need to enter it at boot, so most users never need to know it. Imagine a computer room at a university where computers are booted twice a day by students, or an office where a temporary secretary is filling in for someone else for a few weeks.

[continued]
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Post by Candy »

Brendan wrote: This area already contains the cluster name, so I'll probably add the cluster password and security level to it. The cluster name, cluster password and security level would be combined to form the cluster key.

...

Sure - it prevents curious teenagers from opening a boot image and reading the password with a text editor :).

...

The encrypted information in the boot image (cluster name, cluster password and security level) is used by the OS to form a "cluster key". The cluster key is used to decrypt the native file systems (and probably for encrypting/decrypting networking data between nodes), but has nothing to do with each user and doesn't lock the user to a specific computer. Every computer in the cluster has its own copy of the data used to create the cluster key (so the cluster key is identical for each node, but is never transferred over the network).
Which part of NEVER EVER EVER EVER EVER EVER EVER storing the password didn't you see?

And/or how many EVERs do I need to add?

Don't store the password. The system is locked down for use by just one person or group of persons. Each has his/her own encryption stuff and must remember his/her own key(s). Don't store the password of anybody anywhere.

You can't read it off the disk since... YOU DON'T STORE IT! So don't store it!

Given that you don't store the password, you only need to: A. check that the right one was given, B. use some form of it to decode the global password, C. use that for the rest of the stuff.
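Steps A to C can be sketched as a small key-wrapping scheme: the disk holds a salt, a verifier and the wrapped global key, but never the password. Everything named here (`enroll`, `unlock`, the record layout, PBKDF2, the XOR wrap) is an assumption for illustration; a real design would use an authenticated cipher for the wrap.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for the target hardware

def _stretch(password, salt):
    # Derive 64 bytes: the first 32 only verify the password (step A),
    # the last 32 only unwrap the global key (step B).
    out = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                              salt, ITERATIONS, dklen=64)
    return out[:32], out[32:]

def enroll(password, global_key):
    """Build the on-disk record. The password itself is not part of it."""
    if len(global_key) != 32:
        raise ValueError("this sketch assumes a 32-byte global key")
    salt = os.urandom(16)
    verifier, wrap = _stretch(password, salt)
    wrapped = bytes(a ^ b for a, b in zip(global_key, wrap))
    return {"salt": salt, "verifier": verifier, "wrapped": wrapped}

def unlock(password, record):
    """Return the global key (for step C), or None for a wrong password."""
    verifier, wrap = _stretch(password, record["salt"])
    if not hmac.compare_digest(verifier, record["verifier"]):
        return None
    return bytes(a ^ b for a, b in zip(record["wrapped"], wrap))
```

Each user can have a separate record wrapping the same global key, so changing one user's password does not require re-encrypting anything else.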

Post by Brendan »

[Continued from my last message]
Candy wrote:Hash functions are:
SHA (0, 1, 160, 256, 384, 512)
MD[4, 5]
RSA
Any common encryption function
I've only looked at SHA-512 so far... Which is the best for speed, and which is the best for security?
Candy wrote:Anyway, consider trying *FS ;)
I have some specific (and possibly unusual) requirements. I'm also unsure how much *FS will actually support as it's still mostly in draft form.

Specific requirements for my OS (those that may not be addressed by *FS) would be:
- sparse files (or files with gaping holes in them).
- lazy formatting, where the OS formats additional tracks when it needs to or during idle time (rather than doing all the formatting before the OS uses the file system).
- a "deprecated" flag that affects the entire partition. When this flag is set the OS tries very hard not to store new data on the partition, whilst moving files from the partition to anywhere else during idle time. This is to be used by my OS when a hard drive is being removed from the cluster entirely (to ensure no data is lost, and none is left recoverable on the removed drive).
- versioning. For each file the OS/file system maintains N versions of the data under the same path/directory name. *FS's meta-data doesn't quite suit this purpose as some sectors may be the same in several versions of the data. For example, the file "foo.bar" might use sectors 1, 3 and 5 until a user appends some data to the end of it. Now the file "foo.bar" has the new version of the data in sectors 1, 3, 6 and 9, but the old version of the file's data can still be retrieved from sectors 1, 3 and 5. Then someone comes along and changes something at the start of the file which leaves 3 versions of the data - newest in sectors 11, 3, 6 and 9, the next oldest in 1, 3, 6 and 9, and the oldest in sectors 1, 3 and 5.
- no distinction between files and directories. Instead they are all just "file system entries", where any file system entry can contain more file system entries (like a directory) and/or file data (like a file). For example, the user can do something like 'echo "Hello there" > foo', then 'cd foo' and 'echo "Test" > bar'.
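The versioning requirement in the list above can be modelled as one sector list per version, with unchanged sectors shared between versions. This is a minimal in-memory sketch using the sector numbers from the "foo.bar" example; the class and method names are made up for illustration and say nothing about the on-disk format.

```python
class VersionedFile:
    """Keeps up to `max_versions` sector lists, oldest first."""

    def __init__(self, sectors, max_versions=3):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self.versions = [list(sectors)]

    def commit(self, sectors):
        """Record a new version; drop the oldest beyond the limit."""
        self.versions.append(list(sectors))
        del self.versions[:-self.max_versions]

    def live_sectors(self):
        """Sectors still needed by at least one retained version."""
        return set().union(*self.versions)

foo = VersionedFile([1, 3, 5])
foo.commit([1, 3, 6, 9])    # data appended to the end
foo.commit([11, 3, 6, 9])   # start of the file changed
print(foo.versions)                # [[1, 3, 5], [1, 3, 6, 9], [11, 3, 6, 9]]
print(sorted(foo.live_sectors()))  # [1, 3, 5, 6, 9, 11]
```

A sector can only be freed once no retained version lists it, which is why sector 3 survives every change here while sector 5 becomes free as soon as the oldest version is dropped.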

My OS design distributes files and parts of files (at a higher level) rather than blocks (at a lower level), or to put it another way, the VFS is distributed while each native file system isn't. Further, all native file systems are superimposed on top of each other such that it's impossible to determine which partition/s any file is stored on from its path (unlike *nix where you can follow the mount points, which is faster but more restrictive). This raises concerns as to how the *FS "slices" concept would operate in this environment.


Cheers,

Brendan

Post by Candy »

Brendan wrote:
Candy wrote:Hash functions are:
SHA (0, 1, 160, 256, 384, 512)
MD[4, 5]
RSA
Any common encryption function
I've only looked at SHA-512 so far... Which is the best for speed, and which is the best for security?
For speed, use a real hash function. They've been designed to do just hashing and do that quickly.

For security, use a good cipher such as AES or Serpent with its largest key size for pretty good security.
I have some specific (and possibly unusual) requirements. I'm also unsure how much *FS will actually support as it's still mostly in draft form.

Specific requirements for my OS (those that may not be addressed by *FS) would be:
- sparse files (or files with gaping holes in them).
Supported
- lazy formatting, where the OS formats additional tracks when it needs to or during idle time (rather than doing all the formatting before the OS uses the file system).
How do you imagine formatting works? Setting up the file system is pretty much a single shot operation of about 20-1000 milliseconds, adding more of the disk to it is similar to setting it up... what would you need lazy formatting for?
- a "deprecated" flag that affects the entire partition. When this flag is set the OS tries very hard not to store new data on the partition, whilst moving files from the partition to anywhere else during idle time. This is to be used by my OS when a hard drive is being removed from the cluster entirely (to ensure no data is lost, and none is left recoverable on the removed drive).
We don't have that yet but need it, for slices/disks to be removed from the computer system.
- versioning. For each file the OS/file system maintains N versions of the data under the same path/directory name. *FS's meta-data doesn't quite suit this purpose as some sectors may be the same in several versions of the data. For example, the file "foo.bar" might use sectors 1, 3 and 5 until a user appends some data to the end of it. Now the file "foo.bar" has the new version of the data in sectors 1, 3, 6 and 9, but the old version of the file's data can still be retrieved from sectors 1, 3 and 5. Then someone comes along and changes something at the start of the file which leaves 3 versions of the data - newest in sectors 11, 3, 6 and 9, the next oldest in 1, 3, 6 and 9, and the oldest in sectors 1, 3 and 5.
Supported, not through metadata but through actual versioning. It even uses user-level methods: you define how to do an incremental store and load, and the file system uses your methods to do so. Choose your own diff per type of file.
- no distinction between files and directories. Instead they are all just "file system entries", where any file system entry can contain more file system entries (like a directory) and/or file data (like a file). For example, the user can do something like 'echo "Hello there" > foo', then 'cd foo' and 'echo "Test" > bar'.
That's pretty nice in concept, but *requires* limited file name length or awkward directory file formats. We're still discussing it, but I consider most directories to be specific structures for the actual layout of the FS itself. For the cases where you want this behaviour you can use a normal file type and define a directory interface for it, after which the VFS can use your directory module for reading through it.
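For reference, the "file system entry" idea being discussed here can be pictured as a node that carries both data and named children. The sketch below is an in-memory toy with invented names (`Entry`, `lookup`); it does not address the on-disk format or the name-length concern raised above.

```python
class Entry:
    """A file system entry: may hold data, child entries, or both."""

    def __init__(self):
        self.data = b""
        self.children = {}

def lookup(root, path, create=False):
    """Walk `path` from `root`, optionally creating missing entries."""
    node = root
    for name in filter(None, path.split("/")):
        if name not in node.children:
            if not create:
                raise FileNotFoundError(path)
            node.children[name] = Entry()
        node = node.children[name]
    return node

root = Entry()
# echo "Hello there" > foo
lookup(root, "foo", create=True).data = b"Hello there\n"
# cd foo; echo "Test" > bar
lookup(root, "foo/bar", create=True).data = b"Test\n"
```

After these two writes, "foo" behaves as a file (it has data) and as a directory (it contains "bar") at the same time.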
My OS design distributes files and parts of files (at a higher level) rather than blocks (at a lower level), or to put it another way, the VFS is distributed while each native file system isn't. Further, all native file systems are superimposed on top of each other such that it's impossible to determine which partition/s any file is stored on from its path (unlike *nix where you can follow the mount points, which is faster but more restrictive). This raises concerns as to how the *FS "slices" concept would operate in this environment.
In itself, this is a VFS layer problem. The two things that we probably can't reconcile:

1. Distributed. We haven't taken this on as an idea and probably won't add it at the level of the filesystem itself any time in the future. We can recommend it in the VFS layer, which isn't in our hands.
2. Superimposed. Same story: we think it should be in the VFS layer, over which we don't have any control.

I considered *FS as a replacement for the common filesystems to make the disk an orthogonal device again. You have a "disk" of a certain size and you can split it up, and the parts won't be gone until they're actually used, so you can pretty much try out anything you like and only see it fixed when it's full.

Post by Brendan »

Hi,
Candy wrote:Don't store the password. The system is locked down for use by just one person or group of persons. Each has his/her own encryption stuff and must remember his/her own key(s). Don't store the password of anybody anywhere.

You can't read it off the disk since... YOU DON'T STORE IT! So don't store it!

Given that you don't store the password, you only need to: A. check that the right one was given, B. use some form of it to decode the global password, C. use that for the rest of the stuff.
Perhaps some details of what my OS does during boot might help to clarify things...

First the OS does all the normal things (getting into memory, device detection, etc). Then it loads and starts device drivers for the detected devices [problem #1], and then other code for things like file systems, networking protocols, servers/daemons (cron, http, ftp, samba, whatever) [problem #2].

Then, for each video card/keyboard pair the OS spawns a "user interface thread" which becomes an interface between applications and the user/s (e.g. a computer with 2 video cards and 2 keyboards ends up with 2 separate user interface threads, one for each user). The first thing each user interface thread does is create a login screen. When the user enters their username and password the user interface thread needs to check if the details correspond to anything within the "/etc/passwords" file [problems #3 and #4].

Problem #1 - the device drivers are stored on an encrypted native file system. The OS may need the "cluster key" before it can access the keyboard drivers, the video drivers, etc.

Problem #2 - the servers/daemons may be large and/or use a large amount of data. This data may be distributed across the cluster. A computer won't be able to convince other computers in the cluster that it's part of the same cluster without the "cluster key". To make this work the "cluster key" must not depend on which user (if any) happened to log in - any data sent/received via the network between computers within a cluster must be encrypted and decrypted using the cluster key, but this cluster key shouldn't be sent across the network first. Further, no-one might log in (this is how my Gentoo server works at least - turn it on and walk away and http, ftp, etc. still work).
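One common way for a node to prove it holds a shared key without ever sending it is a challenge-response exchange built on an HMAC. The sketch below assumes that approach (the thread does not specify a protocol) and all function names are invented for the example.

```python
import hashlib
import hmac
import os

def make_challenge():
    """Verifier side: a fresh random nonce for every attempt."""
    return os.urandom(16)

def respond(cluster_key, challenge):
    """Joining node: prove knowledge of the key without revealing it."""
    return hmac.new(cluster_key, challenge, hashlib.sha256).digest()

def verify(cluster_key, challenge, response):
    """Verifier side: recompute the HMAC and compare in constant time."""
    expected = hmac.new(cluster_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Only the challenge and the response cross the network. A fresh challenge per attempt means a recorded response cannot be replayed later.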

Problem #3 - the user interface thread needs to read the latest "/etc/passwords" file, but this file may have been changed during the night (when the computer being started was off). In this case the user interface thread asks the VFS for the data, the VFS looks around to figure out where the latest version is (i.e. it asks each accessible node within the cluster which version it has). Eventually the VFS may end up retrieving the password file from a computer that's on the other side of the world, dragging it through a number of untrusted computers (the internet) on the way.
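The "ask each node which version it has" step might look like the sketch below. It assumes versions are plain integers and that the replies have already been collected and authenticated with the cluster key, since they may have crossed untrusted networks; the function name and node names are made up.

```python
def pick_latest(replies):
    """Return (node, version) for the newest copy, or None if nobody has one.

    `replies` maps a node name to the version number that node reported,
    or to None when the node has no copy of the file.
    """
    candidates = [(version, node) for node, version in replies.items()
                  if version is not None]
    if not candidates:
        return None
    # Highest version wins; equal versions are broken by node name so the
    # choice is deterministic.
    version, node = max(candidates)
    return node, version

replies = {"office-pc": 7, "laptop": 9, "file-server": None}
print(pick_latest(replies))  # ('laptop', 9)
```

The VFS would then fetch the file from the chosen node, encrypted with the cluster key for the trip.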

Problem #4 - the user is a guest who doesn't have any security clearance and isn't mentioned in the "/etc/passwords" file anywhere. The OS gives them minimal access (advertising brochures perhaps) but still allows them to log in.

Now imagine a University with computer rooms - the lecturer going around and typing the cluster password into around 30 computers while the students are standing there waiting to use them. How long would it take before one of the students decides to watch which keys the lecturer types in? IMHO even the students doing business studies would figure it out within 2 weeks (if the lecturer hasn't given up by then).

For clusters that need to be completely secure (and where secure hardware isn't used) forcing the cluster password to be entered before the OS will finish booting is appropriate, but in many cases it's not. Therefore, (IMHO) for these low security cases it may be more appropriate to store the cluster password on disk. How secure the cluster needs to be depends on how secure the data on it needs to be. The OS should also handle "dual boot" where one version is configured to allow the computer to become part of a secure cluster (cluster password manually entered) and the other version gives access to a less secure cluster (all file systems, etc would be segregated due to the different cluster keys).


Cheers,

Brendan