Re: Single source file
Posted: Fri Jul 05, 2013 10:43 pm
Hi,

Combuster wrote:
What occurred to me is that the "file" is basically untyped, just as any data access in assembly lacks types. We can do better than that by starting to make the entire filesystem mime-type aware in some fashion.

If files had types (like variables in a high level language), then your VFS could do automatic type conversion (like casting a float to an int in a high level language). The VFS could use some sort of "file format converter plug-ins"; so that (e.g.) if an application wants to open a word processor file as a picture, the VFS would convert the file and give the application a picture. Of course you'd want to allow the VFS to use multiple file format converter plug-ins - e.g. if you don't have a plug-in to convert a spreadsheet file directly into a picture file, but you do have "spreadsheet file -> word processor file" and "word processor file -> picture" converters, then with a little intelligence built into the VFS it still works.
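As a very rough sketch of what the VFS side could look like (every name here is invented for illustration - a registry of converter plug-ins, plus a lookup for a direct "from type -> to type" converter):

```c
/* Hypothetical converter plug-in registry for a VFS that treats file types
 * as plain MIME-like strings; all names here are made up for this sketch. */
#include <stddef.h>
#include <string.h>

typedef int (*convert_fn)(const void *in, size_t in_len,
                          void **out, size_t *out_len);

struct converter {
    const char *from;     /* type it reads,  e.g. "application/vnd.oasis.opendocument.text" */
    const char *to;       /* type it writes, e.g. "image/png" */
    convert_fn  convert;  /* plug-in entry point */
};

/* Registry of installed converter plug-ins (filled in when plug-ins load). */
struct converter converters[64];
size_t converter_count;

/* Find a plug-in that converts directly from one type to another;
 * if this fails, the VFS could search for a chain of converters instead. */
struct converter *find_converter(const char *from, const char *to)
{
    for (size_t i = 0; i < converter_count; i++) {
        if (strcmp(converters[i].from, from) == 0 &&
            strcmp(converters[i].to, to) == 0)
            return &converters[i];
    }
    return NULL;
}
```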
The main problem is that nothing is designed for this. For example, if you download 1234 files with FTP (using a utility like "wget") then you don't want to make the user manually enter the type of each of those files; and it's extremely difficult (and sometimes impossible) to reliably "auto-guess" the file type using heuristics. To do it well you've got 2 choices - invent replacements for things that mess up the file type (e.g. replace or extend protocols like FTP and HTTP, file systems like FAT and ISO9660, archive file formats like zip and tar, etc. so that each file's type is preserved) or make it extremely easy to reliably "auto-guess" what type a file is (e.g. define a standard header that includes the file's type to avoid all guess-work). Sadly, if you really want to improve things, sometimes you need to break compatibility with something.
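For the "standard header" option, a minimal sketch of what could be prepended to every file (the magic value, field names and sizes are all made up here):

```c
#include <stdint.h>

/* Hypothetical on-disk header stored at the start of every file, so the
 * type never has to be guessed; all fields are invented for this sketch. */
#define TYPED_FILE_MAGIC 0x45505954u   /* "TYPE" in little-endian ASCII */

struct typed_file_header {
    uint32_t magic;       /* TYPED_FILE_MAGIC - marks a file with a known type */
    uint32_t version;     /* header format revision */
    char     type[120];   /* NUL-terminated type string, e.g. "text/plain" */
    /* the file's actual data follows immediately after this header */
};
```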
The next thing to consider is performance. If a file is often converted to a different type, it'd be really nice if the VFS would automatically cache the converted file in the file system and skip the conversion. For example, native file systems could be designed so that each file has an original file type and the original file's data, plus zero or more extra types with extra file data. Of course if the original file is modified you'd discard all of the file's extra types and extra data (forcing any conversions to be done again if/when they're needed next).
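Per file, that might look something like this on the native file system (again just a sketch with invented names - a real implementation would keep extents on disk rather than pointers):

```c
#include <stddef.h>

/* Hypothetical per-file metadata, assuming the native file system stores
 * the original data plus zero or more cached conversions. Invented names. */
struct cached_variant {
    char   type[120];              /* type this variant was converted to */
    void  *data;                   /* converted data */
    size_t length;
};

struct typed_file {
    char                  type[120];      /* original type */
    void                 *data;           /* original data */
    size_t                length;
    struct cached_variant variants[8];    /* cached conversions */
    size_t                variant_count;
};

/* Any write to the original data makes every cached conversion stale. */
void invalidate_variants(struct typed_file *f)
{
    for (size_t i = 0; i < f->variant_count; i++)
        f->variants[i].data = NULL;        /* free/discard in a real system */
    f->variant_count = 0;
}
```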
Now think about "single source file". You could have a networked file system (e.g. something like NFS) running on a server, and a file called "foo" containing the source code for an application. When computer #1 asks the server to read the file "foo" as a "32-bit 80x86 native application" file type, the server would use a "file format converter plug-in" to compile the source file (if it's not already cached) and computer #1 automatically gets a file it can execute natively; and when computer #2 on the network asks the server to read the same file "foo" as a "64-bit ARM native application" file type it also gets a file it can execute natively. An office could have a network of 25 disk-less computers all connected to the same file server and everything would automatically work seamlessly (even if all the disk-less computers have very different types of CPUs in them).
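From a client's point of view that could be as simple as the fragment below; vfs_open_as() and the type strings are assumptions for this sketch, not anything that exists:

```c
/* Hypothetical client-side view: every machine opens the same file on the
 * server, asking for its own native executable type; the server compiles
 * "foo" (or returns a cached build) and the client executes the result.
 * vfs_open_as() and the type strings are invented for this sketch. */
int vfs_open_as(const char *path, const char *as_type);  /* assumed VFS call */

int open_foo_for_this_machine(void)
{
#if defined(__aarch64__)
    /* ARM machines ask for an ARM build of "foo"... */
    return vfs_open_as("/net/server/apps/foo", "application/x-native.arm64");
#else
    /* ...while 80x86 machines ask for an 80x86 build of the same file */
    return vfs_open_as("/net/server/apps/foo", "application/x-native.i386");
#endif
}
```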
Of course compiling from "plain text" source code has a few problems - it's slower (e.g. tokenising, parsing, grammar checking and some pre-optimisation to convert the source into some form of "intermediate representation" for the back-end to work with) and it creates problems for companies that want to publish applications as closed source. What you want is some sort of "portable byte-code" file format; then the VFS can have a "source -> byte-code" converter and "byte-code -> native executable" converters (and chain them together when needed). That way a company can provide "portable byte-code" instead of giving away their source code, and for people who do provide source code you'd avoid a reasonable amount of overhead when the same source is compiled for several different targets.
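Chaining converters is just a path search over the registry from the first sketch - e.g. a breadth-first search from the file's original type to the requested type (again, invented names and a toy fixed-size queue, nothing more):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chain search over the converter registry from the earlier
 * sketch: breadth-first search from the original type to the requested type,
 * so "source -> byte-code" and "byte-code -> native executable" plug-ins get
 * composed automatically when there's no direct converter. Invented names. */
typedef int (*convert_fn)(const void *in, size_t in_len,
                          void **out, size_t *out_len);
struct converter { const char *from, *to; convert_fn convert; };
extern struct converter converters[64];
extern size_t converter_count;

/* Writes the converter indices to run (in order) into 'chain'.
 * Returns the chain length, or -1 if no conversion path exists. */
int find_chain(const char *from, const char *to, size_t *chain, size_t max)
{
    size_t queue[64], prev[64];          /* converter index + predecessor slot */
    size_t head = 0, tail = 0;
    unsigned char enqueued[64] = {0};    /* one flag per converter, breaks cycles */

    /* Seed the queue with every converter that accepts the original type. */
    for (size_t i = 0; i < converter_count; i++) {
        if (strcmp(converters[i].from, from) == 0) {
            enqueued[i] = 1;
            queue[tail] = i;
            prev[tail++] = (size_t)-1;
        }
    }

    while (head < tail) {
        size_t slot = head++;
        size_t c = queue[slot];

        if (strcmp(converters[c].to, to) == 0) {
            /* Walk the predecessor links back to recover the chain... */
            size_t len = 0;
            for (size_t n = slot; n != (size_t)-1 && len < max; n = prev[n])
                chain[len++] = queue[n];
            /* ...then reverse it into the order the converters must run. */
            for (size_t i = 0; i < len / 2; i++) {
                size_t t = chain[i];
                chain[i] = chain[len - 1 - i];
                chain[len - 1 - i] = t;
            }
            return (int)len;
        }

        /* Enqueue every converter that can consume this converter's output. */
        for (size_t i = 0; i < converter_count && tail < 64; i++) {
            if (!enqueued[i] && strcmp(converters[i].from, converters[c].to) == 0) {
                enqueued[i] = 1;
                queue[tail] = i;
                prev[tail++] = slot;
            }
        }
    }
    return -1;   /* no path from 'from' to 'to' with the installed plug-ins */
}
```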
The only other thing we'd need to figure out is what such an amazingly advanced OS should be called. I suggest we call it "a mere sub-set of Brendan's Clustering Operating System" because this is only really part of my plan...
Cheers,
Brendan