Posted: Wed Dec 19, 2007 4:15 am
by jal
bewing wrote:but in my experience many of them pick up on the idea of "files" a lot faster than they pick up the idea of "nested hyperlinks".
Yes, that's the trade-off between real-world metaphors and abstractions without a real-world counterpart. Still, a photograph is a photograph and a text document is a text document. That both, on a lower level, are one and the same entity is not clear (and should not be clear) to a novice computer user.
JAL
Posted: Thu Dec 20, 2007 11:50 pm
by elfenix
bewing wrote:You have a good argument, but I think I disagree with the essence. I'm not sure how many n00b technophobes you have ever introduced to computers ... but in my experience many of them pick up on the idea of "files" a lot faster than they pick up the idea of "nested hyperlinks". Of course, it takes a lot of time and patience to get to either one.
In fact, they tend to over-generalize "objects" as being files.
And on a windoze desktop, all those little icons are .lnk files.
I taught high school computer science before I grew a brain and decided to actually make money... (Ok, it was only a brief time, but enough to meet my fair share of n00bs.)
I think your response here speaks more to my points. A n00b doesn't need to understand what a nested hyperlink is, or how it works, to browse the web. They just click, and up comes a new page. A file requires you to have more knowledge of the abstract idea going on. Thus web navigation is a lot more powerful.
Touché on Windows.
Posted: Fri Dec 21, 2007 1:32 am
by AndrewAPrice
I posted somewhere else about my idea of files not being so linear. E.g. files have a way of storing a name, permissions, dates, etc., but in traditional file systems these just wrap around a 'main' stream of data.
My idea is that there is no 'main' stream of data. A file will be a collection of streams.
For example, an image file could contain the following streams:
Name: name of the file
Date Created
Date Modified
Permissions
Image Compressor: e.g. JPEG
Image Quality: e.g. 60%
Image Width: e.g. 1024
Image Height: e.g. 768
Image Description: e.g. A picture of a house.
JPG Stream: Raw JPEG stream goes here..
You could have type inheritance. E.g. a base file type (which holds the name, permissions, dates, etc.). A picture file inherits from the base type and then stores the width, height, description, etc. And JPG inherits from picture.
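A rough sketch of how that inheritance could look, in Python just for brevity. The class and field names (BaseFile, PictureFile, JpegFile, streams) are made up for illustration and are not from any real file system:

```python
from dataclasses import dataclass


@dataclass
class BaseFile:
    """Streams that every file carries (hypothetical base type)."""
    name: str
    permissions: int = 0o644
    date_created: str = ""
    date_modified: str = ""

    def streams(self) -> dict:
        # Every field is exposed as a named stream; there is no 'main' one.
        return dict(vars(self))


@dataclass
class PictureFile(BaseFile):
    """Adds the streams common to all pictures."""
    width: int = 0
    height: int = 0
    description: str = ""


@dataclass
class JpegFile(PictureFile):
    """Adds the JPEG-specific streams."""
    quality: int = 60
    jpg_stream: bytes = b""


house = JpegFile(name="house.jpg", width=1024, height=768,
                 description="A picture of a house.", jpg_stream=b"\xff\xd8")
print(sorted(house.streams()))
```

Anything that only understands the base type can still read the name and permissions streams of a JPG, which is the point of the inheritance.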
Posted: Fri Dec 21, 2007 11:44 am
by mystran
MessiahAndrw wrote:
For example, an image file could contain the following streams:
Name: name of the file
Date Created
Date Modified
Permissions
Image Compressor: e.g. JPEG
Image Quality: e.g. 60%
Image Width: e.g. 1024
Image Height: e.g. 768
Image Description: e.g. A picture of a house.
JPG Stream: Raw JPEG stream goes here..
I think that's the type of "metadata in the small" that people always concentrate on, and that's more or less useless. Things like image sizes, even the creator of the file, or whatever, are best stored in the file data itself. Most file formats have headers for all that info anyway, and it works fine, and you don't need to parse the whole file anyway, just grab the header and see what's in there. This kind of data applies to all copies of the file. Sure, there might be copies with more/less data around, but generally speaking if you copy a file somewhere else, this kind of metadata is still valid.
Things like name of the file, permissions, dates... these are things that are relevant to this particular copy of the file. There are tons of reasons to have the same file in different places with different names or permissions, so they aren't really part of the file itself, rather just something that describes this particular copy. If you wanna add something here, add MIME type. It's really part of the file itself (all the copies), but how do you tell the type if you don't know how to read the file?
The type of metadata that I'd add if I wanted to mess with file storage concepts is relations between files. Say, it would be nice to have the filesystem know that 'dog94t.jpg' is a thumbnail of picture 'dog94.jpg', or that looking at 'ontherun.mp3' the next track is 'time.mp3' and the previous track is 'speaktomebreath.mp3'. Apps can do that kind of stuff right now, but they'll have to maintain their own indexes for that. Or how about more dates? What if I have 'projects2007-11.xsl' and I know it's missing something, and I'd like a list of all files that I modified in November 2007, whether or not I've also modified them afterwards. Actually, keep the old versions as well, so I can look at what those files looked like on December the 1st, but let me search the web pages (full text please, I don't need those from google.com, thanks) I browsed last month, 'cos I solved a problem and I need to do it again but forgot the solution. Or how about a listing of all files that were installed by the same program that installed 'foobar.exe'? All the files created by 'foobar.exe'?
That kind of stuff can be useful in practice, but unfortunately it isn't exactly trivial to do.
[edit] Oh and caching stuff from the file itself (like some of what you listed) is a waste of effort if you store it where the file is. Better to store it in a separate index, so you can search by that stuff. When you have the file already, you can just as well look into its internal headers.
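To make the "separate index" idea concrete, here is a minimal sketch of a relation index kept apart from the file data. Everything in it is an assumption for illustration: the RelationIndex class, the relation names, and the files 'setup.exe' and 'foobar.dll' are invented; only the other file names come from the examples above.

```python
from collections import defaultdict


class RelationIndex:
    """Hypothetical index of (file, relation) -> related files."""

    def __init__(self):
        self._forward = defaultdict(set)
        self._reverse = defaultdict(set)

    def add(self, subject, relation, obj):
        # Store both directions so either side can be searched.
        self._forward[(subject, relation)].add(obj)
        self._reverse[(obj, relation)].add(subject)

    def objects(self, subject, relation):
        return sorted(self._forward.get((subject, relation), ()))

    def subjects(self, obj, relation):
        return sorted(self._reverse.get((obj, relation), ()))


index = RelationIndex()
index.add("dog94t.jpg", "thumbnail-of", "dog94.jpg")
index.add("ontherun.mp3", "next-track", "time.mp3")
index.add("setup.exe", "installed", "foobar.exe")
index.add("setup.exe", "installed", "foobar.dll")

# "All files installed by the same program that installed 'foobar.exe'"
installer = index.subjects("foobar.exe", "installed")[0]
print(index.objects(installer, "installed"))
```

The hard part in practice is not the lookup but keeping such an index up to date as files are copied, renamed and deleted, which this sketch ignores entirely.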
Posted: Fri Dec 21, 2007 4:29 pm
by Ready4Dis
jal wrote:mystran wrote:It means for example, that you can't complete an "open()" call synchronously, because it might have to go to disk (or worse, network) in order to resolve the filename you gave it. So instead of returning the descriptor (or error code) as a return value of the request, the same info has to be sent as an asynchronous event afterwards.
I've been thinking along the same ways. But if you are, say, an MP3 player, and you've been just given the command to play an MP3 file, there's little use in not waiting for the result of the Open call. Making the Open call always asynchronous (and not wrapping it up in some higher level abstraction implementation making it synchronous) doesn't make programming life easier, I think.
JAL
Jal, imagine this: You have an application that wants to open an MP3, then play it once it's found/open and ready to be read. You create a callback function or similar (message parameter/value) so your program is notified when it is ready, rather than having a loop keep polling to see if it's ready, or having the application wasting clock cycles while it's waiting on something. It would simply exit and not do anything until it receives the message/callback. Basically, you turn open() into something that isn't blocking (similar to asynchronous sockets). So, rather than calling open and stalling (pretty much the same thing), you send the request to the kernel to open the file and wait for a response. The benefit is, if the user wants to cancel this action because something hung up, or the file is huge and not loading fast enough, whatever the case is, you could simply click cancel, notify the kernel you want to cancel your open request, and go along with your merry life. If you had a typical open() command, your program is sitting there waiting, not processing anything (although there are ways around this, there is no good way to break out of the open command until it completes). So, turning something as simple as open() into an asynchronous event means the user can interact a bit more easily with the application. Now think about something more complex like read(), or opening a device like the CD-ROM, which has to spin up and could take a bit. Sure, most of these events are handled by the kernel to be non-blocking, but the application is still using them as blocking and getting none of the benefits.
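Here is a small sketch of what a cancellable, callback-based open could look like from the application side. It is only a simulation: the "kernel" is a timer thread, and OpenRequest, open_async and the "handle:" strings are invented names, not any real OS API.

```python
import threading


class OpenRequest:
    """Hypothetical handle for a pending open(); not a real OS API."""

    def __init__(self, path, on_done):
        self.path = path
        self._on_done = on_done
        self._lock = threading.Lock()
        self._settled = False
        self._timer = None

    def _settle(self):
        # Only the first of cancel/complete wins the race.
        with self._lock:
            if self._settled:
                return False
            self._settled = True
            return True

    def cancel(self):
        """Returns True if the request was cancelled before it completed."""
        if not self._settle():
            return False
        if self._timer is not None:
            self._timer.cancel()
        self._on_done(self.path, None, "cancelled")
        return True

    def _complete(self, handle):
        if self._settle():
            self._on_done(self.path, handle, None)


def open_async(path, on_done, delay=0.05):
    """Start a simulated slow open and return immediately."""
    request = OpenRequest(path, on_done)
    request._timer = threading.Timer(delay, request._complete,
                                     args=("handle:" + path,))
    request._timer.daemon = True
    request._timer.start()
    return request


results = []


def on_done(path, handle, error):
    results.append((path, handle, error))


pending = open_async("ontherun.mp3", on_done, delay=5.0)
pending.cancel()  # the user clicked Cancel before the open finished
print(results)
```

The application thread never blocks in open_async, so it stays free to handle the Cancel click; the callback fires exactly once, either with a handle or with the "cancelled" error.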
Posted: Fri Dec 21, 2007 7:07 pm
by mystran
Exactly. If you have an open() call pending, it does no good to require another GUI thread in order to be able to cancel the request. You either get the complexity by requiring another thread for it, or by making it work with a callback. I believe that it's better to make the ABI work by messages that libraries can turn into callbacks, such that it's possible to build either single- or multi-threaded solutions as desired at application level. The current state of things on major operating systems is basically to require a thread that will wait until some arbitrary timeout, and if the user cancels the whole thing you'll have to live with the fact that you have a thread that will (hopefully) eventually time out, and go on with other stuff hoping that the whole mess will clean up after itself.
It's not like it's hard anyway to kludge a blocking open()-like call on top of a callback system... but I don't think that's the way one should design modern applications.
And it's not like it's especially hard to deal with callbacks either. Even with plain C it's a rather simple (if somewhat tedious) thing to do. C++ makes it a lot easier by allowing objects to handle the callbacks as just another method. Anything with closures makes it almost trivial. For functional languages one doesn't really even need to care. Sure you might have to think of how you read your files.. but it's not rocket science.
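For what it's worth, the "blocking call kludged on top of callbacks" part really is short once you have closures. A sketch, where open_async is a stand-in I made up for a callback-based open (the "handle" is just a string):

```python
import threading


def open_async(path, on_done):
    """Stand-in for a callback-based open; the 'handle' is just a string."""
    worker = threading.Thread(target=lambda: on_done("handle:" + path, None))
    worker.start()


def open_blocking(path, timeout=None):
    """Blocking open() layered on top of the callback version."""
    done = threading.Event()
    result = {}

    def on_done(handle, error):
        # The closure captures 'result' and 'done' from the caller's frame.
        result["handle"], result["error"] = handle, error
        done.set()

    open_async(path, on_done)
    if not done.wait(timeout):
        raise TimeoutError("open timed out: " + path)
    if result["error"] is not None:
        raise OSError(result["error"])
    return result["handle"]


print(open_blocking("time.mp3"))
```

Going the other way, from a blocking-only primitive to something cancellable, is what needs the extra thread.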
Posted: Fri Dec 21, 2007 8:25 pm
by AndrewAPrice
My OS passes a struct when requesting VFS, FS and IO calls. The struct contains the request information like the type of request (read, write, rename, create, etc), buffer, length, offset, etc. The struct also contains a bool: WakeThreadOnCompletion, which wakes the thread sending the request upon completing the action so that you can send a thread to sleep to save some CPU during a big request. There is also a Status member. If this is 0, then the request is not yet complete. If this is >0 the request was successful, <0 the request failed (the exact value of Status says why the request failed).
My end user library wraps all this up into neat blocking/non-blocking IO requests.
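Roughly like this, translated into Python for illustration. The field names follow the description above, but DISK, submit, read_blocking and the exact status codes are placeholders, not the actual code from my OS:

```python
import threading
from dataclasses import dataclass, field

DISK = b"hello, world"  # stand-in for a device


@dataclass
class IoRequest:
    kind: str                      # "read", "write", "rename", ...
    buffer: bytearray
    length: int
    offset: int = 0
    wake_thread_on_completion: bool = False
    status: int = 0                # 0 = pending, >0 = success, <0 = failure
    _wake: threading.Event = field(default_factory=threading.Event, repr=False)


def submit(request):
    """Hand the request to a worker (the 'kernel') and return at once."""
    def work():
        data = DISK[request.offset:request.offset + request.length]
        if len(data) < request.length:
            request.status = -1    # placeholder code: read past the end
        else:
            request.buffer[:len(data)] = data
            request.status = 1
        if request.wake_thread_on_completion:
            request._wake.set()
    threading.Thread(target=work).start()


def read_blocking(offset, length):
    """Library wrapper: sleep until the request completes."""
    request = IoRequest("read", bytearray(length), length, offset,
                        wake_thread_on_completion=True)
    submit(request)
    request._wake.wait()
    if request.status < 0:
        raise OSError(request.status)
    return bytes(request.buffer)


print(read_blocking(7, 5))
```

A non-blocking wrapper would call submit() without the wake flag and look at status later.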
Posted: Fri Dec 21, 2007 8:36 pm
by Ready4Dis
MessiahAndrw wrote:My OS passes a struct when requesting VFS, FS and IO calls. The struct contains the request information like the type of request (read, write, rename, create, etc), buffer, length, offset, etc. The struct also contains a bool: WakeThreadOnCompletion, which wakes the thread sending the request upon completing the action so that you can send a thread to sleep to save some CPU during a big request. There is also a Status member. If this is 0, then the request is not yet complete. If this is >0 the request was successful, <0 the request failed (the exact value of Status says why the request failed).
My end user library wraps all this up into neat blocking/non-blocking IO requests.
Yes, but the point was that you don't put the thread to sleep. If the thread is sleeping, you can't click a cancel button, or cancel the request in the middle of a transfer, and constantly checking a status member sounds like a lot of wasted CPU cycles. I'm not saying I 100% agree with it, but I do understand what he was talking about. It would remove the need to poll a variable, or put threads to sleep and not be able to cancel a request in the middle of an action. I know I can't stand hitting a cancel button and waiting 3 minutes for it to acknowledge that I hit it! Nor do I like when the cancel button is greyed out because the operation can't be stopped in the middle. Now, if there was a nice setup where my application was still receiving input, and could continue running a loop 'if' it needed/wanted to (or it could put itself to sleep and wait for the input), then it would solve a lot of problems (and cause a few more of course, like what to do if the process sends a cancel request while the kernel/action requested is trying to give the process its answer; you may have to have a lock somewhere or similar so they both can't do that at the same time). Anyways, it's just food for thought. It sounds like a reasonable idea; the only way to test it is for someone to implement it and see if it works. I know I would prefer having more control over the flow and stalls in my applications, so while I'm, say, waiting for the kernel to load a file, I can sit there and run my loading screen loop without having to make a system call and poll for a result (making tons of wasted context switches). There are a ton of times/reasons you could use this. It's not limited to file I/O operations; it can be useful for many other reasons (networking is a big one).
Posted: Fri Dec 21, 2007 9:26 pm
by AndrewAPrice
Ready4Dis wrote:Yes, but the point was that you don't put the thread to sleep. If the thread is sleeping, you can't click a cancel button, or cancel the request in the middle of a transfer, and constantly checking a status member sounds like a lot of wasted CPU cycles. I'm not saying I 100% agree with it, but I do understand what he was talking about. It would remove the need to poll a variable, or put threads to sleep and not be able to cancel a request in the middle of an action. I know I can't stand hitting a cancel button and waiting 3 minutes for it to acknowledge that I hit it! Nor do I like when the cancel button is greyed out because the operation can't be stopped in the middle. Now, if there was a nice setup where my application was still receiving input, and could continue running a loop 'if' it needed/wanted to (or it could put itself to sleep and wait for the input), then it would solve a lot of problems (and cause a few more of course, like what to do if the process sends a cancel request while the kernel/action requested is trying to give the process its answer; you may have to have a lock somewhere or similar so they both can't do that at the same time). Anyways, it's just food for thought. It sounds like a reasonable idea; the only way to test it is for someone to implement it and see if it works. I know I would prefer having more control over the flow and stalls in my applications, so while I'm, say, waiting for the kernel to load a file, I can sit there and run my loading screen loop without having to make a system call and poll for a result (making tons of wasted context switches). There are a ton of times/reasons you could use this. It's not limited to file I/O operations; it can be useful for many other reasons (networking is a big one).
You don't have to constantly check it, and besides, if I wanted a callback interface I could implement it in my library.
Posted: Sat Dec 22, 2007 6:15 am
by Ready4Dis
MessiahAndrw wrote:Ready4Dis wrote:Yes, but the point was that you don't put the thread to sleep. If the thread is sleeping, you can't click a cancel button, or cancel the request in the middle of a transfer, and constantly checking a status member sounds like a lot of wasted CPU cycles. I'm not saying I 100% agree with it, but I do understand what he was talking about. It would remove the need to poll a variable, or put threads to sleep and not be able to cancel a request in the middle of an action. I know I can't stand hitting a cancel button and waiting 3 minutes for it to acknowledge that I hit it! Nor do I like when the cancel button is greyed out because the operation can't be stopped in the middle. Now, if there was a nice setup where my application was still receiving input, and could continue running a loop 'if' it needed/wanted to (or it could put itself to sleep and wait for the input), then it would solve a lot of problems (and cause a few more of course, like what to do if the process sends a cancel request while the kernel/action requested is trying to give the process its answer; you may have to have a lock somewhere or similar so they both can't do that at the same time). Anyways, it's just food for thought. It sounds like a reasonable idea; the only way to test it is for someone to implement it and see if it works. I know I would prefer having more control over the flow and stalls in my applications, so while I'm, say, waiting for the kernel to load a file, I can sit there and run my loading screen loop without having to make a system call and poll for a result (making tons of wasted context switches). There are a ton of times/reasons you could use this. It's not limited to file I/O operations; it can be useful for many other reasons (networking is a big one).
You don't have to constantly check it, and besides, if I wanted a callback interface I could implement it in my library.
You do if you want to do something while waiting, unless you have a way to notify the application that it is completed; that was the point. I am not saying whether it's a better idea than the current one or not, just trying to help clarify reasons for doing it. There are plenty of downsides to doing it in a purely asynchronous way. For example, each call to the kernel will have to spawn a new thread so your program can continue while the kernel is processing your request (unless you have a kernel task that can only do one thing at a time with a message queue of sorts, which can run into its own problems with sorting priorities, things that rely on each other to happen, etc.). It doesn't HAVE to be a callback interface, that was just an example of how to implement it, but it's more efficient to let the kernel/driver handle this at a MUCH lower level than the application to minimize wasted cycles in certain cases. The general point was that functions would be non-blocking for the most part, allowing your application to do whatever it wants while waiting, without constantly checking status. It's pretty much the same idea as asynchronous sockets in Windows: rather than using the blocking calls, or starting a new thread so you aren't blocked in your main thread, you just wait for Windows to send you a message saying data has arrived, done sending, connected, disconnected, etc. and handle them when they happen without constantly checking if something happened. In your method, it could work similarly: the user presses open, it searches for the file, the user gets bored and clicks cancel, your app receives the event from wherever and processes it, sends a cancel request or simply ignores anything returned afterwards. That is good and works in a similar fashion (which was the original idea). The implementations don't have to be identical, but can your application put itself to sleep until the operation is completed? This is another benefit of not having it at the application level: wake-up on completion.
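The message-driven version of that scenario can be sketched like this. It is a toy: the "kernel" is a timer that posts into the application's queue, and kernel_open, the message names and 'big.iso' are all invented for the example.

```python
import queue
import threading

messages = queue.Queue()  # the application's single inbox


def kernel_open(path, request_id, delay):
    """Pretend kernel: posts a completion message instead of blocking."""
    def finish():
        messages.put(("open-done", request_id, "handle:" + path))
    threading.Timer(delay, finish).start()


def run_app():
    log = []
    cancelled = set()
    kernel_open("big.iso", request_id=1, delay=0.2)
    messages.put(("click-cancel", 1, None))  # the user gets bored right away
    while True:
        # The thread sleeps here until a message arrives; no polling.
        kind, request_id, payload = messages.get()
        if kind == "click-cancel":
            cancelled.add(request_id)
            log.append("cancelled")
        elif kind == "open-done":
            if request_id in cancelled:
                log.append("ignored late result")
            else:
                log.append("opened " + payload)
            return log


print(run_app())
```

Because UI events and I/O completions arrive through the same queue, the application handles them one at a time, and a completion that shows up after a cancel is simply dropped.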
Posted: Wed Dec 26, 2007 1:59 am
by iammisc
I haven't read all the above posts so this might be off-topic and if it is, kindly ignore it and head on.
Anyway, something that I was thinking of instead of processes per se is a type of highly componentized design. Basically, a component can either convert data, display data, or edit data. Each component can support one format only. However, even though it only supports one format, it can still read other formats because there are other components that can convert between these formats. Notice, here that data is an abstract entity. IMO, data should be kept with the application that first wrote it like an embedded machine. But in this design, components, which take the place of applications, can be combined together by the user and so they can share data if the user allows it.
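To show what I mean by components chaining through converters, here is a toy sketch. The formats and converter functions are invented for the example; the idea is just that a viewer for one format can show other formats if some chain of converters leads to it.

```python
from collections import deque

# Hypothetical converter components: (source format, target format) -> function.
CONVERTERS = {
    ("markdown", "html"): lambda text: "<p>" + text + "</p>",
    ("html", "plain"): lambda text: text.replace("<p>", "").replace("</p>", ""),
    ("plain", "upper"): str.upper,
}


def find_chain(source, target):
    """Breadth-first search for the shortest list of converters."""
    frontier = deque([(source, [])])
    seen = {source}
    while frontier:
        fmt, chain = frontier.popleft()
        if fmt == target:
            return chain
        for (src, dst), convert in CONVERTERS.items():
            if src == fmt and dst not in seen:
                seen.add(dst)
                frontier.append((dst, chain + [convert]))
    return None  # no combination of components can do it


def view_as(data, source, target):
    """Run the data through each converter in the chain, in order."""
    chain = find_chain(source, target)
    if chain is None:
        raise ValueError("no conversion from " + source + " to " + target)
    for convert in chain:
        data = convert(data)
    return data


print(view_as("hi", "markdown", "plain"))
```

Each converter only knows two formats, and the system works out the chain.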
There are probably many problems in this design which I haven't thought of yet. But then again, that is why I'm posting it here. So tell me what you guys think.
I'll be waiting.
Posted: Thu Dec 27, 2007 12:24 pm
by unpredictable
Yeah, well most people have expressed concern over removing the 'threads/processes' idea, which is so fundamental to multi-tasking OSes. Some of you guys have suggested single-tasking systems like DOS and embedded OSes.
I am really not talking about doing away with 'multi-tasking' as such. But I am talking about having different types of entities than processes. Mostly, processes are like a machine code, linked to other shared libs, executing in their own space.
Yeah, there'll be machine code and yeah, it'll preempt on events, if that's what you mean by preserving the idea of 'processes'. But I want to change the idea of a process = an executable file. Instead, a process could be multiple machine-code components, talking to each other and processing data. Something similar to what iammisc mentioned.
Though I don't fully agree with his idea either, and I'm still formulating mine, but my question really is "Can we do away with the idea of processes as single executables and have some other way of doing things in a multitasking environment?"
Regarding files, I'm adamant on the idea of having 'namespaces' for applications as jal mentioned. The biggest problem with this design is, which application will see what? With the outburst of information, it's really not possible for me to manually apply settings to each file on my 320 + 80 GB disks. Hence, namespaces go for a toss. I'm still looking for a better idea, and I'd like opinions from the more experienced people regarding doing away with 'directory structures' and maybe one day even 'Files'.
Regards,
Posted: Tue Jan 08, 2008 3:41 am
by Ready4Dis
unpredictable wrote:
Though I don't fully agree with his idea either, and I'm still formulating mine, but my question really is "Can we do away with the idea of processes as single executables and have some other way of doing things in a multitasking environment?"
Nothing says a process needs to be a single executable; the way I think about a process is as a task that is running in memory. There is no reason why you couldn't, say, load 5 files from disk, link them in memory and run them. There are plenty of cases where a program requires a library to run because it has its functions. It's very common in game programming to put your game engine in a .dll and your game in the executable; even though it's still one 'process', it is separated (and sometimes they break it down really far and provide .dll interfaces for input, sound, graphics, all of which can be replaced with whatever since the interface is the concern, so you could use DirectX for sound, then change to OpenAL, or D3D for graphics and replace it with a software rasterizer or GL renderer). A process need not be a single executable, but can be a group of them that provide the necessary functions.
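The "interface is the concern" part, as a tiny sketch. The backend classes here are dummies I made up; they don't call DirectX or OpenAL, they just show that the game only depends on the interface.

```python
from typing import Protocol


class SoundBackend(Protocol):
    """The only thing the game knows about a sound module."""
    def play(self, clip: str) -> str: ...


class DirectXSound:
    # Dummy stand-in for a module built on DirectX.
    def play(self, clip: str) -> str:
        return "directx:" + clip


class OpenALSound:
    # Dummy stand-in for a module built on OpenAL.
    def play(self, clip: str) -> str:
        return "openal:" + clip


class Game:
    def __init__(self, sound: SoundBackend):
        self.sound = sound

    def fire(self) -> str:
        return self.sound.play("laser.wav")


print(Game(DirectXSound()).fire())
print(Game(OpenALSound()).fire())
```

Swapping the module changes nothing in Game, which is the same reason the .dll can be replaced without touching the executable.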
Posted: Sat Jan 12, 2008 2:12 pm
by SpooK
The real question is, can you provide an abstraction that is better/faster/lighter than the concept of processes and threads without reducing functionality, stability or programmability???
Posted: Sun Jan 13, 2008 12:54 am
by AndrewAPrice
There was another thread which discussed whether threads were the best abstraction of concurrency. I believe yes, because his thread mostly described scheduling (how he had to keep two separate lists of threads, etc.) and not the concept of a thread in itself (multiple 'threads' of execution existing within the same memory space or process).
The "no files" reminded me of an idea where I thought a single file or program could take up a whole disk with no file system (it was loaded directly from offset 0).