
To thread or not to thread...

Posted: Thu May 29, 2008 4:58 pm
by iammisc
I'm developing a microkernel and I've finished the VFS server. As of right now, the VFS server's responsibility is to parse file paths, perform lookups and other relatively high-level stuff. It delegates the actual filesystem work to other filesystem servers. The VFS is currently synchronous and single-threaded, meaning that a process can't read a file on a separate filesystem while another process's request is being served. I was just wondering whether I should multithread the VFS server, and I would like to know what other microkernels do and why. Are most microkernel VFS servers threaded or not?

Posted: Thu May 29, 2008 6:33 pm
by Colonel Kernel
I would guess that VFS servers for other microkernels would at least have a thread for each CPU core in the system. This is purely a hunch though -- I haven't looked into it yet.

Re: To thread or not to thread...

Posted: Thu May 29, 2008 7:39 pm
by Brendan
Hi,
iammisc wrote:I'm developing a microkernel and I've finished the VFS server. As of right now, the VFS server's responsibility is to parse file paths, perform lookups and other relatively high-level stuff. It delegates the actual filesystem work to other filesystem servers. The VFS is currently synchronous and single-threaded, meaning that a process can't read a file on a separate filesystem while another process's request is being served. I was just wondering whether I should multithread the VFS server, and I would like to know what other microkernels do and why. Are most microkernel VFS servers threaded or not?
A VFS server should be asynchronous.

For each operation (open, close, read, write, etc) the VFS should build a "request structure", where the request structure has certain states. When the VFS receives a request it does anything necessary to put the request into the first state before going back to handling other requests. Sooner or later something will happen that allows the VFS to take the request to the next state (and then the next state, and the next state, etc), until eventually the request is completed.

For example, the request structure for the "open" operation might have 2 states:
  - waiting for the corresponding file system to say if the file exists or not
  - waiting for the corresponding file system to create the file being opened
When the VFS receives a request to open a file it might build the request structure, find out which file system corresponds to the file being opened and send a "does this file exist already?" request to that file system (which leaves the "open" request in the first state); and then handle other requests while it waits for a response from the file system.

When the response from the file system arrives the VFS might be able to complete the operation if the file exists, or it might send a "create the new file" request to the file system (which leaves the "open" request in the second state); and then handle other requests while it waits for a response from the file system. When the next response from the file system arrives the VFS would be able to complete the operation (create a file handle, tell the thread that opened the file what happened, destroy the request structure, etc).
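
The two-state "open" flow described above could be sketched roughly as follows. This is only an illustration in Python, not code from any real microkernel: the names `OpenRequest`, `Vfs`, `send_to_fs` and `on_fs_reply`, the message strings, and the `create` flag are all assumptions made up for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class OpenState(Enum):
    WAIT_EXISTS = auto()  # waiting for the file system to say if the file exists
    WAIT_CREATE = auto()  # waiting for the file system to create the file
    DONE = auto()


@dataclass
class OpenRequest:
    request_id: int
    path: str
    create: bool
    state: OpenState = OpenState.WAIT_EXISTS
    result: Optional[str] = None


class Vfs:
    def __init__(self, send_to_fs):
        # send_to_fs is a hypothetical callback: (request_id, message, path).
        self.send_to_fs = send_to_fs
        self.pending = {}
        self.next_id = 1

    def open(self, path, create=False):
        # Build the request structure, ask the file system, and return at once
        # so the VFS can go back to handling other requests.
        req = OpenRequest(self.next_id, path, create)
        self.next_id += 1
        self.pending[req.request_id] = req
        self.send_to_fs(req.request_id, "exists?", path)
        return req

    def on_fs_reply(self, request_id, ok):
        # A reply from a file system moves one request to its next state.
        req = self.pending[request_id]
        if req.state is OpenState.WAIT_EXISTS:
            if ok:
                self._complete(req, "opened")
            elif req.create:
                req.state = OpenState.WAIT_CREATE
                self.send_to_fs(req.request_id, "create", req.path)
            else:
                self._complete(req, "not found")
        elif req.state is OpenState.WAIT_CREATE:
            self._complete(req, "opened" if ok else "create failed")

    def _complete(self, req, result):
        # A real VFS would create a file handle and notify the caller here.
        req.state = OpenState.DONE
        req.result = result
        del self.pending[req.request_id]
```

Note that a single thread can have any number of `OpenRequest` objects in flight at once; nothing in `open` or `on_fs_reply` blocks.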

In this case, the VFS does relatively small amounts of work in-between relatively large delays (e.g. waiting for a new request or waiting for file systems to reply) and doesn't really need to be multi-threaded unless it's handling a huge number of requests at the same time. In fact (without very careful design) using multiple threads would probably make things worse due to extra overhead, lock contention, cache line bouncing, etc.

It also means that the VFS can prioritize file I/O by giving each request a priority and doing the most important request first (so that an important file I/O operation isn't delayed by less important operations), and it means that you can cancel a request. For example, an application might assume it will eventually need to load a file (because it frequently does need this file, e.g. pre-loading to improve performance for the "common case"), so it might ask the VFS to start loading the file as a low-priority operation so that it's already loaded (or at least partially loaded) if/when it's needed. If the application finds out that it won't need this file after all, it can cancel the operation.
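
As a rough sketch of prioritized, cancellable requests: the `RequestQueue` class, its method names, and the convention that a lower number means higher priority are all assumptions for illustration. Only the idea (pick the most important pending request, allow cancellation) comes from the text above.

```python
import heapq
import itertools


class RequestQueue:
    def __init__(self):
        self._heap = []                    # entries: (priority, ticket, name)
        self._tickets = itertools.count()  # also breaks ties in FIFO order
        self._live = set()                 # tickets still waiting to be served

    def submit(self, priority, name):
        # Lower number = more important (an assumption of this sketch).
        ticket = next(self._tickets)
        self._live.add(ticket)
        heapq.heappush(self._heap, (priority, ticket, name))
        return ticket

    def cancel(self, ticket):
        # Lazy cancellation: the heap entry is skipped when it is popped.
        self._live.discard(ticket)

    def next_request(self):
        # Return the most important request that has not been cancelled.
        while self._heap:
            _, ticket, name = heapq.heappop(self._heap)
            if ticket in self._live:
                self._live.remove(ticket)
                return name
        return None
```

For example, an application could `submit(10, "preload big.dat")`, keep the returned ticket, and call `cancel` on it later if the file turns out not to be needed.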

For a synchronous VFS, you'd need a separate thread for each request/operation. The overhead of scheduling these threads would suck (task switching costs, etc) and under load you'd have problems with lock contention, etc. It'd also make it difficult to implement prioritized file I/O and/or canceling operations.

Note: For performance, *when* something is done is often more important than *how* something is done (which is why most decent modern OSs do implement prioritized file I/O)...


Cheers,

Brendan

Posted: Thu May 29, 2008 10:47 pm
by iammisc
Thanks so much for your explanation Brendan. Keeping the VFS asynchronous seems to be a very good idea.

Posted: Tue Jun 03, 2008 4:36 am
by jal
Brendan wrote:For a synchronous VFS, you'd need a separate thread for each request/operation
iammisc wrote:Keeping the VFS asynchronous seems to be a very good idea.
You both seem to have your terminology reversed? Synchronous implies single threaded, asynchronous implies multi threaded.
Brendan wrote:The overhead of scheduling these threads would suck (task switching costs, etc)
That depends: if you already have a massively multi-threaded design, the overhead is not that great (a few more threads compared to a single one, when there are a gazillion threads already).


JAL

Posted: Tue Jun 03, 2008 4:59 am
by Korona
jal wrote:You both seem to have your terminology reversed?
Synchronous implies single threaded, asynchronous implies multi threaded.
Synchronous means in this case:
-> Get a request from an application
-> Process the request and return the result (The request is processed immediately thus synchronous)
When you have more than one application using the file system you will need multiple threads so that each request can be processed immediately. (Or at least as soon as the other requests are finished)

Asynchronous means:
-> Get a request from an application
-> Store that request in a queue. (The request is not processed synchronously; the application has to wait some time until the driver processes it.)
While the file system driver does:
-> Get next request from the queue
-> Process the request and return the result
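
The asynchronous variant above might look something like this minimal sketch. `QueuedDriver`, `submit`, `run_once` and the `handler` callback are made-up names for illustration, and a real driver would send the result back to the application instead of storing it in a dictionary.

```python
from collections import deque


class QueuedDriver:
    def __init__(self, handler):
        self.handler = handler  # hypothetical function that does the actual work
        self.queue = deque()
        self.results = {}       # app -> list of results, stands in for replies

    def submit(self, app, request):
        # Only store the request; nothing is processed at this point.
        self.queue.append((app, request))

    def run_once(self):
        # The driver's loop body: take the next request and process it.
        if not self.queue:
            return False
        app, request = self.queue.popleft()
        self.results.setdefault(app, []).append(self.handler(request))
        return True
```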

Posted: Tue Jun 03, 2008 6:13 am
by jal
Korona wrote:
jal wrote:You both seem to have your terminology reversed?
Synchronous implies single threaded, asynchronous implies multi threaded.
Synchronous means in this case:
-> Get a request from an application
-> Process the request and return the result (The request is processed immediately thus synchronous)
When you have more than one application using the file system you will need multiple threads so that each request can be processed immediately. (Or at least as soon as the other requests are finished)

Asynchronous means:
-> Get a request from an application
-> Store that request in a queue. (The request is not processed synchronously; the application has to wait some time until the driver processes it.)
While the file system driver does:
-> Get next request from the queue
-> Process the request and return the result
Ah, ok, I can see that you could use it in reverse of what I was thinking. Thanx for clearing that up.


JAL

Posted: Tue Jun 03, 2008 8:36 pm
by iammisc
jal wrote: You both seem to have your terminology reversed? Synchronous implies single threaded, asynchronous implies multi threaded.
Just wanted to say that the word you are looking for to describe multithreadedness is "parallel", and for a single thread, "unithreaded". Synchronous means in time, while asynchronous means out of order or not in time. Think of Linux's asynchronous I/O requests.