To thread or not to thread...
I'm developing a microkernel and I've finished the VFS server. Right now, the VFS server's responsibility is to parse file paths, perform lookups and do other relatively high-level stuff. It delegates the actual filesystem work to other filesystem servers. The VFS is currently synchronous and single-threaded, meaning that a process can't read a file on a separate filesystem while another process's request is being served. I was just wondering if I should multithread the VFS server or not, and I would like to know what other microkernels do and why. Are most microkernel VFS servers threaded or not?
- Colonel Kernel
I would guess that VFS servers for other microkernels would at least have a thread for each CPU core in the system. This is purely a hunch though -- I haven't looked into it yet.
Top three reasons why my OS project died:
- Too much overtime at work
- Got married
- My brain got stuck in an infinite loop while trying to design the memory manager
Re: To thread or not to thread...
Hi,
iammisc wrote: I'm developing a microkernel and I've finished the VFS server. As of right now, the vfs server's responsibility is to parse file paths, perform lookups and other relatively high-level stuff. It delegates the actual filesystem work to other filesystem servers. As of right now, the VFS is synchronous and single threaded meaning that another process can't read a file on a separate filesystem if another process's request is being served. I was just wondering if I should multithread the VFS server or not and I would like to know what other microkernels do and why? Are most microkernel VFS servers threaded or not?
A VFS server should be asynchronous.
For each operation (open, close, read, write, etc) the VFS should build a "request structure", where the request structure has certain states. When the VFS receives a request it does anything necessary to put the request into the first state before going back to handling other requests. Sooner or later something will happen that allows the VFS to take the request to the next state (and then the next state, and the next state, etc), until eventually the request is completed.
For example, the request structure for the "open" operation might have 2 states:
- waiting for the corresponding file system to say if the file exists or not
- waiting for the corresponding file system to create the file being opened
When the response from the file system arrives the VFS might be able to complete the operation if the file exists, or it might send a "create the new file" request to the file system (which leaves the "open" request in the second state); and then handle other requests while it waits for a response from the file system. When the next response from the file system arrives the VFS would be able to complete the operation (create a file handle, tell the thread that opened the file what happened, destroy the request structure, etc).
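The two-state "open" request described above could be sketched roughly as follows. This is only a minimal illustration in Python, not how any particular microkernel does it: the names `Vfs`, `OpenRequest`, `OpenState`, `send_to_fs` and `on_fs_reply` are all made up for the example, and the file system server is stood in for by a plain callback.

```python
from enum import Enum, auto


class OpenState(Enum):
    WAIT_LOOKUP = auto()  # waiting for the FS to say if the file exists
    WAIT_CREATE = auto()  # waiting for the FS to create the file
    DONE = auto()


class OpenRequest:
    """Hypothetical request structure for one in-flight open()."""

    def __init__(self, path, create_if_missing):
        self.path = path
        self.create_if_missing = create_if_missing
        self.state = OpenState.WAIT_LOOKUP


class Vfs:
    def __init__(self, send_to_fs):
        # send_to_fs(req_id, message, path) is an assumed stand-in for
        # sending an IPC message to a file system server.
        self.send_to_fs = send_to_fs
        self.pending = {}    # req_id -> OpenRequest still in progress
        self.completed = {}  # req_id -> result reported to the caller
        self.next_id = 0

    def open(self, path, create_if_missing=False):
        """Put the request into its first state and return immediately."""
        req_id = self.next_id
        self.next_id += 1
        self.pending[req_id] = OpenRequest(path, create_if_missing)
        self.send_to_fs(req_id, "lookup", path)
        return req_id

    def on_fs_reply(self, req_id, ok):
        """Advance one request to its next state when the FS replies."""
        req = self.pending[req_id]
        if req.state is OpenState.WAIT_LOOKUP:
            if ok:
                self._complete(req_id, ("ok", req.path))
            elif req.create_if_missing:
                req.state = OpenState.WAIT_CREATE
                self.send_to_fs(req_id, "create", req.path)
            else:
                self._complete(req_id, ("error", "not found"))
        elif req.state is OpenState.WAIT_CREATE:
            result = ("ok", req.path) if ok else ("error", "create failed")
            self._complete(req_id, result)

    def _complete(self, req_id, result):
        # Destroy the request structure and record what happened.
        req = self.pending.pop(req_id)
        req.state = OpenState.DONE
        self.completed[req_id] = result
```

Note that `open()` never waits: a single thread can have any number of requests sitting in `pending`, each advanced only when the matching reply arrives.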
In this case, the VFS does relatively small amounts of work in-between relatively large delays (e.g. waiting for a new request or waiting for file systems to reply) and doesn't really need to be multi-threaded unless it's handling a huge number of requests at the same time. In fact (without very careful design) using multiple threads would probably make things worse due to extra overhead, lock contention, cache line bouncing, etc.
It also means that the VFS can prioritize file I/O by giving each request a priority and doing the most important request (so that an important file I/O operation isn't delayed by less important operations), and also means that you can cancel a request. For example, an application might assume it needs to load a file eventually (because the application frequently does need this file - e.g. do some pre-loading to improve performance for the "common case"), so it might ask the VFS to start loading the file as a low priority operation so that it's already loaded (or at least partially loaded) if/when it's needed, and if the application finds out that it won't need this file it can cancel the operation.
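As a rough sketch of the prioritizing and cancelling idea above: pending requests can sit in a priority queue, and a cancelled request is simply flagged and skipped when it reaches the front. The class name `RequestQueue`, its methods, and the convention that a lower number means a more important request are assumptions made for this example only.

```python
import heapq
import itertools


class RequestQueue:
    """Hypothetical queue of pending VFS requests, most important first."""

    def __init__(self):
        self._heap = []
        self._live = {}                # req_id -> heap entry, not yet served
        self._ids = itertools.count()  # unique ids, also break priority ties

    def submit(self, priority, description):
        """Queue a request; a lower priority number is served earlier."""
        req_id = next(self._ids)
        # Entry layout: [priority, req_id, description, cancelled flag].
        # Ordering is decided by the first two fields only, because
        # req_id is unique.
        entry = [priority, req_id, description, False]
        self._live[req_id] = entry
        heapq.heappush(self._heap, entry)
        return req_id

    def cancel(self, req_id):
        """Mark a queued request as cancelled; False if it is not queued."""
        entry = self._live.pop(req_id, None)
        if entry is None:
            return False
        entry[3] = True
        return True

    def next_request(self):
        """Return (req_id, description) of the most important live request."""
        while self._heap:
            _priority, req_id, description, cancelled = heapq.heappop(self._heap)
            if not cancelled:
                del self._live[req_id]
                return req_id, description
        return None
```

For example, an application could submit a low-priority pre-load, then cancel it later if it turns out the file is not needed; more important requests submitted in the meantime are served ahead of it either way.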
For a synchronous VFS, you'd need a separate thread for each request/operation. The overhead of scheduling these threads would suck (task switching costs, etc) and under load you'd have problems with lock contention, etc. It'd also make it difficult to implement prioritized file I/O and/or canceling operations.
Note: For performance, *when* something is done is often more important than *how* something is done (which is why most decent modern OSs do implement prioritized file I/O)...
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Brendan wrote: For a synchronous VFS, you'd need a seperate thread for each request/operation
iammisc wrote: Keeping the VFS asynchronous seems to be a very good idea.
You both seem to have your terminology reversed? Synchronous implies single threaded, asynchronous implies multi threaded.
Brendan wrote: The overhead of scheduling these threads would suck (task switching costs, etc)
That depends: if you already have a massively multi-threaded design, the overhead is not that great (a few threads compared to a single one, when there's a gazillion threads already).
JAL
jal wrote: You both seem to have your terminology reversed? Synchronous implies single threaded, asynchronous implies multi threaded.
Synchronous means in this case:
-> Get a request from an application
-> Process the request and return the result (The request is processed immediately thus synchronous)
When you have more than one application using the file system you will need multiple threads so that each request can be processed immediately. (Or at least as soon as the other requests are finished)
Asynchronous means:
-> Get a request from an application
-> Store that request in a queue. (The request is not processed synchronously; the application has to wait some time until the driver processes it.)
While the file system driver does:
-> Get next request from the queue
-> Process the request and return the result
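The two patterns listed above can be put side by side in a small sketch. It is only an illustration under assumed names (`sync_call`, `AsyncServer`, `submit`, `run_once` are invented here), with a plain function standing in for the file system driver.

```python
from collections import deque


def sync_call(handler, request):
    """Synchronous: the request is processed immediately and the caller
    gets the result back from the same call."""
    return handler(request)


class AsyncServer:
    """Asynchronous: requests are stored in a queue first and processed
    later by the driver loop."""

    def __init__(self, handler):
        self.handler = handler  # assumed stand-in for the driver's work
        self.queue = deque()
        self.results = []       # (application, result) in completion order

    def submit(self, application, request):
        # Only stores the request; nothing is processed yet.
        self.queue.append((application, request))

    def run_once(self):
        """Driver side: take the next request from the queue and process it.
        Returns False when the queue is empty."""
        if not self.queue:
            return False
        application, request = self.queue.popleft()
        self.results.append((application, self.handler(request)))
        return True
```

With `AsyncServer`, several applications can submit requests back to back without any of them being processed until the driver loop calls `run_once()`, which is why a single driver thread is enough in that model.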
Korona wrote: Synchronous means in this case: [...]
Ah, OK, I can see that you could use it in the reverse of what I was thinking. Thanks for clearing that up.
JAL
jal wrote: You both seem to have your terminology reversed? Synchronous implies single threaded, asynchronous implies multi threaded.
Just wanted to say that the word you are looking for to describe multithreadedness is "parallel", and for a single thread, "unithreaded". Synchronous means in time, while asynchronous means out of order or not in time. Think of Linux's asynchronous I/O requests.