Microkernel IPC with arbitrary amount of transferred data
I am in the process of designing and implementing the IPC mechanism of my hobby microkernel project, and I need some help, mostly regarding design considerations.
I faced the first problem when I tried to implement communication between an ordinary userspace process and the VFS server, because I need to pass a lot of data between them whenever the process wants to read or write a file. At first I implemented my IPC messages with a fixed size (6 * 8 bytes) to avoid queueing big message payloads inside the kernel. I thought the problem could be solved by creating a shared memory region per file descriptor (to stick with the previous example): data to be written would be placed in this region, and reads would work the other way around. I started to worry about my design when I thought about what happens when multiple threads try to access the same file at the same time; a single shared memory region will not be enough.
I have some ideas for how to solve the issue, but before trying to implement another method I would like to ask for your opinions.
Idea #1: Use one shared memory region per thread in the user application. This way it would not be a problem if multiple threads wanted to write the same file at the same time. However, this method requires a lot of support in the C (or whatever) library to maintain this information. Another bottleneck is that it still requires a copy when the data is moved from the given user buffer into the shared region.
Idea #2: Extend the IPC messages so they can contain any amount of data, cached by the kernel until the other end of the IPC link reads it. This seems like the worst solution to me: it requires dynamic memory allocation inside the kernel, which I have tried to avoid.
Idea #3: Pass a pointer and a size to the ipc_send() function, describing the memory region associated with the message in the sender process (e.g. the buffer to be written to a file). Before the message is received at the other end of the link, the kernel maps the selected memory region into the address space of the receiving process. For now this seems like the best solution I could think of, but it still has a problem: the buffers passed to read() and write() are not page-aligned in 99% of the cases, which means the receiver process could access some of the sender's memory that is not even involved in the current operation.
What other options do I have to solve this problem?
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Microkernel IPC with arbitrary amount of transferred data
There are many ways to solve this.
Synchronous messages work fine here as long as both sender and receiver know beforehand how much data will be coming. The receiver buffer can reside anywhere in user space and does not require any particular alignment; synchronous messages also do not need any kind of buffering in the kernel. This method is, however, slow compared to handing over physical pages directly. For real speed you should look at the L4 kernel's map and grant operations, which basically give a page from one process to another.
I don't recommend special shared memory areas, because they are just an unnecessary complication and you will need to copy the data to its final location anyway.
For memory-mapped files: use the map/grant approach.
For reading a file into an arbitrary location in user memory: use synchronous messages of arbitrary length.
In the file system hierarchy, use map/grant almost exclusively, except in the last step to the user process that reads the file.
Re: Microkernel IPC with arbitrary amount of transferred data
I worked with a message-passing kernel once that had a mechanism for calling code in another process's address space in your own context. This mechanism was used whenever sending the data as signal payload was too time-consuming. The company that implemented the kernel knew that sending messages wasn't always going to be the best solution. I would say don't get locked into a design that is uniform just for uniformity's sake.
Every universe of discourse has its logical structure --- S. K. Langer.
Re: Microkernel IPC with arbitrary amount of transferred data
Two questions here...

OSwhatever wrote: Synchronous messages work fine here as long as both sender and receiver know beforehand how much data will be coming. The receiver buffer can reside anywhere in user space and does not require any particular alignment; synchronous messages also do not need any kind of buffering in the kernel. This method is, however, slow compared to handing over physical pages directly. For real speed you should look at the L4 kernel's map and grant operations, which basically give a page from one process to another.
At the receiver side I should somehow know the size of the arbitrary data before trying to receive the message, so that I can give a properly sized buffer to the kernel. Unfortunately, I cannot know it without asking for the size of the message first, but that would require one more system call besides the receive itself. Another option would be to define an absolute maximum size for these messages (let's say 1 MiB) that the sender must never exceed in a single send. This way the receiver can keep a buffer suitable for receiving any message. Is there any other way to solve this issue?
Let's say at first I would only implement synchronous IPC calls for communication between different processes in the system. Using synchronous IPC for all communication between app<->VFS and VFS<->FS would lead to a problem if there is only one thread handling requests inside the VFS: while serving a request from app1, waiting for the underlying FS driver to complete the operation would prevent a concurrent request from app2 from being executed. How could I solve this? Use multiple threads to process requests in a server? ...but how many?
OSwhatever wrote: For memory-mapped files: use the map/grant approach. For reading a file into an arbitrary location in user memory: use synchronous messages of arbitrary length. In the file system hierarchy, use map/grant almost exclusively, except in the last step to the user process that reads the file.

Thank you for this tip, I will look into the details of it.
Am I right in thinking that this could be used with asynchronous communication in the VFS, FS drivers, etc., and that it would also solve the second issue I mentioned before?
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: At the receiver side I should somehow know the size of the arbitrary data before trying to receive the message, so that I can give a properly sized buffer to the kernel. Unfortunately, I cannot know it without asking for the size of the message first, but that would require one more system call besides the receive itself. Another option would be to define an absolute maximum size for these messages (let's say 1 MiB) that the sender must never exceed in a single send. This way the receiver can keep a buffer suitable for receiving any message. Is there any other way to solve this issue?
Code:

#include <stdlib.h>    /* malloc */

struct OS_signal
{
    unsigned int size;           /* payload length in bytes */
    unsigned int signal_number;
    char payload[1];             /* variable-length payload (pre-C99 trick) */
};

struct OS_signal *
signal_alloc(unsigned int size, unsigned int signal_number)
{
    struct OS_signal *siggie;

    /* One allocation holds header and payload; -1 accounts for payload[1]. */
    siggie = malloc(sizeof(struct OS_signal) + size - 1);
    if (!siggie)
    {
        log_error();             /* logging helper provided elsewhere */
        return NULL;
    }
    siggie->signal_number = signal_number;
    siggie->size = size;
    return siggie;
}
giszo wrote: Let's say at first I would only implement synchronous IPC calls for communication between different processes in the system. Using synchronous IPC for all communication between app<->VFS and VFS<->FS would lead to a problem if there is only one thread handling requests inside the VFS: while serving a request from app1, waiting for the underlying FS driver to complete the operation would prevent a concurrent request from app2 from being executed. How could I solve this? Use multiple threads to process requests in a server? ...but how many?

Use non-blocking send. Use a signal queue. Block senders when the queue is full. Provide blocking receive and receive_with_timeout.
EDIT: For your signalling you want to create diagrams like you see in this document: ITU-T Interworking for SS7
Last edited by bwat on Tue Dec 03, 2013 3:03 pm, edited 3 times in total.
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: Two questions here... At the receiver side I should somehow know the size of the arbitrary data before trying to receive the message, so that I can give a properly sized buffer to the kernel. Unfortunately, I cannot know it without asking for the size of the message first, but that would require one more system call besides the receive itself. Another option would be to define an absolute maximum size for these messages (let's say 1 MiB) that the sender must never exceed in a single send. This way the receiver can keep a buffer suitable for receiving any message. Is there any other way to solve this issue?

Usually, each file read will give you a length equal to or less than the read length you passed as a parameter, and in that way you can prevent any overflow. My kernel also passes along the size of the message that arrives, so the user program will know it too.

giszo wrote: Let's say at first I would only implement synchronous IPC calls for communication between different processes in the system. Using synchronous IPC for all communication between app<->VFS and VFS<->FS would lead to a problem if there is only one thread handling requests inside the VFS: while serving a request from app1, waiting for the underlying FS driver to complete the operation would prevent a concurrent request from app2 from being executed. How could I solve this? Use multiple threads to process requests in a server? ...but how many?

Yes, if you have only one thread serving user applications with file system operations over synchronous messages, it is likely to serve only one client at a time. The solution is often to use multi-threaded file services. You can use one thread if you want, but it would likely be an implementation mess with a horrible state machine. My VFS creates a new thread for every file system request, for example.

giszo wrote: Thank you for this tip, I will look into the details of it. Am I right in thinking that this could be used with asynchronous communication in the VFS, FS drivers, etc., and that it would also solve the second issue I mentioned before?

There is nothing wrong with asynchronous messages, but they are usually bad for sending large amounts of data. Often asynchronous messages must end up in a special message pool, which later requires you to copy the data to its final destination, though there are many ways to implement asynchronous messaging. Asynchronous messaging does not necessarily solve your concurrency problems either.
Re: Microkernel IPC with arbitrary amount of transferred data
OSwhatever wrote: Usually, each file read will give you a length equal to or less than the read length you passed as a parameter, and in that way you can prevent any overflow. My kernel also passes along the size of the message that arrives, so the user program will know it too.

That is fine; my problem is the other way around, when the user process wants to write to a file. It passes a buffer and its size to the appropriate IPC method to send the request. The receiver (VFS) should provide a buffer large enough to store the sent data, but it does not really have any idea about the size of the sent message.
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: That is fine; my problem is the other way around, when the user process wants to write to a file. It passes a buffer and its size to the appropriate IPC method to send the request. The receiver (VFS) should provide a buffer large enough to store the sent data, but it does not really have any idea about the size of the sent message.

Then you can split up the data into several IPC transfers. The VFS server and its interface both know the maximum receive buffer size in the VFS and can split up the transfer accordingly. You pay the penalty of several IPC messages and task switches, but if the buffer is large enough it shouldn't be too bad. The file system itself usually has some kind of buffer caching, which also determines a maximum size you can receive at a time.
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Re: Microkernel IPC with arbitrary amount of transferred data
The way I got around this in Rhombus was to allow IPC messages to reference a sequence of pages for bulk data. When a message was sent, those pages were unmapped from the sending process. When the message was received by the other process, it was granted the ability to map those pages into its own address space. This gives you the performance of shared memory (minus the overhead of the additional mapping call) while enforcing the message passing synchronization, and not filling up kernelspace with tons of copied data.
Re: Microkernel IPC with arbitrary amount of transferred data
I have one more question that is not strongly connected to the original subject of the topic, but we already touched on it before, so I'm asking here. What would be a good way to implement request handling in system services like the VFS server?
If I want to use the same synchronous IPC method I am going to use between apps and the VFS, it would require one thread per request in the VFS server. OSwhatever mentioned before that he avoided this problem by starting a new thread for each request. Is this really a good way?
The other way I could think of is to use asynchronous IPC with some method like the one NickJohnson pointed out to pass memory pages between user processes. This way I could avoid storing message data inside the kernel while messages are queued at an IPC port, because the data stays in the sending process. However, I am still worried, because this method requires a lot of support in the servers to keep track of pending requests.
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: What would be a good way to implement request handling in system services like the VFS server? If I want to use the same synchronous IPC method I am going to use between apps and the VFS, it would require one thread per request in the VFS server. OSwhatever mentioned before that he avoided this problem by starting a new thread for each request. Is this really a good way? The other way I could think of is to use asynchronous IPC with some method like the one NickJohnson pointed out to pass memory pages between user processes.

Storing message data for large messages inside the kernel can be avoided with any kind of message system: you can copy message data directly from one process to the other inside the kernel. There are optimizations where messages are temporarily stored in the kernel, because mapping parts of another process into kernel space pollutes the TLB; this is mostly used for small messages. This is also known as the "double copy" in QNX.
When I mentioned that I start a thread for each file system operation, that was a simplification. If there are 200 file system requests, 200 threads are of course not created. A maximum set of threads is created (depending on how many CPUs you have and what is optimal for that particular operation), and the remaining pending file system requests are queued. This is usually called a thread pool.
Re: Microkernel IPC with arbitrary amount of transferred data
OSwhatever wrote: When I mentioned that I start a thread for each file system operation, that was a simplification. If there are 200 file system requests, 200 threads are of course not created. A maximum set of threads is created, and the remaining pending file system requests are queued. This is usually called a thread pool.

I was worried about creating a thread per request because it would mean a lot of allocation inside the kernel for every request. Using a thread pool for this purpose makes sense; however, it is still a good question how you decide the optimal size of this pool.
Do you use synchronous IPC everywhere in your project?
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: I was worried about creating a thread per request because it would mean a lot of allocation inside the kernel for every request. Using a thread pool for this purpose makes sense; however, it is still a good question how you decide the optimal size of this pool. Do you use synchronous IPC everywhere in your project?

I use a mix of synchronous and asynchronous as I see fit; the two variants are good for different things. If you look at QNX and L4, they only support synchronous messages in the kernel, and you can come a long way with only synchronous. You can actually emulate asynchronous messages with synchronous ones.
Asynchronous: good for buffering and queuing requests and data. Doesn't block the sender at all.
Synchronous: good for sending large amounts of data to the location where you need it. If you look at most interfaces, you will notice that they are of the synchronous send-and-reply type, meaning the sender must wait for the response.
It's really up to you what kind of programming model you want to use and which type is the most beneficial for you.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: Microkernel IPC with arbitrary amount of transferred data
giszo wrote: ...I started to worry about my design when I thought about what happens when multiple threads are trying to access the same file at the same time; the simple shared memory region will not be enough.

If two threads are trying to access the same file descriptor at the same time, lock the region with a mutex...
Don't try to invent "One grand unified solution" if it doesn't make sense. As I'd do it:
- Small IPCs might go through a memory mapped ringbuffer shared between the ends of the communications channel (With the kernel providing the ability to "prod" or "wake up" the other end)
- Large IPCs happen by some form of per-IPC page mapping
If you look at L4's IPC:
- The receiver must explicitly allocate the buffers it is to receive into. If the sender attempts to "string copy" more data, or map more pages than is allowed, the IPC is denied
- The sender and receiver always get the option of not waiting for an IPC
Where you might want to differ from L4 (certainly I take a lot of inspiration from L4; this is one area where I differ) is message addressing: L4 addresses messages at thread IDs. You might want to add some notion of "IPC ports" or similar.
Re: Microkernel IPC with arbitrary amount of transferred data
Owen wrote: Don't try to invent "One grand unified solution" if it doesn't make sense. As I'd do it: small IPCs might go through a memory-mapped ring buffer shared between the ends of the communications channel (with the kernel providing the ability to "prod" or wake up the other end).

The shared-memory ring buffer is great, as it is a zero-copy solution and allocation can in some cases be done lock-free, but it requires mapping at least one page for each open channel. The VFS, for example, is likely to have many clients, in the hundreds even, which consumes a lot of virtual and physical address space. Each program is also likely to use several other services besides the VFS, so the number of mapped channel pages must be quite large. Or am I exaggerating the problem? Do you see this as a problem, or is the extra memory used here worth it?